Cs186 Wiki

General Info

What

Crash recovery is the process of using the write-ahead log to restore the system to a correct state after an unexpected failure, such as a system crash.

Process

  1. Start at the most recent checkpoint
  2. Perform the Analysis phase
    1. From the checkpoint, scan forward and update the transactions table
    2. From the checkpoint, scan forward and update the dirty pages table
  3. Perform the Redo phase
  4. Perform the Undo phase

Analysis Phase

  1. Start with the transaction table and dirty pages table as recorded at the most recent checkpoint
  2. Scan forward from the checkpoint
    • When an end record is reached, remove the transaction from the transaction table
    • For all other records, add the transaction to the transaction table if it is not already there. Update the lastLSN on updates and undos (CLRs), update the status of the transaction on commits, and set the status to aborting on aborts.
    • For update records, if the page being updated is not in the dirty pages table, add an entry to the dirty pages table with recLSN = LSN of the update

At the end of analysis:

  • The transaction table will have transactions that were active when the crash occurred
  • The dirty pages table will have pages that might have been dirty when the crash occurred
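
The scan described above can be sketched in Python. This is only an illustration under stated assumptions, not the course's reference implementation: log records are modeled as plain dicts with hypothetical keys `lsn`, `type`, `txn`, and `page`, and the status names `running` and `committing` are placeholders chosen for the example.

```python
def analysis_phase(log, ckpt_txn_table, ckpt_dirty_pages):
    """Scan forward from the checkpoint and rebuild both tables.

    log: records written after the checkpoint, in LSN order (hypothetical dict layout).
    ckpt_txn_table: {txn: {"status": ..., "lastLSN": ...}} saved at the checkpoint.
    ckpt_dirty_pages: {page: recLSN} saved at the checkpoint.
    """
    # Copy the checkpointed tables so the caller's versions are left untouched.
    txn_table = {t: dict(e) for t, e in ckpt_txn_table.items()}
    dirty_pages = dict(ckpt_dirty_pages)

    for rec in log:
        kind, txn = rec["type"], rec["txn"]
        if kind == "end":
            # The transaction is finished, so it no longer needs tracking.
            txn_table.pop(txn, None)
            continue

        # Any other record means the transaction must be in the table.
        entry = txn_table.setdefault(txn, {"status": "running", "lastLSN": None})
        if kind in ("update", "clr"):
            entry["lastLSN"] = rec["lsn"]
        elif kind == "commit":
            entry["status"] = "committing"
        elif kind == "abort":
            entry["status"] = "aborting"

        # The first update that touches a page fixes that page's recLSN.
        if kind == "update" and rec["page"] not in dirty_pages:
            dirty_pages[rec["page"]] = rec["lsn"]

    return txn_table, dirty_pages
```

For example, with an empty checkpoint and a log in which T1 updates, commits, and ends while T2 only updates, the returned transaction table holds T2 alone, and the dirty pages table holds every page touched, each with the LSN of its first update.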

Redo Phase

In this phase, we use the information gathered in the analysis phase to reconstruct the state at the time of the crash.

  1. Start at the smallest recLSN in the dirty pages table. Any page in the dirty pages table might not have been written out to disk, so we must start at the smallest recLSN to cover every possibility.
  2. Reapply updates and CLRs unless any of the following holds:
    • The page that is being updated is not in the dirty pages table
    • The affected page is in the dirty pages table but recLSN > LSN. This means that the log record is earlier than the first update that dirtied the page, which makes it irrelevant to the redo phase
    • pageLSN >= LSN. This means that the log record is no later than the most recent update already reflected on that page.

To redo an action:

  • Reapply update
  • Set pageLSN to LSN. No forcing or logging is performed
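
The three skip conditions and the redo action can be sketched as follows. This is a minimal illustration, assuming log records are dicts with hypothetical keys `lsn`, `type`, and `page`, that `page_lsn` is a dict standing in for the pageLSN stored on each page (0 if the page has never been written), and that `apply` is a caller-supplied function that reapplies a change.

```python
def should_redo(rec, dirty_pages, page_lsn):
    """Return True only if none of the three skip conditions holds."""
    if rec["type"] not in ("update", "clr"):
        return False  # only updates and CLRs are ever redone
    page = rec["page"]
    if page not in dirty_pages:
        return False  # page is not in the dirty pages table
    if dirty_pages[page] > rec["lsn"]:
        return False  # recLSN > LSN
    if page_lsn.get(page, 0) >= rec["lsn"]:
        return False  # pageLSN >= LSN
    return True


def redo_phase(log, dirty_pages, page_lsn, apply):
    """Replay the log from the smallest recLSN; return the LSNs that were redone."""
    if not dirty_pages:
        return []  # nothing could be dirty, so there is nothing to redo
    start = min(dirty_pages.values())
    redone = []
    for rec in log:
        if rec["lsn"] < start:
            continue
        if should_redo(rec, dirty_pages, page_lsn):
            apply(rec)                           # reapply the update
            page_lsn[rec["page"]] = rec["lsn"]   # set pageLSN to LSN; nothing is logged or forced
            redone.append(rec["lsn"])
    return redone
```

The cheap checks against the dirty pages table come first because, in a real system, reading pageLSN means fetching the page, so it is worth ruling a record out before that.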

Undo Phase

In this phase, we will be undoing all the transactions that were not completed when the crash occurred.

  1. Start with the lastLSN of all the transactions in the table
  2. Repeat until there are no more LSNs
    1. Choose and remove the largest LSN
    2. If the LSN points to a CLR and the undoNextLSN == Null, then write an end record for this transaction
    3. If the LSN points to a CLR and the undoNextLSN != Null, add the undoNextLSN to the list of LSNs we must check
    4. If the LSN points to an update, undo the update, write a CLR with undoNextLSN = prevLSN, and add prevLSN to the list of LSNs to undo. If prevLSN == Null, then also write out an end record.
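
The loop above maps naturally onto a max-heap of LSNs. The sketch below is an illustration under assumptions, not a definitive implementation: the log is a dict from LSN to record, records use hypothetical keys `type`, `txn`, `page`, `prevLSN`, and `undoNextLSN`, `revert` is a caller-supplied function that restores the old value, and only the two record types named in the steps are handled.

```python
import heapq


def undo_phase(log, txn_table, next_lsn, revert):
    """Undo every transaction left in the table; return the records written.

    log: {lsn: record}; txn_table: {txn: {"lastLSN": ...}};
    next_lsn: LSN to assign to the first new record.
    """
    # heapq is a min-heap, so LSNs are negated to pop the largest first.
    to_undo = [-entry["lastLSN"] for entry in txn_table.values()]
    heapq.heapify(to_undo)
    written = []

    def append(record):
        nonlocal next_lsn
        record["lsn"] = next_lsn
        next_lsn += 1
        written.append(record)

    while to_undo:
        rec = log[-heapq.heappop(to_undo)]
        if rec["type"] == "clr":
            if rec["undoNextLSN"] is None:
                append({"type": "end", "txn": rec["txn"]})
            else:
                heapq.heappush(to_undo, -rec["undoNextLSN"])
        elif rec["type"] == "update":
            revert(rec)  # undo the update itself
            prev = rec["prevLSN"]
            append({"type": "clr", "txn": rec["txn"],
                    "page": rec["page"], "undoNextLSN": prev})
            if prev is None:
                append({"type": "end", "txn": rec["txn"]})
            else:
                heapq.heappush(to_undo, -prev)
        else:
            raise ValueError("record type not covered by this sketch: " + rec["type"])
    return written
```

Always taking the largest remaining LSN means the log is walked backwards in a single pass, even when several transactions are being undone at once.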