Data Management
Index
- Introduction
- Objectives
- 1.Loss of Data Due to Errors or Failures
- 2.Data Manager Architecture
- 2.1.The Buffer Pool and its Manager
- 2.1.1.Buffer Pool Structure
- 2.1.2.Buffer Manager Operations
- 2.2.Recovery Manager
- 2.3.Log
- 2.3.1.Log Content
- 2.3.2.Write-Ahead Logging Protocol
- 2.3.3.Log Implementation
- 2.3.4.Checkpointing
- 2.1.The Buffer Pool and its Manager
- 3.Logging Recovery
- 4.Media Recovery
- 5.Further Aspects in Distributed Databases
- Summary
- Self-evaluation
- Answer key
- Glossary
- Bibliography
Introduction
Objectives
-
Be aware of why the data stored in a database could be totally or partially lost.
-
Know which component of the DBMS is responsible for accesses to the physical database (i.e. non volatile storage) and how it relates to other components.
-
Understand the behaviour of the data manager and its subcomponents.
-
Be familiar with the log of a DBMS.
-
Learn how the behaviour of the data manager affects recovery mechanisms (logging and media recovery).
-
Distinguish between the different policies of logging recovery manager.
-
Learn to decide which recovery manager policy may be more appropriate in different circumstances.
-
Understand the reasons why failures occur in distributed databases.
-
Be familiar with the usual atomic commitment protocols in distributed databases.
1.Loss of Data Due to Errors or Failures
2.Data Manager Architecture
2.1.The Buffer Pool and its Manager
2.1.1.Buffer Pool Structure
2.1.2.Buffer Manager Operations
procedure flush(p: page_id) var w: page_value; if dirty(p) = 1 then w := content(p); write_page (p, w); dirty(p) := 0; endIf update_directory(p, 0); endProcedure
procedure fetch(p: page_id) var w: page_value; b: page_id; if search(0) = 0 then <choose a buffer slot b according to LRU or FIFO>; flush(b); endIf read_page (p, w); update_directory (0, p); content(p) := w; endProcedure
2.2.Recovery Manager
2.3.Log
2.3.1.Log Content
2.3.2.Write-Ahead Logging Protocol
2.3.3.Log Implementation
2.3.4.Checkpointing
3.Logging Recovery
3.1.Logging Recovery Manager Policies
3.2.Steal/No Force Policy
procedure read(t: transaction_id, p: page_id, v: page_value) if search(p) = 0 then fetch(p); endIf v := content(p); endProcedure
procedure write(t: transaction_id, p: page_id, v: page_value) var w: page_value; if search(p) = 0 then fetch(p); endIf w := content(p); write_log('u', t, p, w, v); content(p) := v; dirty(p) := 1; endProcedure
procedure commit(t: transaction_id) write_log('c', t); endProcedure
procedure abort(t: transaction_id) var p: page_id; w: page_value; read_log_backwards('u', t, p, w); while records_remain_in_log ('u', t) do if search(p) = 0 then fetch(p) endIf content(p) := w; dirty(p) := 1; read_log_backwards('u', t, p, w); endWhile write_log('a', t); endProcedure
b) write(T1,P1,8)
c) read(T3,P5,v3)
d) write(T3,P5,1)
e) read(T1,P2,v1)
f) read(T2,P3,v2)
g) write(T2,P3,29)
h) abort(T1)
i) read(T3,P8,v3)
j) write(T3,P8,17)
k) read(T2,P2,v2)
l) write(T2,P2,6)
m) read(T4,P20,v4)
n) write(T4,P20,50)
o) commit(T2)
p) auto_checkpointing
q) read(T5,P14,v5)
r) write(T5,P14,7)
s) read(T3,P3,v3)
t) write(T3,P3,18)
u) read(T4,P1,v4)
v) read(T6,P2,v6)
w) write(T6,P2,32)
x) commit(T6)
procedure restart() var CL: commit_trans_list; AL: abort_trans_list; p: page_id; w: before_image_page; v: after_image_page; t: transaction_id; r: entry_type; d: log_position; buffer_and_transaction_lists_reset (); undo_to_last_CP_phase (); undo_complementary_phase (); redo_phase (); enter_CP (); endProcedure
procedure buffer_and_transaction_lists_reset() for each buffer_slot(p) do dirty(p) := 0; pin(p) := 0; update_directory(p, 0); endFor CL := Ø; AL := Ø; endProcedure
procedure write_db(p: page_id, w: page_value) if search(p) = 0 then fetch(p) endIf content(p) := w; dirty(p) := 1; flush(p); endProcedure
procedure enter_CP() var ATL: active_transaction_list; ATL := Ø write_log('cp', ATL); endProcedure
3.3.No Steal/No Force Policy
3.4.Steal/Force Policy
3.5.No Steal/Force Policy
3.6.Logging Recovery Using the Shadowing Strategy
4.Media Recovery
5.Further Aspects in Distributed Databases
5.1.Failures in a Distributed System
-
Site failure.
-
Communication failure with possible network partitioning.
-
Message loss.
5.2.Atomic Commitment Protocols
5.2.1.Two-Phase Commit Protocol
5.2.2.Three-Phase Commit Protocol
5.3.Data Replication
5.4.Media Recovery
Summary
Self-evaluation
Answer key
Glossary
- buffer pool
- Volatile memory that uses the database management system to ensure an efficient access to the physical database (i.e. the persistent storage).
- checkpointing
- Method that forces the storage of a consistent state of the database and creates a new entry in the log.
- data manager
- Component of the database management system that ensures an efficient and correct access to the physical database (i.e. the persistent storage).
- DBMS
- Acronym of database management system
- fetch
- Operation performed by the buffer manager that reads a physical page that is loaded in a buffer slot.
- flush
- Operation performed by the buffer manager that takes a page from the buffer pool and stores it in the physical database if it has been modified.
- force policy
- Operating policy of the logging recovery manager in which, when a transaction commits, the logging recovery manager forces the flushing of the buffer slots assigned to the transaction.
- in-place updating
- Technique used when flushing buffer slots to the physical database that overwrites the involved pages, given that are always stored in the same place.
- log
- Sequential file that records the history of the transactions with the changes they made.
- logging recovery
- Term that covers all mechanisms and techniques for recovering a database, thereby ensuring compliance with the atomicity and durability properties of the transactions.
- media recovery
- Term that includes all mechanisms and techniques for rebuilding a database in the event of the total or partial loss of data due to hardware errors.
- mirror
- Secondary copy of a database used to facilitate reading and to rebuild the database in the event of the total or partial loss of data.
- no force policy
- Operating policy of the logging recovery manager that, when a transaction commits, does not force the flushing of the buffer slots assigned to the transaction.
- no steal policy
- Operating policy of the logging recovery manager in which the logging recovery manager does not allow the flushing of a buffer slot while the transaction using it remains active.
- recovery manager
- Subcomponent of the data manager in charge of implementing the recovery techniques.
- shadowing
- Technique used for flushing buffer slots to the physical database that searches for a new space every time a page has to be saved; it therefore keeps the previous page version.
- steal policy
- Operating policy of the logging recovery manager in which the logging recovery manager allows the flushing of a buffer slot even though the transaction using it remains active.
- Three-phase commit (3PC)
- Synchronisation protocol that ensures the global atomicity of distributed transactions while alleviating the blocking aspect of 2PC in the event of site failures. In other words, 3PC never requires operational sites to wait (i.e. block) until a failed site has recovered.
- Two-phase commit (2PC)
- Synchronisation protocol that solves the atomic commitment problem. It is used in distributed database systems to ensure the global atomicity of transactions in spite of site and communication failures, assuming that a failure will eventually be repaired and each site guarantees the atomicity of transactions at local level.
- Write-ahead logging protocol
- Protocol that specifies how and when to make log entries.