Transaction Models and Concurrency Control
Index
- Introduction
- Objectives
- 1.Interferences and Isolation Level
- 1.1.Schedules and Serializability
- 1.2.Kinds of Interferences
- 1.3.Locking Basics
- 1.4.Isolation Levels
- 2.Centralized Concurrency Control
- 2.1.Advanced Locking
- 2.1.1.Read-for-update Locking
- 2.1.2.Multi-granule Locking
- 2.1.3.Deadlock Detection and Resolution
- 2.1.4.Tuning Locks
- 2.2.Timestamping
- 2.3.Multi-version
- 2.4.Comparative Analysis of Concurrency Control Mechanisms
- 2.5.Transaction Chopping
- 3.Transactions in Oracle
- 4.Distributed Concurrency Control
- 4.1.Distributed Locking
- 4.1.1.Deadlock Detection
- 4.2.Distributed Timestamping
- 5.Replica Management
- 5.1.BASE Transactions
- Summary
- Self-evaluation
- Glossary
- Bibliography
Introduction
Objectives
- Remember the four kinds of interferences.
- Remember the correctness criteria for concurrent access scheduling.
- Analyze the benefits and drawbacks of some concurrency control mechanisms: locking, timestamping and multi-version.
- Know some tuning mechanisms for locks.
- Understand the concurrency control mechanisms of Oracle 10g.
- Explain the difficulties of distributed concurrency control.
- Understand the difficulties of locking in a distributed environment.
- Understand the assignment of timestamps in a distributed environment.
- Identify the difficulties in highly distributed systems: avoiding interferences, avoiding deadlock situations and dealing with replicas.
1.Interferences and Isolation Level
“The concurrency control manager synchronizes the concurrent access of database transactions to shared objects in the database. It will protect each transaction from anomalies (i.e., Lost update, Read uncommitted, Unrepeatable read, and Inconsistent analysis) that can result from the fact that other transactions are accessing the same data at the same time.”
Andreas Reuter, Encyclopedia of Database Systems
1.1.Schedules and Serializability
“Recovery ensures that failures are masked to the users of transaction-based data management systems by providing automatic treatment for different kinds of failures, such as transaction failures, system failures, media failures and disasters.”
Erhard Rahm, Encyclopedia of Database Systems
1.2.Kinds of Interferences
- Lost update (also known as Write-Write) appears when the data written by a transaction are lost because another transaction overwrites them before the first one commits (a runnable sketch of this interference follows the list). For example, T1 reads the balance of a bank account and gets the value 100; then T2 reads the same value from the same account; afterwards, T1 adds 25 € and writes the value; and finally T2 adds 50 € (to the value it read) and writes the value. The result of such an execution is that the sum done by T1 has been lost (i.e., the final balance is 150 € instead of 175 €).
- Read uncommitted (also known as Write-Read) typically appears when a transaction reads (and uses) a value written by another transaction that, for one reason or another, does not commit its results in the end. For example, T1 attempts to get some money from the cash machine; then T2 (from another cash machine) reads the balance of the same account and sees that T1 got the money; and finally T1 cancels and does not get the money. The result is that T2 saw a balance that never existed.
- Unrepeatable read (also known as Read-Write) appears when a transaction reads some data, another transaction overwrites those data, and the first one tries to read the same data again (it will find them changed). For example, T1 reads the balance of an account (which is 50 €); then T2 (from another cash machine) gets 25 € from the same account; finally, after considering the balance previously retrieved, T1 tries to get 50 €, which is surprisingly not possible now.
- Inconsistent analysis (a special case of which are Phantoms) appears when the result of accessing several granules is affected by changes made by other transactions, so that it reflects neither the state before nor the state after the execution of those other transactions. For example, T1 lists the names of the passengers in a flight; T2 inserts a new passenger in that flight; finally, T1 counts the number of passengers in the flight. As a result of such a concurrent execution we get a phantom, i.e., a passenger that is counted but whose name is not in the list. Note that the difference with unrepeatable read is that the latter affects one granule, while inconsistent analysis involves more than one (i.e., one tuple is read and a different one is written).
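To make the first interference concrete, here is a minimal, hypothetical Python sketch (plain threads and a shared dictionary, no DBMS involved) that reproduces the lost update from the bank-account example:

```python
import threading

balance = {"account": 100}        # the shared granule from the example
barrier = threading.Barrier(2)    # forces both reads to happen before any write

def transaction(amount):
    local = balance["account"]            # read the balance
    barrier.wait()                        # both transactions have now read 100
    balance["account"] = local + amount   # write, losing the other update

t1 = threading.Thread(target=transaction, args=(25,))
t2 = threading.Thread(target=transaction, args=(50,))
t1.start(); t2.start()
t1.join(); t2.join()
print(balance["account"])  # prints 125 or 150, never the correct 175
```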
1.3.Locking Basics
When a transaction T requests a lock over a granule G in mode m, the lock manager proceeds as follows (a minimal lock-manager sketch in code follows this list):
- If G is not locked by any other transaction, it is locked for T and T continues its execution.
- If G is locked by other transactions in a compatible mode, it is also locked for T and T continues its execution.
- If G is locked by other transactions and one of them locked it in a mode incompatible with mode m, T is blocked and queued until all incompatible locks are released.

When a transaction releases its lock over G, the lock manager proceeds as follows:
- If there is no queued transaction waiting, nothing has to be done.
- If there are queued transactions, we follow the order in the queue to unlock as many transactions as possible according to the two situations below, until the first transaction in the queue cannot be dequeued:
  - If there is no other transaction holding the lock, the first transaction in the queue takes the lock, is unblocked and dequeued.
  - If there are other transactions holding the lock, we have to check whether the mode of the first transaction in the queue is compatible with that of the current holders. If so, it also gets the lock and is unblocked and dequeued.
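The rules above can be condensed into a small lock-manager sketch. This is an illustrative implementation assuming only Shared (S) and eXclusive (X) modes; all names are made up for the example:

```python
from collections import deque

# Compatibility indexed as (held_mode, requested_mode).
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockManager:
    def __init__(self):
        self.holders = {}  # granule -> {transaction: mode}
        self.queue = {}    # granule -> FIFO of blocked (transaction, mode)

    def lock(self, txn, granule, mode):
        holders = self.holders.setdefault(granule, {})
        if all(COMPATIBLE[(held, mode)] for held in holders.values()):
            holders[txn] = mode   # granted: the transaction continues
            return True
        self.queue.setdefault(granule, deque()).append((txn, mode))
        return False              # incompatible: blocked and queued

    def unlock(self, txn, granule):
        holders = self.holders.get(granule, {})
        holders.pop(txn, None)
        queue = self.queue.get(granule, deque())
        # Dequeue in FIFO order until the head is incompatible with the holders.
        while queue:
            t, m = queue[0]
            if all(COMPATIBLE[(held, m)] for held in holders.values()):
                holders[t] = m    # unblocked and dequeued
                queue.popleft()
            else:
                break

lm = LockManager()
lm.lock("T1", "G", "S")   # True: granted
lm.lock("T2", "G", "X")   # False: queued behind T1's shared lock
lm.unlock("T1", "G")      # T2 now takes the eXclusive lock
```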
1.4.Isolation Levels
- READ UNCOMMITTED: All we need to avoid lost update interferences is exclusive locks before writing granules. These locks must be released at the end of the transaction. To avoid this kind of interference (i.e., Write-Write), it is unnecessary for transactions to lock granules just to read them.
- READ COMMITTED: Just locking for writing is not enough to avoid the read uncommitted (i.e., Write-Read) interference. We need to check whether the granule we are going to read is in an uncommitted state or not. To do this, it is enough to check whether we can lock the granule in shared mode. We do not need to keep the lock until the end of the transaction; we just get the lock and immediately (at the end of the read operation) release it.
- REPEATABLE READ: In order to avoid unrepeatable read (i.e., Read-Write) interferences, we have to keep the shared lock until the end of the transaction (i.e., we have to follow Strict 2PL). By keeping the lock, we guarantee that if we have to read a granule twice, its value will not have changed from the first reading to the second. Actually, it would be enough to keep the lock until the second read operation. However, since the concurrency control manager does not know whether there is going to be a second reading or not, it has to keep the granule locked until the commit command is received.
- SERIALIZABLE: Since inconsistent analysis interferences involve more than one granule, to avoid them the DBMS has to lock not only data, but also metadata in the catalog. Thus, in the phantom example we saw above, locking the control information of the table would prevent the insertion of new tuples while we are scanning it.
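In practice, a transaction requests one of these levels with the SQL-standard SET TRANSACTION statement. Here is a minimal sketch through Python's DB-API using the psycopg2 driver (the connection string, table and column names are placeholders):

```python
import psycopg2  # assumed PostgreSQL driver; any SQL-standard DBMS works similarly

conn = psycopg2.connect("dbname=test")  # placeholder connection parameters
cur = conn.cursor()
# Must be the first statement of the transaction.
cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
cur.execute("SELECT balance FROM accounts WHERE id = 1")  # hypothetical table
# ... any further read of the same row in this transaction sees the same value
conn.commit()
```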
2.Centralized Concurrency Control
2.1.Advanced Locking
2.1.1.Read-for-update Locking
- Shared is used for reading the granule.
- Update is used for immediate reading and future writing.
- eXclusive is used for writing.
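A common (but not the only) compatibility matrix for these three modes follows; since two Update locks conflict, the classic deadlock of two readers both upgrading to eXclusive cannot occur. A sketch, noting that some systems define the S-against-U case differently:

```python
# Compatibility indexed as (held_mode, requested_mode); one common variant.
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}
# An Update lock is converted to eXclusive when the transaction actually writes.
```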
2.1.2.Multi-granule Locking
- Shared is used for reading a granule.
- eXclusive is used for writing a granule.
- Intentional Shared over a granule is used for reading some of its subgranules.
- Intentional eXclusive over a granule is used for writing some of its subgranules.
- Shared Intentional eXclusive is used for reading all subgranules and writing some of them.
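The usual compatibility matrix for these five modes, written out as a Python sketch:

```python
# Rows are the mode already held on a granule; columns are the requested mode.
COMPATIBLE = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}
# Before locking a granule, a transaction must first hold the corresponding
# intention lock on every ancestor (e.g., IX on the table before X on a row).
```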
2.1.3.Deadlock Detection and Resolution
1: Copy the Wait-For graph
2: Do {
3:     If there is a node without successors, then remove it
4: } While a node has been removed
5: If Wait-For graph is not empty then
6:     Choose one node with predecessors (someone is waiting for it)
7:     Cancel the corresponding transaction
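A runnable version of this pseudocode, as an illustrative sketch (the graph is a plain dictionary; the victim-selection policy at step 6 simply picks any remaining transaction that someone waits for):

```python
def find_deadlock_victim(wait_for):
    """wait_for maps each transaction to the set of transactions it waits for."""
    graph = {t: set(s) for t, s in wait_for.items()}       # 1: copy the graph
    for succs in list(graph.values()):                     # make sure every
        for t in succs:                                    # transaction is a node
            graph.setdefault(t, set())
    removed = True
    while removed:                                         # 2-4: prune the graph
        removed = False
        for t in list(graph):
            if not graph[t]:                               # node without successors
                del graph[t]
                for succs in graph.values():
                    succs.discard(t)
                removed = True
    if graph:                                              # 5: a cycle remains
        waited_on = set().union(*graph.values())           # 6: has predecessors
        return next(t for t in graph if t in waited_on)    # 7: victim to cancel
    return None

# T1 and T2 wait for each other; T3 waits for T1 but is not part of the cycle.
print(find_deadlock_victim({"T1": {"T2"}, "T2": {"T1"}, "T3": {"T1"}}))  # T1
```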
2.1.4.Tuning Locks
2.2.Timestamping
2.3.Multi-version
2.4.Comparative Analysis of Concurrency Control Mechanisms
2.5.Transaction Chopping
1) Generate a piece and place there all writings that may be rolled back.
2) Generate a piece for each data access not in the first piece.
3) Generate the CS-graph (drawing only C-edges) considering those pieces and all other transactions as a whole.
4) Mark all pieces in the same connected component of that graph.
5) Check the precedences between the resulting pieces and join them as necessary.
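Steps 3 to 5 amount to grouping pieces by the connected components of the C-edge graph. A rough, hypothetical sketch using union-find (piece names and edges are illustrative; the precedence check of step 5 is omitted):

```python
def join_conflicting_pieces(pieces, c_edges):
    """Group pieces that fall in the same connected component of the C-edges."""
    parent = {p: p for p in pieces}
    def find(p):                       # union-find with path halving
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p
    for a, b in c_edges:               # merge the components of each C-edge
        parent[find(a)] = find(b)
    groups = {}
    for p in pieces:
        groups.setdefault(find(p), []).append(p)
    return list(groups.values())

# Example: pieces P1..P3 of one transaction; a C-edge links P2 and P3.
print(join_conflicting_pieces(["P1", "P2", "P3"], [("P2", "P3")]))
# -> [['P1'], ['P2', 'P3']]  (P2 and P3 must be executed as one piece)
```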
3.Transactions in Oracle
4.Distributed Concurrency Control
“A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.”
Tamer Özsu and Patrick Valduriez. Principles of Distributed Database Systems.
4.1.Distributed Locking
4.1.1.Deadlock Detection
4.2.Distributed Timestamping
“A message always arrives after it was sent.”
Lamport clocks axiom
- Every local action (i.e., read, write, commit, etc.) increases the clock (i.e., two actions cannot be done in less than one time unit).
- Every external message sets the local clock C to max(C, TS(Ti).T + 1), where Ti is the transaction generating the message.
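A minimal sketch of such a clock, assuming each site tags its timestamps with a node identifier N that is later used as a tiebreaker:

```python
class LamportClock:
    def __init__(self, node_id):
        self.t = 0
        self.n = node_id        # node identifier N, used to break ties

    def local_action(self):
        self.t += 1             # every local action increases the clock
        return (self.t, self.n)

    def receive(self, sender_t):
        # external message: C = max(C, TS(Ti).T + 1)
        self.t = max(self.t, sender_t + 1)
        return (self.t, self.n)
```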
| TS(T1).T | TS(T1).N | TS(T2).T | TS(T2).N | TS(T1) < TS(T2) |
|---|---|---|---|---|
| 4 | 1 | 5 | 2 | true |
| 5 | 1 | 4 | 2 | false |
| 4 | 1 | 4 | 2 | true |
| 4 | 2 | 4 | 1 | false |
| 4 | 1 | 4 | 1 | impossible |
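The comparison in the table is lexicographic: the clock values T are compared first, and the node identifier N breaks ties. A sketch reproducing the rows above:

```python
def ts_less(ts1, ts2):
    """Compare two (T, N) timestamps: T first, then N as tiebreaker."""
    assert ts1 != ts2, "impossible: a node never assigns the same T twice"
    return ts1 < ts2            # lexicographic comparison of (T, N) tuples

print(ts_less((4, 1), (5, 2)))  # True
print(ts_less((5, 1), (4, 2)))  # False
print(ts_less((4, 1), (4, 2)))  # True (tie on T broken by N)
print(ts_less((4, 2), (4, 1)))  # False
```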
5.Replica Management
- The top-left diagram shows a primary copy with synchronously maintained replicas. In this case we can read from any copy, but have to modify the primary one (which can be a performance bottleneck). No inconsistencies can be generated, because changes are immediately propagated to the replicas before committing the operation. If the propagation is not possible (for example, because some node containing a copy is not reachable at this time), the operation is rejected by the system.
- The top-right diagram shows a primary copy with asynchronously maintained replicas. We can read from any copy, but have to modify the primary one. The advantage of this approach is that we can optimize the propagation of changes (i.e., we can confirm the operation without the changes having been propagated to all copies). The propagation of changes to the replicas can wait for the appropriate time. However, the primary node bottleneck remains and, in the interval between the modification of the primary copy and its propagation to all replicas, some clients could get inconsistent results (i.e., two users would get a different value if their queries are answered retrieving data from different nodes).
- The bottom-left diagram shows that we do not have any primary copy, and all copies are synchronously modified. Now, modifications can be done at any replica (because all are actually modified at once) and no inconsistencies are obtained on reading, because all replicas have the same contents at any moment. Thus, users' queries can be answered by accessing the closest replica. The problem is that no modification can be committed until all replicas confirm they applied the changes.
- Finally, the bottom-right diagram shows that we do not have any primary copy and replicas are asynchronously maintained. Therefore, all bottlenecks disappear, because operations are always confirmed locally, but clients can get inconsistent results if two of them modify the same granule at the same time. When detected during the change propagation process, potential inconsistencies have to be reconciled (i.e., some copy must prevail over the others).
“We can only achieve two of Consistency, system Availability, and tolerance to network Partition.”
CAP theorem, Eric Brewer
- Strong consistency: After a modification has been committed, any user will read the same contents. Strong consistency can only be guaranteed if W + R > N (the quorum sketch after this list illustrates the arithmetic). This means that the user will check data from at least one node containing the right value. Thus, if discrepancies are detected among the R replicas retrieved, more replicas are accessed until we can decide which one is correct. Notice that, by potentially accessing all nodes, we will always know that the value repeated W times is the right one.
- Read One/Write All (ROWA): Users read any copy, and have to synchronously write all of them before committing (i.e., R = 1 and W = N). Read operations are prioritized over write ones, which turn out to be really slow and to have a low probability of succeeding in a highly distributed system prone to network partition.
- Fault-tolerant system: This corresponds to the configuration N = 3, R = 2, W = 2. This means that we keep three copies of each granule (in different nodes of the distributed system), and before reading or writing, users need to access two of these copies. Note that with this configuration the system is up even in the event of failure of one node.
- Weak consistency: The system does not guarantee that a subsequent read will return the updated value. This happens when W + R ≤ N. With massive replication for read scaling, some systems are configured with R = 1 and a relatively small value for W (e.g., 1 or 2). The time elapsed between the commit of the modification and the end of its propagation to enough copies is the inconsistency window.
- Primary copy: Users can read any copy, and there is one master copy they have to write before committing (i.e., R = 1 and W = 1). Changes in the master copy are eventually propagated to all copies.
- Eventually consistent: Users read some copies, and also write some copies before committing. All copies are eventually synchronized to converge to the same values. Notice that in this configuration, conflicting modifications can be committed in different nodes. If this is the case, they have to be reconciled as explained in the next section.
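The quorum arithmetic behind these configurations fits in a few lines. An illustrative sketch (just the N/R/W check, not a full replica-control protocol):

```python
def consistency(n, r, w):
    """Classify an (N, R, W) replica configuration by its consistency guarantee."""
    if w + r > n:
        return "strong"        # read and write quorums always overlap
    return "weak/eventual"     # an inconsistency window is possible

print(consistency(3, 1, 3))    # ROWA (R = 1, W = N): strong
print(consistency(3, 2, 2))    # fault-tolerant system: strong, survives one node
print(consistency(10, 1, 2))   # massive read scaling: weak/eventual
```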
5.1.BASE Transactions
Summary
Self-evaluation
2. The only change in the schedule is that, at time 180, T2 aborts (because it overwrote the value of granule E that T3 read), and consequently T3 also has to be aborted (because it read a value of E that never existed).
3. There is no change in the schedule and no transaction is aborted. After time 180, the versions are as follows:
| Granule | RTS | WTS |
|---|---|---|
| A0 | 3 | 0 |
| B0 | 1 | 0 |
| C0 | 1 | 0 |
| D0 | 2 | 0 |
| E0 | 2 | 0 |
| F0 | 1 | 0 |
| A1 | 3 | 3 |
| B1 | 3 | 1 |
| D1 | 2 | 2 |
4. We have two different cycles: T5–T1 that would be detected at site DBA; and T2–T4–T6 that would be detected at site DBC.
Glossary
- ACID
- This acronym defines a kind of transactions that guarantee the following properties: Atomicity, Consistency, Isolation and Durability.
- BoT
- Beginning of transaction
- Conflicting transactions
- Two transactions have a conflict when they access the very same granule and at least one of them modifies it (roughly speaking, it means that the relative execution order between them affects the result of the execution).
- Contention
- Reduction of the throughput of the system due to mutual exclusion.
- DB
- Database
- DDB
- Distributed Database
- DDBMS
- Distributed Database Management System
- DBMS
- Database Management System
- DDL
- Data Definition Language, refers to SQL statements that affect the schema of the tables (e.g., CREATE, DROP).
- DML
- Data Manipulation Language, refers to SQL statements that affect the data in the tables (mainly INSERT, UPDATE, and DELETE).
- EoT
- End of Transaction
- Granule
- Unit of data that the DBMS locks (e.g., records, blocks, files).
- NOSQL
- Not Only SQL
- Read Set (RS)
- Set of tuples read by a transaction.
- Read Timestamp (RTS)
- Timestamp of the last transaction that read a granule.
- Recoverability
- Area of study dealing with the behavior of a DBMS in case of power or hardware failure.
- SQL
- Structured Query Language
- Transaction
- Atomic execution unit of a DBMS
- Wait-For graph
- Directed graph where each node represents a transaction and each edge shows that a transaction is waiting for the other to free some granule.
- Write Set (WS)
- Set of tuples written by a transaction.
- Write Timestamp (WTS)
- Timestamp of the last transaction that wrote a granule.
- 2PL
- Two Phase Locking protocol
- 2PC
- Two Phase Commit protocol