Two-phase Locking (2PL)
- Optimistic vs Pessimistic concurrency control
- Optimistic
- Low contention on shared objects
- Relies on retries and rollbacks (which may incur overhead on highly contended data objects); conflict handling lives in application-side logic
- Pessimistic
- Acquire locks on required objects for exclusive access then do the operations
2PL is Pessimistic
Race conditions
- Occur when multiple clients access the same database records concurrently
Concurrency problems
- Multiple clients may simultaneously write to the database, overwriting one another’s modifications.
- Data that has only been partially updated may be seen by a client and can be confusing.
- Race conditions among clients can result in unexpected bugs.
Transactions
- A transaction is a means for an application to logically group multiple reads and writes into a single unit
- If a transaction fails, it is safe to retry
- Transactions make error handling considerably simpler for an application, since partial failure (some operations succeed while others fail) is not a concern
Serializable isolation
- We use isolation to separate concurrently running transactions from one another, so they cannot interfere with each other
- Serializability is the ability of each transaction to act as though it is the only one running against the entire database
Rules
- The lock manager checks each incoming operation request for lock conflicts against the locks already held by running operations
- The lock manager also sends approved operations to the database manager for execution, and keeps a record of the acquired locks for further conflict lookups until the operation completes
- The lock manager can’t release any lock for any operation until the database manager has marked that operation completed (abort or committed) and notified the lock manager
- This rule ensures that the database manager performs operations on a data item in the same order that the scheduler sends them
- The lock manager will not allow any transaction to acquire any more locks for any data item once it has released even a single lock
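The last rule is the defining property of 2PL: a growing phase in which a transaction only acquires locks, followed by a shrinking phase in which it only releases them. A minimal Python sketch of that rule (class and method names are illustrative, not from any real DBMS):

```python
# Hypothetical sketch: a transaction that enforces the two-phase rule itself.
# Once any lock is released, the transaction enters its shrinking phase and
# may never acquire another lock.

class TwoPhaseViolation(Exception):
    pass

class TwoPhaseTransaction:
    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.held = set()        # objects currently locked
        self.shrinking = False   # set once the first lock is released

    def acquire(self, obj):
        # Growing phase only: acquiring after any release violates 2PL.
        if self.shrinking:
            raise TwoPhaseViolation(
                f"{self.txn_id} cannot lock {obj} after releasing a lock")
        self.held.add(obj)

    def release(self, obj):
        # The first release ends the growing phase permanently.
        self.shrinking = True
        self.held.discard(obj)

t = TwoPhaseTransaction("T1")
t.acquire("A")
t.acquire("B")   # still growing: allowed
t.release("A")   # shrinking phase begins
try:
    t.acquire("C")               # rejected by the two-phase rule
except TwoPhaseViolation:
    print("rejected")            # → rejected
```

In a real system this check lives in the lock manager rather than the transaction object, but the invariant is the same.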
Requirements
- Shared lock
- Lock shared by multiple transactions reading data from an object in the db
- Known as read-only lock, since transactions can only read the object by acquiring this lock and can’t update it
- Exclusive lock
- An exclusive lock allows a single transaction to read and write an object in the database
- No two transactions are allowed to interact with the same object simultaneously
FR
- 2PL requires a shared lock for reading.
- The only limitation is that if an exclusive lock is already placed on the object, transactions trying to access it must wait for the previous transaction to complete (commit or abort).
- 2PL requires an exclusive lock for writing.
- If the object already has a lock placed on it, either shared or exclusive, the transaction must wait until the previous transaction finishes, since this phase doesn’t allow more than one transaction to hold the lock for the same object
- Lock upgrade - a transaction needs to upgrade its lock from shared to exclusive if it reads and then writes an object.
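The compatibility rules above, including the shared-to-exclusive upgrade, can be sketched as a toy lock table (all names are hypothetical; a real lock manager would also block and queue the waiters):

```python
# Illustrative lock-table sketch. Shared ("S") locks are compatible with each
# other; an exclusive ("X") lock is compatible with nothing. A transaction
# that is the SOLE shared holder of an object may upgrade its lock to exclusive.

class LockTable:
    def __init__(self):
        # obj -> {"mode": "S" or "X", "holders": set of txn ids}
        self.locks = {}

    def can_grant(self, obj, txn, mode):
        entry = self.locks.get(obj)
        if entry is None:
            return True                       # unlocked object
        if mode == "S" and entry["mode"] == "S":
            return True                       # shared locks coexist
        # Upgrade S -> X allowed only for the sole shared holder.
        if mode == "X" and entry["mode"] == "S" and entry["holders"] == {txn}:
            return True
        return False                          # otherwise the requester waits

    def grant(self, obj, txn, mode):
        if not self.can_grant(obj, txn, mode):
            return False                      # caller queues the request
        entry = self.locks.setdefault(obj, {"mode": mode, "holders": set()})
        entry["mode"] = mode
        entry["holders"].add(txn)
        return True

lt = LockTable()
print(lt.grant("A", "T1", "S"))      # → True  (first reader)
print(lt.grant("A", "T2", "S"))      # → True  (readers coexist)
print(lt.can_grant("A", "T1", "X"))  # → False (T2 also holds a shared lock)
```

If T2 had released its shared lock first, T1's upgrade request would succeed, which is exactly the "lock upgrade" case above.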
Architecture
- A db that supports transactions
- MySQL, PostgreSQL
- A lock manager
- central managing component for all the locking and unlocking approvals
- Once a transaction requests a lock, the lock manager must be able to look into its table of existing locks for a match. If a conflicting lock is found, it must queue the new request and serve it later in the designated order.
Handling of deadlocks
Deadlocks detection
- A waits-for graph is generated from the transactions: an edge from T1 to T2 means T1 is waiting for a lock that T2 holds
- A cycle means a deadlock
- The DBMS will choose a victim transaction to roll back when it notices a deadlock to break the loop.
- When choosing a victim, there are many transaction properties to consider
- By age (newest or oldest timestamp)
- In order of query execution progress (least/most)
- Based on the number of locked objects
- Based on the number of transactions that would need to be rolled back along with it
- The number of times a transaction has previously been restarted
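The detection step can be sketched with a DFS over the waits-for graph, stored here as an adjacency dict (assumed representation; T1 → T2 means T1 waits for a lock T2 holds). Any back edge found during the traversal closes a cycle, i.e., a deadlock:

```python
# Sketch of deadlock detection on a waits-for graph.
# waits_for: dict mapping each transaction to the transactions it waits on.

def find_cycle(waits_for):
    """DFS with three-color marking; returns one cycle as a list, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on stack / done
    color = {t: WHITE for t in waits_for}
    stack = []

    def dfs(t):
        color[t] = GRAY
        stack.append(t)
        for u in waits_for.get(t, []):
            if color.get(u, WHITE) == GRAY:        # back edge: cycle found
                return stack[stack.index(u):]
            if color.get(u, WHITE) == WHITE:
                cycle = dfs(u)
                if cycle:
                    return cycle
        stack.pop()
        color[t] = BLACK
        return None

    for t in list(waits_for):
        if color[t] == WHITE:
            cycle = dfs(t)
            if cycle:
                return cycle
    return None

# T1 waits for T2, T2 for T3, T3 for T1: a deadlock.
print(find_cycle({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))
# → ['T1', 'T2', 'T3']
```

Having the cycle in hand, the DBMS would then apply one of the victim-selection policies above to pick which member to roll back.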
Deadlock prevention
- When a transaction tries to acquire a lock that another transaction currently holds, we must take some action to avoid a deadlock. In general, the lock manager checks whether granting a new lock request would result in a deadlock; if it would, the lock manager simply does not execute that transaction
- While waiting for a lock, a transaction can take only one of two paths: wait or abort. There are specific schemes to achieve deadlock prevention. They operate on timestamps assigned to the transactions, which define priorities, e.g., older means higher priority.
- Wait-Die (“Old waits for Young”)
- Wound-Wait (“Young waits for Old”)
- Wound-Wait can provide low latency for high-priority transactions because they don’t have to wait for lower-priority transactions to complete
- However, it comes at the cost of wasted computation, which happens when an aborting transaction throws away its work and re-does it later.
- When a transaction restarts, its old timestamp becomes its (new) priority
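When a requester meets a lock holder, both schemes reduce to a timestamp comparison. An illustrative sketch (function names are mine; lower timestamp = older = higher priority):

```python
# Sketch of the two timestamp-based deadlock-prevention schemes.
# Timestamps: lower value = older transaction = higher priority.

def wait_die(requester_ts, holder_ts):
    # "Old waits for Young": an older requester waits; a younger requester
    # dies (aborts), restarting later with its ORIGINAL timestamp so it
    # eventually becomes the oldest and cannot starve.
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    # "Young waits for Old": an older requester wounds (aborts) the younger
    # holder and takes the lock; a younger requester waits.
    return "wound" if requester_ts < holder_ts else "wait"

# T5 (older) requests a lock held by T9 (younger):
print(wait_die(5, 9))    # → wait   (older requester waits politely)
print(wound_wait(5, 9))  # → wound  (older requester preempts the holder)
```

The "wound" branch is where the wasted computation mentioned above comes from: the aborted younger transaction throws away its work and redoes it later.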
Shortcomings
- Poor throughput and high query response time
- Reduced concurrency - one transaction has to wait for the preceding one to finish
- Databases place no limit on transaction duration
- A long-running transaction can starve others waiting on its locks
- Unstable latencies
- A single slow transaction, or one that accesses numerous objects and acquires multiple locks simultaneously, can clog the whole transaction pipeline
- Frequent deadlocks
- More frequent in 2PL
- Bottleneck - retries
- 2PL is sufficient to ensure conflict serializability on its own: it produces schedules with an acyclic precedence graph. However, it is vulnerable to cascading aborts, where the failure of one transaction forces the rollback of others that depended on it, wasting a lot of resources and effort.