Introduction
Concurrency control is crucial in database management systems (DBMS) to ensure data consistency and integrity when multiple transactions execute simultaneously. However, improper handling of concurrent transactions can lead to anomalies such as Dirty Read, Non-Repeatable Read, and Phantom Read. Understanding these anomalies helps in choosing the appropriate isolation level to maintain data consistency.
1. Dirty Read
Definition:
A Dirty Read occurs when a transaction reads uncommitted changes made by another transaction. If the other transaction rolls back, the read data becomes invalid or “dirty.”
Example:
Consider two transactions, T1 and T2 operating on the accounts
table.
- T1 updates the balance of account A from 5000 to 7000 but has not yet committed.
- T2 reads the balance as 7000.
- T1 rolls back, reverting the balance back to 5000.
- T2 now has incorrect data, as the balance it read (7000) never actually existed in a committed state.
-- Transaction T1 BEGIN TRANSACTION; UPDATE accounts SET balance = 7000 WHERE id = 1; -- No COMMIT yet -- Transaction T2 SELECT balance FROM accounts WHERE id = 1; -- Reads 7000 (dirty read) -- Transaction T1 ROLLBACK ROLLBACK;
Solution:
To prevent dirty reads, use the Read Committed or higher isolation level.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
2. Non-Repeatable Read
Definition:
A Non-Repeatable Read occurs when a transaction reads the same row twice but gets different values because another transaction modified and committed changes between the two reads.
Example:
- T1 reads the balance of account A as 5000.
- T2 updates the balance of account A to 7000 and commits.
- T1 reads the balance of account A again and gets 7000.
- The two reads in T1 return different results, leading to a non-repeatable read anomaly.
-- Transaction T1 BEGIN TRANSACTION; SELECT balance FROM accounts WHERE id = 1; -- Reads 5000 -- Transaction T2 BEGIN TRANSACTION; UPDATE accounts SET balance = 7000 WHERE id = 1; COMMIT; -- Transaction T1 (Continued) SELECT balance FROM accounts WHERE id = 1; -- Reads 7000 (non-repeatable read)
Solution:
To prevent non-repeatable reads, use the Repeatable Read or higher isolation level.
To prevent non-repeatable reads, use the Repeatable Read or higher isolation level.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
3. Phantom Read
Definition:
A Phantom Read occurs when a transaction executes the same query multiple times and gets different results due to insertions or deletions by other transactions.
Example:
- T1 executes
SELECT COUNT(*) FROM accounts WHERE balance > 5000
and gets 5 rows. - T2 inserts a new row with a balance of 6000 and commits.
- T1 executes the same
SELECT
again and gets 6 rows.
-- Transaction T1 BEGIN TRANSACTION; SELECT COUNT(*) FROM accounts WHERE balance > 5000; -- Returns 5 -- Transaction T2 BEGIN TRANSACTION; INSERT INTO accounts (id, balance) VALUES (10, 6000); COMMIT; -- Transaction T1 (Continued) SELECT COUNT(*) FROM accounts WHERE balance > 5000; -- Returns 6 (phantom read)
Solution:
To prevent phantom reads, use the Serializable isolation level, which locks the range of rows affected.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Conclusion
Understanding concurrency anomalies helps in selecting the right isolation level to balance data consistency and performance. While higher isolation levels prevent anomalies, they also introduce overhead and reduce concurrency. Choosing the appropriate level depends on the application’s consistency requirements and performance trade-offs.