Most engineers can recite the four SQL isolation levels from memory. Fewer can define precisely which anomalies each level permits and which it forbids. This gap between familiarity and formal understanding leads to subtle, catastrophic bugs—invariants silently violated under concurrent load, financial totals that drift by pennies per hour, or inventory counts that occasionally go negative despite application-level checks.
The original ANSI SQL-92 definitions of isolation levels were, as Berenson et al. demonstrated in their landmark 1995 critique, ambiguous and incomplete. They failed to account for anomalies like write skew entirely and defined existing anomalies in terms of lock-based implementations rather than observable phenomena. This conflation of mechanism and specification has propagated confusion for three decades. Engineers assume their database's "repeatable read" matches the textbook definition, when in practice implementations diverge significantly—PostgreSQL's repeatable read is snapshot isolation, MySQL's InnoDB repeatable read uses next-key locking, and neither behaves identically to the ANSI specification.
What follows is a rigorous treatment of isolation levels grounded in anomaly prevention rather than implementation mechanism. We define each anomaly precisely, analyze where snapshot isolation diverges from true serializability, and compare the performance characteristics of lock-based versus multi-version concurrency control implementations. The goal is to give you a formal framework for reasoning about which isolation level your workload actually requires—not the one you habitually configure by default.
Anomaly Taxonomy: What Each Isolation Level Actually Prevents
A dirty read occurs when transaction T₂ reads a value written by transaction T₁ before T₁ commits. If T₁ subsequently aborts, T₂ has acted on data that never logically existed. Read Committed eliminates this anomaly by ensuring transactions only observe committed state. The bug this prevents is straightforward: without it, a transfer moving $500 between accounts could expose an intermediate state where the money exists in neither account, causing a downstream balance check to trigger a false overdraft.
A non-repeatable read (also called a fuzzy read) occurs when T₁ reads a row, T₂ modifies and commits that row, and T₁ re-reads the same row to find a different value. Repeatable Read prevents this. The practical consequence is more insidious than it appears. Consider a transaction that reads a user's permission level, performs a long computation, then reads the permission again to authorize a write. If the permission was revoked between reads, the transaction may authorize an action against a stale security check.
A phantom read occurs when T₁ executes a predicate-based query (e.g., SELECT WHERE salary > 100000), T₂ inserts or deletes a row satisfying that predicate and commits, and T₁ re-executes the same query to find a different result set. Note the distinction: non-repeatable reads concern modifications to existing rows, while phantoms concern the appearance or disappearance of rows matching a predicate. Serializable isolation eliminates phantoms.
Write skew is the anomaly the ANSI specification entirely missed. It occurs when two transactions each read an overlapping dataset, make disjoint writes based on what they read, and both commit—producing a state that neither transaction would have permitted had it seen the other's write. The canonical example: two doctors are on call, each checks that at least two doctors are on-call before requesting leave. Both see two doctors, both withdraw, and the invariant that at least one doctor remains on-call is violated. No individual row was written by both transactions, so row-level conflict detection misses it entirely.
Ordering these anomalies by the strength of isolation required to prevent them gives us the hierarchy: Read Uncommitted (permits everything), Read Committed (prevents dirty reads), Repeatable Read (additionally prevents fuzzy reads), Serializable (additionally prevents phantoms and write skew). The critical insight is that this hierarchy describes observable behavior guarantees, not implementation strategies. Any mechanism that provides the correct observable guarantees satisfies the level, regardless of whether it uses locks, timestamps, or validation.
TakeawayAn isolation level is defined by the set of anomalies it provably prevents, not by the mechanism it uses. If you cannot enumerate the specific anomalies your application must not exhibit, you cannot choose the correct isolation level.
Snapshot Isolation: The Serializability Gap
Snapshot isolation (SI) provides each transaction with a consistent snapshot of the database as of the transaction's start time. Reads never block writes, writes never block reads, and write-write conflicts are detected at commit time via a first-committer-wins rule: if two concurrent transactions modify the same row, the second to commit is aborted. This makes SI remarkably appealing for read-heavy workloads—it provides a consistent view without the overhead of shared read locks.
SI prevents dirty reads, non-repeatable reads, and phantom reads. By every criterion in the ANSI SQL-92 specification, it appears to satisfy Serializable. This is precisely the source of the confusion. SI does not prevent write skew, and write skew is not in the ANSI taxonomy. Many systems—PostgreSQL prior to version 9.1, Oracle's default isolation—provide SI and label it as something between Repeatable Read and Serializable, depending on the vendor. Engineers who trust the label get burned.
Consider a concrete write skew scenario in a banking system. The invariant is that the sum of two accounts belonging to the same customer must not go below zero. Account A holds $100, Account B holds $100. Transaction T₁ reads both accounts (total = $200), then withdraws $200 from Account A. Concurrently, T₂ reads both accounts (total = $200), then withdraws $200 from Account B. Under SI, both transactions see the same snapshot, both satisfy the invariant check, and both commit. The result: Account A = -$100, Account B = -$100. The invariant is violated by $200. No write-write conflict occurred—T₁ wrote A, T₂ wrote B—so SI's conflict detection was never triggered.
Fekete et al. formalized this gap by proving that SI admits exactly those non-serializable histories that contain a dangerous structure: a cycle in the serialization graph where two consecutive edges represent read-write (rw) dependencies, meaning both transactions read something the other would later write. This characterization is the foundation of Serializable Snapshot Isolation (SSI), implemented in PostgreSQL 9.1+. SSI monitors the serialization graph at runtime and aborts transactions that would complete a dangerous structure, adding serializability guarantees on top of SI's performance benefits.
The engineering implication is stark. If your application has invariants spanning multiple rows or tables—and almost every non-trivial application does—snapshot isolation alone is insufficient unless you add application-level conflict materialization. Techniques like materializing the conflict (introducing a dummy row that both transactions write, forcing a write-write conflict) work but shift the correctness burden from the database to the developer. SSI eliminates this burden at the cost of a modest increase in abort rates for transactions that exhibit the dangerous structure pattern.
TakeawaySnapshot isolation looks like serializability but permits write skew—the one anomaly that corrupts multi-row invariants. Whenever your correctness depends on constraints spanning more than a single row, verify whether your database actually provides serializability or merely snapshot isolation under a reassuring name.
Implementation Strategies: Locks vs. MVCC Under Load
Lock-based isolation (two-phase locking, 2PL) enforces serializability by requiring transactions to acquire shared locks for reads and exclusive locks for writes, with the constraint that no lock is released before all locks are acquired. Strict 2PL further requires that all locks are held until commit. This guarantees serializability but introduces direct contention: readers block writers, writers block readers, and long-running analytical queries can starve short transactional writes. Under workloads with high read-write contention on the same data, throughput degrades sharply as lock wait times dominate transaction latency.
MVCC-based isolation avoids this contention by maintaining multiple versions of each row. Readers access the version consistent with their snapshot timestamp and never acquire locks on data. Writers create new versions rather than modifying existing ones in place. This decouples reads from writes entirely, yielding dramatically higher throughput for mixed read-write workloads. The cost is paid in storage overhead for version maintenance, garbage collection of obsolete versions, and the complexity of version chain traversal during reads—particularly for long version chains on frequently updated rows.
The performance trade-off is workload-dependent and non-obvious. For write-contended workloads where many transactions update the same small set of rows, 2PL and MVCC converge in behavior: 2PL serializes access via locks, MVCC serializes via first-committer-wins aborts and retries. The abort-and-retry overhead of MVCC can exceed the lock-wait overhead of 2PL when contention is extreme because aborted transactions waste all computation performed before the abort. Benchmarks on TPC-C-like workloads show that under moderate contention, MVCC outperforms 2PL by 2–5×, but under extreme contention on hotspot rows, the gap narrows or inverts.
Predicate locking—required to prevent phantoms under 2PL—deserves special attention. A true predicate lock covers all rows satisfying a predicate, including rows that do not yet exist. Because evaluating arbitrary predicates for overlap is computationally expensive, most systems approximate predicate locks with index-range locks (next-key locking in InnoDB). These lock a range of index entries, preventing inserts into the range. The approximation is conservative: it may block transactions that would not actually produce a phantom, reducing concurrency. Under workloads with frequent range scans and concurrent inserts into those ranges, index-range locking becomes a significant bottleneck.
SSI represents a third strategy: optimistic execution with post-hoc conflict detection. Transactions execute without blocking, the system tracks read-write dependencies, and transactions are aborted only when a dangerous structure is detected. This yields MVCC's read performance while providing true serializability. The cost is a higher abort rate compared to SI alone—typically 1–5% of transactions under realistic workloads—plus the memory overhead of tracking dependency information. For workloads where read-write conflicts are infrequent but correctness is non-negotiable, SSI offers the best trade-off: the vast majority of transactions commit without blocking, and only the rare conflicting few pay the retry penalty.
TakeawayNo isolation implementation dominates across all workload patterns. The optimal choice depends on your contention profile: MVCC excels under moderate read-write contention, 2PL can outperform under extreme write hotspots, and SSI provides serializability at near-SI performance when conflicts are infrequent.
Database isolation levels are specifications of which anomalies a system guarantees will not occur. Treating them as anything less precise—as vague notions of "consistency" or vendor-specific labels—leads to bugs that surface only under concurrent load, resist reproduction, and violate invariants silently.
The practical framework is this: enumerate your application's invariants, determine which anomalies could violate them, select the weakest isolation level that prevents those specific anomalies, then verify that your database's implementation of that level matches the formal definition. The gap between naming and behavior—particularly around snapshot isolation and serializability—is where production incidents hide.
Correctness requirements dictate the isolation level. The workload profile dictates the implementation. Conflating these two decisions is the root of most isolation-related engineering failures. Keep them separate, reason about each precisely, and the system will behave as specified under load.