More broadly, I don't think a single definition of 'durable' (as in ACID D) for transactions is particularly useful. Much more useful is to ask "what kinds of failures could cause committed transactions to be lost?"
A transaction is not durable if it survives application crash but not OS crash. A committed transaction is either durable or not!

Aug 27, 2025 · 8:59 PM UTC

Depending on the database architecture, the answer could vary from "a strong breeze" to "the concurrent loss of multiple datacenters hundreds of miles apart".
Making this more subtle, in many architectures the answer differs for recent transactions and older ones. For example, architectures that use async replication offer durability that improves from 'survives a machine failure' to 'survives multiple machine failures' only some time after commit.
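To make that window concrete, here's a toy sketch (illustrative Python, not any real database's replication code, and all names are made up): the primary acknowledges as soon as the commit is locally durable, so for a short period only one machine's failure stands between the commit and data loss.

```python
import threading
import time

# Toy model of async replication: the primary acknowledges a commit once it is
# locally durable and ships it to the replica in the background. Between the
# ack and replication, losing the primary alone loses the commit; after
# replication, it takes the loss of both machines.

class Replica:
    def __init__(self):
        self.log = []

class Primary:
    def __init__(self, replica, replication_lag=0.5):
        self.log = []                        # locally durable log
        self.replica = replica
        self.replication_lag = replication_lag

    def commit(self, txn):
        self.log.append(txn)                 # durable on one machine only
        threading.Thread(target=self._replicate, args=(txn,)).start()
        return "ack"                         # client sees 'committed' now

    def _replicate(self, txn):
        time.sleep(self.replication_lag)     # async shipping lags the ack
        self.replica.log.append(txn)         # now durable on two machines

replica = Replica()
primary = Primary(replica)
primary.commit("txn-1")
print("on replica right after ack?", "txn-1" in replica.log)   # False: the vulnerable window
time.sleep(1.0)
print("on replica a moment later? ", "txn-1" in replica.log)   # True
```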
The classic ACID definition of D = sync-to-disk isn't a useful one for the modern system designer. Maybe not even a useful minimum bar.
Meeting a certain durability bar puts a lower bound on the possible commit latency for a database. In the DSQL design, we were very careful to parallelize the 1 RTT minimum latency for durability with the 1.5 RTT for concurrency control, so the total is 1.5 RTT rather than 2.5.
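This is not the DSQL implementation, just toy timing to show the arithmetic: if the two waits overlap, commit latency is bounded by the max of the two (1.5 RTT), not their sum (2.5 RTT).

```python
import asyncio
import time

RTT = 0.1  # pretend a round trip takes 100 ms, purely for illustration

async def make_durable():
    await asyncio.sleep(1.0 * RTT)   # 1 RTT minimum to make the commit durable

async def concurrency_control():
    await asyncio.sleep(1.5 * RTT)   # 1.5 RTT for the commit-time protocol

async def commit(parallel: bool) -> float:
    start = time.perf_counter()
    if parallel:
        # Overlap the two waits: total latency is the max, ~1.5 RTT
        await asyncio.gather(make_durable(), concurrency_control())
    else:
        # Run them back to back: total latency is the sum, ~2.5 RTT
        await make_durable()
        await concurrency_control()
    return (time.perf_counter() - start) / RTT

print("sequential commit: ~%.1f RTT" % asyncio.run(commit(parallel=False)))
print("parallel commit:   ~%.1f RTT" % asyncio.run(commit(parallel=True)))
```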
And remember that even the strongest durability doesn't protect against logical data loss (e.g. DELETE without WHERE) unless that's carefully designed into the system.
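A minimal sketch of the kind of mechanism that has to be designed in, assuming a toy append-only log (nothing here is a real system's API): replication alone would faithfully copy the bad delete to every replica, but keeping the log lets you replay to a point in time before the mistake.

```python
# Toy append-only log: every operation, including a mistaken "delete
# everything", is recorded in commit order, so earlier state can be
# reconstructed. Purely illustrative.

log = []          # (operation, key, value) tuples, in commit order

def apply(state, op):
    kind, key, value = op
    if kind == "put":
        state[key] = value
    elif kind == "delete_all":      # the DELETE-without-WHERE moment
        state.clear()
    return state

def commit(op):
    log.append(op)

def state_as_of(n):
    """Replay the first n log entries to recover an earlier state."""
    state = {}
    for op in log[:n]:
        apply(state, op)
    return state

commit(("put", "a", 1))
commit(("put", "b", 2))
commit(("delete_all", None, None))   # oops

print(state_as_of(len(log)))         # {} -- the mistake is fully durable
print(state_as_of(len(log) - 1))     # {'a': 1, 'b': 2} -- recoverable from the log
```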
Replying to @MarcJBrooker
“Durable” sounds absolute, but in real systems it’s always conditional. The better question is: under what failure modes does my commit guarantee break down?
– If your DB writes to disk but the disk has silent corruption → durability is gone.
– If you rely on replicas but they acknowledge before applying → a crash can roll you back.
– If durability depends on fsync but the OS lies about flushing → you only have “apparent” durability.
– Even cloud storage (S3, etc.) has durability SLAs, not absolutes.
Each system makes tradeoffs: some optimize for performance (accepting narrow windows of data loss), others for safety (sacrificing throughput to survive extreme failures).
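On the fsync point, a minimal POSIX-flavored sketch of the usual discipline (flush the file, then the directory entry so the file's existence is also durable); the durable_append helper is hypothetical, and even done correctly it only buys what the storage stack honestly delivers.

```python
import os

def durable_append(path: str, data: bytes) -> None:
    """Append data and ask the OS to flush it: a common, but not absolute,
    durability discipline. If the drive or OS lies about flushing, this still
    gives only 'apparent' durability."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)                 # flush the file's data and metadata
    finally:
        os.close(fd)
    # Also fsync the containing directory so the new entry itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

durable_append("commit.log", b"txn-1 committed\n")
```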