You write unit tests. You hit 90% code coverage. You deploy to production and sleep soundly. You shouldn’t.
SQLite is the most widely deployed database in the world. It lives in your smartphone, your browser, your car’s infotainment system. It is the definition of battle-tested infrastructure. And for 16 years, it harbored a fatal flaw that could silently corrupt your data.
We’ve built a digital civilization on the assumption that if our tests pass, our software is safe. That assumption is a lie.
Recently, the Canonical dqlite team discovered a rare, 16-year-old bug in SQLite’s Write-Ahead Log (WAL) checkpointing system. For more than a decade and a half, brilliant engineers ran millions of tests on this code. The bug survived every single one of them.
How did they finally find it? Not by writing more tests. They stopped running the code and started doing math.
They used TLA+, a formal specification language created by Leslie Lamport. TLA+ doesn’t execute code paths. It models the fundamental state transitions of a system. A model checker then explores every reachable state of that model and shows whether the system can enter a corrupted state, regardless of how unlikely the timing might be.
Testing tells you what the code does. Formal verification tells you what the code is allowed to do.
You’ve probably noticed this in your own work. Concurrency bugs are ghosts. They don’t show up when you run your suite locally. They appear at 3 AM on a Tuesday when two specific network requests hit your server at the exact right millisecond. Your tests are blind to these ghosts because tests can only check the scenarios you already imagined.
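To make the ghost concrete, here is a minimal sketch in Python. It is a toy, not anything taken from SQLite: two hypothetical workers, "A" and "B", each increment a shared counter in two steps (read, then write). The names STEPS, is_valid, and run are made up for illustration. The test you would naturally write runs one worker after the other and passes. Enumerating every interleaving shows what that test never looks at.

```python
from itertools import permutations

# Each hypothetical worker increments a shared counter in two steps:
# read the current value, then write back value + 1.
STEPS = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

def is_valid(schedule):
    """A schedule is valid if each worker reads before it writes."""
    return all(
        schedule.index((w, "read")) < schedule.index((w, "write"))
        for w in ("A", "B")
    )

def run(schedule):
    """Execute one interleaving and return the final counter value."""
    counter = 0
    local = {}
    for worker, op in schedule:
        if op == "read":
            local[worker] = counter        # remember what this worker saw
        else:
            counter = local[worker] + 1    # write back a possibly stale value
    return counter

# The "unit test" schedule: A finishes before B starts.
sequential_result = run(STEPS)

# Every interleaving that respects each worker's own step order.
schedules = [s for s in permutations(STEPS) if is_valid(s)]
lost_updates = [s for s in schedules if run(s) != 2]

print(f"sequential run: counter = {sequential_result}")
print(f"{len(lost_updates)} of {len(schedules)} interleavings lose an update")
```

With only four steps you can enumerate the schedules by hand. Real systems have far too many for that, which is why the enumeration gets handed to a tool.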
The SQLite bug was a ghost. It was a specific, highly unlikely sequence of state transitions in the WAL system that no human would ever think to write a test for. But it was physically possible within the logic of the code. And in concurrent and distributed systems running at scale, if a bad state is physically possible, it is inevitable.
If your software is critical infrastructure, running it and seeing if it breaks isn’t engineering. It’s gambling. You must mathematically prove its correctness.
A test is a snapshot of a moment you remembered to check. A proof is the boundary of what is physically possible.
The SQLite bug is patched now. But how many 16-year-old ghosts are hiding in your stack right now? How many silent corruptions are waiting for the perfect storm of concurrency?
Stop writing tests for problems you can imagine. Start proving your system against the ones you can’t.
FAQ
Q: Are you saying we should just stop writing tests?
A: No. Tests are great for catching regressions and verifying basic logic. But for complex, concurrent, or distributed systems, testing is fundamentally insufficient. You need tests for the code, and formal verification for the architecture.
Q: What is TLA+ and how does it actually work?
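A: TLA+ is a formal specification language. Instead of writing code and running it, you write a mathematical model of your system's states and transitions. You then use a model checker to exhaustively explore every reachable state of that model, checking that the system can never enter a 'bad' state. The result holds for the model, within the bounds you give the checker, which is why the model has to capture the logic that matters.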
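A rough sketch of what "explore every reachable state" means, written in Python rather than TLA+. Everything here is an assumption made for illustration: the model is a toy lock with a non-atomic "check, then take" sequence, not the SQLite WAL logic, and the names INITIAL, next_states, invariant, and check are hypothetical. A real model checker is far more capable, but the core loop looks something like this.

```python
from collections import deque

# Toy model (hypothetical, NOT the SQLite WAL logic): two processes guard a
# critical section with a non-atomic "check the lock, then take it" sequence.
# State = (pc of process 0, pc of process 1, lock_held)
INITIAL = ("idle", "idle", False)

def next_states(state):
    """Yield (action label, successor state) for every enabled transition."""
    pcs, lock = list(state[:2]), state[2]
    for i in (0, 1):
        pc = pcs[i]
        if pc == "idle" and not lock:
            new_pc, new_lock = "checked", lock    # saw the lock free
        elif pc == "checked":
            new_pc, new_lock = "critical", True   # takes it without rechecking
        elif pc == "critical":
            new_pc, new_lock = "idle", False      # releases the lock
        else:
            continue                              # idle but lock held: blocked
        succ = pcs.copy()
        succ[i] = new_pc
        yield f"p{i}: {pc} -> {new_pc}", (succ[0], succ[1], new_lock)

def invariant(state):
    """Safety property: both processes are never in the critical section."""
    return not (state[0] == "critical" and state[1] == "critical")

def check(initial):
    """Breadth-first search over reachable states. Returns a shortest
    counterexample trace, or None if the invariant holds everywhere."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parent[state] is not None:
                state, action = parent[state]
                trace.append(action)
            return trace[::-1]
        for action, succ in next_states(state):
            if succ not in parent:
                parent[succ] = (state, action)
                queue.append(succ)
    return None

trace = check(INITIAL)
print("invariant holds" if trace is None else "\n".join(trace))
```

Nobody wrote a test case for the failing schedule. The search found it because it visits every state the model allows, and it hands back the exact sequence of steps that leads there.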
Q: Is formal verification realistic for a normal startup?
A: Probably not for your entire codebase. But if you are building critical infrastructure (distributed consensus, payment ledgers, or state machines where silent data corruption is unacceptable), you can't afford *not* to formally verify the core logic.