SQLite for Durable Workflows: Ditch the Message Brokers

2026-05-30 14:00
Tags: SQLite, durable workflows, job queue, state machine, LiteFS

Every time I see a startup using Kafka for their job queue, I know two things: their deploy is slow, and their AWS bill is too high.

You've been told you need Kafka, RabbitMQ, or PostgreSQL for every workflow task. It's a lie designed to sell you infrastructure you don't need.

Why do we reach for complex workflow infrastructure? Because we've been conditioned to think that durability requires a cluster. That's cargo-cult engineering.

SQLite's atomic commit means crash recovery comes nearly for free. If your process dies mid-write, SQLite guarantees that each transaction is either fully written or fully rolled back. Wrap every state change in a transaction and your workflow state stays consistent. No corruption. No manual repair.
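Here is a minimal sketch of that all-or-nothing behavior using Python's standard `sqlite3` module. The `advance` helper, the `fail` flag, and the trimmed-down table are assumptions made for illustration, and a raised exception stands in for a real process crash (after a real crash, SQLite performs the rollback the next time the file is opened).

```python
import sqlite3

# In-memory database for the demo; a real workflow store would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflows (id TEXT PRIMARY KEY, state TEXT, retries INT)")
conn.execute("INSERT INTO workflows VALUES ('wf-1', 'pending', 0)")
conn.commit()


def advance(conn, workflow_id, new_state, fail=False):
    """Hypothetical helper: change state and bump retries in ONE transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE workflows SET state = ? WHERE id = ?",
                (new_state, workflow_id),
            )
            if fail:
                # Stand-in for the process dying between the two writes.
                raise RuntimeError("simulated failure mid-transaction")
            conn.execute(
                "UPDATE workflows SET retries = retries + 1 WHERE id = ?",
                (workflow_id,),
            )
    except RuntimeError:
        pass  # the first UPDATE has already been rolled back


QUERY = "SELECT state, retries FROM workflows WHERE id = 'wf-1'"

advance(conn, "wf-1", "running", fail=True)
state_after_failure = conn.execute(QUERY).fetchone()  # half-done write is gone

advance(conn, "wf-1", "running")
state_after_success = conn.execute(QUERY).fetchone()  # both writes landed
```

The point of the sketch is that the row never shows the first update without the second: either both are visible or neither is.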

This isn't theoretical. Square's original POS terminal ran SQLite as its workflow engine. Every transaction, every retry, every state transition lived in a single file. It processed millions of transactions without a message broker.

Mobile apps use SQLite for offline-first workflows. Sync happens when the network returns. The same pattern applies to server-side batch jobs.

LiteFS takes this further. It replicates SQLite across machines with a single writer and read replicas. You get horizontal read scaling without leaving the SQLite ecosystem; writes still go through one node.

With WAL mode and writes batched into transactions, SQLite can handle on the order of 10,000 writes per second on a Raspberry Pi. On a modern server, 50,000+ writes per second is normal. Your workflow doesn't need that throughput? Then you definitely don't need Kafka.
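Throughput figures like these depend heavily on how many rows share a commit, so it is worth measuring on your own hardware. The sketch below is one rough way to do that, not a rigorous benchmark: the `timed_inserts` helper, the `events` table, and the row counts are all made up for illustration.

```python
import os
import sqlite3
import tempfile
import time


def timed_inserts(path, n, batch):
    """Insert n rows, committing every `batch` rows; return (row_count, rows_per_sec)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # write-ahead logging
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)")
    conn.commit()

    start = time.perf_counter()
    for offset in range(0, n, batch):
        size = min(batch, n - offset)
        with conn:  # one transaction (one commit) per batch
            conn.executemany("INSERT INTO events (body) VALUES (?)", [("x",)] * size)
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    conn.close()
    return count, n / elapsed


tmp = tempfile.mkdtemp()
one_by_one = timed_inserts(os.path.join(tmp, "single.db"), 500, batch=1)
batched = timed_inserts(os.path.join(tmp, "batched.db"), 500, batch=500)
print(f"commit per row:   {one_by_one[1]:,.0f} rows/s")
print(f"commit per batch: {batched[1]:,.0f} rows/s")
```

Committing per batch is usually much faster than committing per row, because each commit has to reach durable storage. The actual numbers vary with disk, filesystem, and `PRAGMA synchronous` settings.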

A SQLite transaction on local storage can commit in roughly 100 microseconds. Kafka's p99 latency is measured in milliseconds. For most durable workflows, that difference doesn't matter. But the operational simplicity does.

When NOT to use SQLite: multi-writer scenarios and multi-region deployments. If you need concurrent writers across datacenters, SQLite's single-writer model will bottleneck you. That's when you reach for PostgreSQL or FoundationDB.

But here's the thing: most teams that reach for distributed databases don't have multi-region workloads. They have a single region and 10 services that all need to write to the same queue. Put that queue on one host, where SQLite serializes the writes through a single file, and that's a SQLite job.

The deeper truth: durable workflows are a state management problem, not an infrastructure problem. You need to persist a state machine. SQLite is the best state machine store ever built.

Think about it. Your workflow engine needs to track: current state, available transitions, retry count, timeout. That's a database schema, not a queue topology.
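As a sketch of what that looks like in practice, the transition rules can live in a plain dictionary while the current state and retry count live in a row. The state names, the `TRANSITIONS` map, `MAX_RETRIES`, and the `transition` function below are all hypothetical, and timeouts are left out to keep it short.

```python
import sqlite3

# Hypothetical transition map: current state -> states it may move to.
TRANSITIONS = {
    "pending": {"running"},
    "running": {"done", "failed", "pending"},  # back to "pending" means retry
    "done": set(),
    "failed": set(),
}
MAX_RETRIES = 3  # assumed retry budget


def transition(conn, workflow_id, new_state):
    """Apply a transition if the map allows it; return True if a row was updated."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT state, retries FROM workflows WHERE id = ?", (workflow_id,)
        ).fetchone()
        if row is None:
            return False
        state, retries = row
        if new_state not in TRANSITIONS.get(state, set()):
            return False
        if state == "running" and new_state == "pending":
            if retries >= MAX_RETRIES:
                new_state = "failed"  # retry budget exhausted
            else:
                retries += 1
        conn.execute(
            "UPDATE workflows SET state = ?, retries = ? WHERE id = ?",
            (new_state, retries, workflow_id),
        )
        return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE workflows (id TEXT PRIMARY KEY, state TEXT, retries INT)")
conn.execute("INSERT INTO workflows VALUES ('wf-1', 'pending', 0)")
conn.commit()

applied = transition(conn, "wf-1", "running")  # pending -> running is allowed
skipped = transition(conn, "wf-1", "running")  # running -> running is not in the map
```

Keeping the rules in data rather than in queue topology means a new state is a one-line change to the map.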

With SQLite, your workflow becomes a simple table: `CREATE TABLE workflows (id TEXT PRIMARY KEY, state TEXT, payload BLOB, retries INT, next_run_at DATETIME)`. Poll it. Process it. Update it. That's it.
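The poll, process, update loop can be sketched in a few dozen lines. This version makes several assumptions the text above does not: `next_run_at` holds Unix epoch seconds, the state names are `pending`, `running`, and `done`, failed jobs are retried after a fixed delay, and `claim_next` and `run_once` are hypothetical helper names.

```python
import sqlite3
import time

SCHEMA = """CREATE TABLE IF NOT EXISTS workflows (
    id TEXT PRIMARY KEY, state TEXT, payload BLOB,
    retries INT, next_run_at DATETIME)"""


def claim_next(conn, now):
    """Claim one due workflow; return (id, payload), or None if nothing was claimed."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM workflows "
            "WHERE state = 'pending' AND next_run_at <= ? "
            "ORDER BY next_run_at LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        # The state guard means only one worker wins if two race for the same row.
        cur = conn.execute(
            "UPDATE workflows SET state = 'running' "
            "WHERE id = ? AND state = 'pending'",
            (row[0],),
        )
        return row if cur.rowcount == 1 else None


def run_once(conn, handler, now=None, retry_delay=30.0):
    """Poll once: claim a job, run handler(payload), record the outcome."""
    now = time.time() if now is None else now
    job = claim_next(conn, now)
    if job is None:
        return False
    workflow_id, payload = job
    try:
        handler(payload)
    except Exception:
        # Put the job back in the queue and push its next run into the future.
        with conn:
            conn.execute(
                "UPDATE workflows SET state = 'pending', "
                "retries = retries + 1, next_run_at = ? WHERE id = ?",
                (now + retry_delay, workflow_id),
            )
    else:
        with conn:
            conn.execute(
                "UPDATE workflows SET state = 'done' WHERE id = ?",
                (workflow_id,),
            )
    return True
```

A worker process would call `run_once(conn, handler)` in a loop and sleep briefly whenever it returns `False`. A job whose worker dies while it is `running` stays in that state, so a real system would also need a timeout sweep to put such rows back to `pending`.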

No consumer groups. No partition rebalancing. No offset management. No 200-line YAML config files.

Watch for SQLite replication tools like Litestream and LiteFS to keep maturing. The trend is clear: simpler infrastructure for most workloads, distributed only when you must.

Next time someone tells you to spin up Kafka for a job queue, ask them: "How many writes per second do we need?" If the answer is under 50,000, reach for SQLite. Your future on-call self will thank you.

Alex Chen

Alex covers tech, finance, and the intersection of business and policy. Previously at TechCrunch and The Information.
