When Users See Ghost Posts
Itâs 11 PM on a Friday. Sarah just finished writing a thoughtful comment on the companyâs internal discussion forum. She hits âPostâ, sees the success message, refreshes the page to admire her prose, and⊠nothing. Her comment has vanished.
She refreshes again. Still gone. Refreshes a third time - there it is! Finally. But those few seconds of panic made her wonder if the system was broken.
The database didnât lose anything. Sarah just discovered the hard way what happens when reads and writes hit different servers in a distributed system.
The Setup That Breaks
Hereâs the classic architecture that causes this mess:
User writes â Master DB (succeeds immediately)
User reads â Replica DB (still syncing... doesn't have the new data yet)
The master accepted Sarahâs comment and returned success. But that write takes a few hundred milliseconds to propagate to the read replicas. When Sarahâs browser refreshed and sent a read request, it hit a replica that hadnât caught up yet.
This propagation delay is called replication lag. Every distributed system has it. The question is whether you can tolerate it.
Why This Happens: Two Consistency Worlds
Eventual Consistency says: âGive me enough time with no new writes, and I promise all replicas will show the same data eventually.â
That âeventuallyâ could be 100 milliseconds. Could be 5 seconds. Could be longer if things get busy. The system doesnât block writes waiting for replicas to sync - it just shoves updates into the replication queue and moves on.
This is fast. Itâs scalable. Itâs also confusing as hell for users.
Read-After-Write Consistency (also called read-your-writes) says: âIf you wrote it, you should see it immediately when you read.â
This doesnât mean everyone sees your update instantly. Just you, the person who made the change. Other users can wait for eventual consistency to catch up. But you expect to see your own changes right away, and thatâs a reasonable expectation.
The Real-World Impact
This isnât just annoying. Itâs genuinely confusing:
- Post a tweet, refresh, itâs gone - âDid Twitter lose my post?â
- Update your profile photo, reload, old photo shows - âWhy didnât it save?â
- Transfer money between your accounts, check balance, transfer didnât happen - âDid my bank just steal $500?â
- Add items to shopping cart, go to checkout, cart is empty - âThis site is broken.â
Users donât understand replication lag. They think your system lost their data. And honestly, from their perspective, it did.
The Classic Fix: Pin Users to Master
The simplest solution? After someone makes a write, send all their reads to the master for a while:
// User just posted a comment
await db.master.insert(comment);
// Pin this user to master for 10 seconds
session.pinToMaster(userId, duration: 10000);
// Their next read goes to master, sees their comment
const posts = await db.master.query(userId);
The pinning window should be longer than your typical replication lag. If your replicas usually catch up within 2 seconds, pin for 10 seconds to be safe.
The problem: If your system is write-heavy, this defeats the entire purpose of read replicas. Everyoneâs pinned to the master all the time, and your master drowns in traffic.
The Smarter Fix: Fallback to Master
Donât pin users upfront. Instead, try the replica first. If the data isnât there, fall back to the master:
// Try replica first
let post = await db.replica.findById(postId);
if (!post) {
// Not found? Could be replication lag
// Check the master
post = await db.master.findById(postId);
}
return post;
This works great when:
- Replication lag is usually short
- âNot foundâ errors are rare (most queries find data)
- Youâre reading recent data that might not be replicated yet
It falls apart when youâre reading lots of data that legitimately doesnât exist. Every 404 becomes an expensive master query.
Note: This only catches new records. It wonât detect updates to existing records. Sarahâs comment would show up, but edits to that comment might still appear stale.
The Session Token Trick
This is the most elegant solution, but it requires your database to track versions. Hereâs how it works:
Every write operation gets a version number (or timestamp). When you write to the master, it returns this version along with the success response:
// Write returns a commit token
const result = await db.master.insert(comment);
const token = result.commitToken; // e.g., "replica-version:47293"
// Store token in user's session cookie or header
response.setHeader('X-Session-Token', token);
Now when Sarah refreshes the page, her browser sends that token back:
// Browser sends: X-Session-Token: replica-version:47293
const sessionToken = request.headers['X-Session-Token'];
// Load balancer picks a replica
const replica = pickRandomReplica();
// Replica checks: "Am I at least at version 47293?"
const posts = await replica.queryWithConsistency({
userId: userId,
minVersion: sessionToken
});
The replica has three choices:
- Itâs caught up (at version 47293 or higher): Serve the read immediately
- Itâs close (at version 47290): Wait a few milliseconds for replication to catch up, then serve
- Itâs way behind (at version 45000): Redirect this specific request to the master
The key insight: Sarahâs reads can hit any replica, not just one pinned replica. The token tells each replica âI need data at least this fresh.â If the replica canât guarantee that, it knows to either wait or bail out to the master.
Real-world example: Azure Cosmos DBâs session consistency works exactly like this. Each write returns a session token, and clients include that token in subsequent reads. Any replica in any region can serve the read, as long as itâs caught up to that version.
This scales much better than pinning users to specific servers because:
- Load balancers can distribute reads freely
- Replicas only redirect when theyâre genuinely behind
- No artificial 10-second pinning windows that might be too short or too long
When Eventual Consistency is Fine
Not everything needs read-after-write consistency:
- View counters: Who cares if the count is off by a few?
- Analytics dashboards: 30-second staleness wonât kill anyone
- News feeds showing othersâ content: You donât need instant updates on what strangers posted
- Product recommendations: Slight delays donât matter
But for your own data? Read-after-write is table stakes. Users must see their own changes immediately, or theyâll assume your system is broken.
The Underlying Tradeoff
Hereâs the fundamental tension in distributed systems:
Eventual consistency gives you speed and availability. Writes complete fast. Reads spread across many replicas. System stays up during network issues.
Strong consistency (where reads always return the latest write, even from other users) gives you correctness but sacrifices speed and availability. Every write might need to sync everywhere before completing.
Read-after-write is the middle ground. Your own writes are immediately visible to you. Everyone elseâs writes can be eventually consistent. Itâs the sweet spot for most user-facing applications.
What Actually Happened to Sarah
Sarahâs writes went to the master in us-east. Her reads were load-balanced to a replica in us-west with 300ms replication lag. Her browser refreshed twice before the replica caught up.
The fix? Pin her session to a nearby replica for 15 seconds after posting. Or better yet, include her session token with reads so any replica can check if itâs caught up enough to serve her request.
The system didnât lose her comment. But it made her think it did, which is almost worse.
Sometimes the most invisible part of your infrastructure is the most important: making sure people can see what they just did.