𝗗𝟭 𝗥𝗲𝗮𝗱 𝗥𝗲𝗽𝗹𝗶𝗰𝗮𝘀 𝗛𝗮𝗱 𝟲 𝗦𝗲𝗰𝗼𝗻𝗱𝘀 𝗼𝗳 𝗟𝗮𝗴
A D1 read replica in Tokyo fell 6.1 seconds behind a write in North America.
I learned this when a tracker started throttling the wrong impressions. The documentation mentions eventual consistency, but it does not give you a specific lag to plan for.
I built a staleness probe to find the real numbers. The probe writes a row with a UUID and an epoch. It polls the replica until the row appears. It then records the delay.
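A minimal sketch of that probe loop, not the full script. The `ProbeTarget` interface, its two methods, and the poll interval and timeout defaults are hypothetical names and values chosen for illustration. In a real Worker the two methods would wrap D1 queries against the primary and against a replica.

```typescript
// Hypothetical interface: the two methods stand in for D1 queries
// against the primary and against a read replica.
interface ProbeTarget {
  writeToPrimary(id: string, writtenAt: number): Promise<void>;
  existsOnReplica(id: string): Promise<boolean>;
}

interface ProbeResult {
  id: string;
  lagMs: number | null; // null: the row never appeared before the timeout
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function probeStaleness(
  target: ProbeTarget,
  pollIntervalMs = 50, // assumed default, tune to the precision you need
  timeoutMs = 30_000, // assumed default
): Promise<ProbeResult> {
  // Write a uniquely identifiable row stamped with the current epoch.
  const id = crypto.randomUUID();
  const writtenAt = Date.now();
  await target.writeToPrimary(id, writtenAt);

  // Poll the replica until the row shows up, then record the delay.
  const deadline = writtenAt + timeoutMs;
  while (Date.now() < deadline) {
    if (await target.existsOnReplica(id)) {
      return { id, lagMs: Date.now() - writtenAt };
    }
    await sleep(pollIntervalMs);
  }
  return { id, lagMs: null };
}
```

The measured lag includes the write round trip and up to one poll interval of error, so a shorter interval gives tighter numbers at the cost of more replica reads.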
Results from 200 probes in Asia:
- p50: 800ms
- p95: 3,400ms
- p99: 6,100ms
Expect lag on this scale if your primary is in North America and your users are in Asia.
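For reference, percentiles like these can be computed from the raw probe delays with the nearest-rank method. This is a sketch of one common definition. The post does not say which percentile method was used, so treat the choice as an assumption.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift for integer p.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}
```

With 200 samples, p99 under this definition is the 198th smallest delay, so only the two slowest probes sit above it.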
I also faced a schema error. A migration ran on the primary. A Worker restarted. The first requests hit a replica before the new table arrived. The error said the table did not exist. The table existed on the primary, but the replica had not applied the migration yet.
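One way to tolerate that window is to retry against the primary when the replica reports a missing table. This is a sketch, and not necessarily the migration pattern from the full post. The function names are hypothetical, and it assumes the SQLite "no such table" text is surfaced in the error message.

```typescript
type Query<T> = () => Promise<T>;

// Assumption: the underlying SQLite message ("no such table: ...")
// is visible in the thrown error's message.
function isMissingTableError(err: unknown): boolean {
  return err instanceof Error && /no such table/i.test(err.message);
}

// Try the replica first; only a missing-table error triggers the
// fallback to the primary. Every other error is rethrown unchanged.
async function readWithSchemaFallback<T>(
  fromReplica: Query<T>,
  fromPrimary: Query<T>,
): Promise<T> {
  try {
    return await fromReplica();
  } catch (err) {
    if (isMissingTableError(err)) return fromPrimary();
    throw err;
  }
}
```

The fallback costs one cross-region round trip, but only for the few requests that land before the replica has the new schema.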
I solved this by routing around the lag instead of fighting it.
Here is my design:
- The writer adds a written_at epoch to the row.
- The writer adds an X-D1-Written-At header to the response.
- The reader compares that header to the written_at of the row it gets from the replica.
- If the replica row is missing or older than the header, the reader falls back to KV.
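The reader side of this design can be sketched as follows. Several things here are assumptions for illustration: the `Row` shape, the function names, that the client sends the X-D1-Written-At value back on later reads, and that the writer also stored the row in KV. The two lookup functions are injected, so no specific D1 or KV call is implied.

```typescript
interface Row {
  value: string;
  written_at: number; // epoch set by the writer
}

// True when the replica row is at least as new as the write the
// client already knows about. With no usable header there is
// nothing to compare against, so the replica is trusted.
function replicaIsFresh(row: Row | null, writtenAtHeader: string | null): boolean {
  if (writtenAtHeader === null) return true;
  const required = Number(writtenAtHeader);
  if (!Number.isFinite(required)) return true; // malformed header: treat as absent
  if (row === null) return false; // the write has not reached the replica
  return row.written_at >= required;
}

type Lookup = (key: string) => Promise<Row | null>;

async function readFlag(
  key: string,
  writtenAtHeader: string | null,
  readReplica: Lookup, // hypothetical wrapper around a D1 replica query
  readKv: Lookup, // hypothetical wrapper around a KV get
): Promise<{ row: Row | null; source: "d1-replica" | "kv" }> {
  const row = await readReplica(key);
  if (replicaIsFresh(row, writtenAtHeader)) {
    return { row, source: "d1-replica" };
  }
  // Replica is behind: use KV, or keep the stale row if KV has nothing.
  const fromKv = await readKv(key);
  return fromKv !== null
    ? { row: fromKv, source: "kv" }
    : { row, source: "d1-replica" };
}
```

Returning the source alongside the row makes it easy to count how often the fallback fires, which is a direct measure of how often readers land inside the lag window.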
KV reads complete in under 500ms in the same region. The Workers Free plan allows 100,000 KV reads per day, and the paid plan includes 10 million per month. This makes KV a cheap way to get fresh data for critical flags.
You only use KV during the short window when the replica is behind. Most reads hit D1 normally once the replica catches up.
I shared the full script and the migration pattern in my detailed post.