𝗛𝗼𝘄 𝗮 𝗖𝗮𝗰𝗵𝗲𝗱 𝗥𝗲𝗮𝗰𝘁 𝗕𝘂𝗻𝗱𝗹𝗲 𝗦𝗲𝗻𝘁 𝗗𝗮𝘁𝗮 𝘁𝗼 𝘁𝗵𝗲 𝗪𝗿𝗼𝗻𝗴 𝗗𝗮𝘁𝗮𝗯𝗮𝘀𝗲
We hit a deadline. The backend team migrated to a new API and a new database. The frontend team updated environment variables in AWS Amplify and pushed the code.
The deployment was successful. We closed our laptops. We thought we were done.
We were wrong.
An engineer checked the logs on the old API server. This server was supposed to be dead. It was not. It was receiving real client requests and writing data to the old database.
For two hours, real client data went to the wrong place.
Here is why it happened and how we fixed it.
𝗧𝗵𝗲 𝗣𝗿𝗼𝗯𝗹𝗲𝗺
React apps served as static bundles, for example through AWS Amplify, resolve environment variables at build time, not at runtime. When you run a build, the bundler replaces every reference to an environment variable with a hardcoded string.
The API URL was baked into the JavaScript file as a string literal.
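To make the build-time replacement concrete, here is a simplified simulation of what a bundler does to the source. It is only a sketch: real bundlers operate on the syntax tree, not on a regex, and the variable name REACT_APP_API_URL and both URLs are hypothetical.

```typescript
// Simplified simulation of build-time env inlining.
// Real bundlers work on the parsed syntax tree; this regex only shows the effect.
function inlineEnv(source: string, env: Record<string, string>): string {
  return source.replace(
    /process\.env\.([A-Z0-9_]+)/g,
    (match: string, name: string) =>
      // Known variables become string literals; unknown ones are left untouched.
      name in env ? JSON.stringify(env[name]) : match
  );
}

// One line of application source that reads the API URL from the environment.
const appSource = 'fetch(process.env.REACT_APP_API_URL + "/orders");';

// Hypothetical URLs for the old and the new backend.
const oldBundle = inlineEnv(appSource, {
  REACT_APP_API_URL: "https://old-api.example.com",
});
const newBundle = inlineEnv(appSource, {
  REACT_APP_API_URL: "https://new-api.example.com",
});

// Each bundle now carries its own URL as a literal. Changing the environment
// variable later has no effect on a bundle that was already built.
console.log(oldBundle);
console.log(newBundle);
```

A tab that loaded the old bundle keeps the old literal until the page is reloaded, no matter what the environment variable is changed to afterwards.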
When we deployed, new users got the new bundle. But existing users with the app open never refreshed. They kept running the old bundle with the old URL hardcoded inside.
Because the old server was still running, these clients received a 200 OK status. Everything looked fine. The failure was silent. Silence is the most dangerous type of bug.
𝗧𝗵𝗲 𝗧𝗵𝗿𝗲𝗲-𝗟𝗮𝘆𝗲𝗿 𝗙𝗶𝘅
We built three layers to ensure this never happens again.
𝟭. 𝗥𝘂𝗻𝘁𝗶𝗺𝗲 𝗖𝗼𝗻𝗳𝗶𝗴𝘂𝗿𝗮𝘁𝗶𝗼𝗻 We stopped baking URLs into the JavaScript bundle. Instead, we use a config.json file in the public folder. The bundler does not touch this file. The app fetches this file at runtime before it renders. This ensures new sessions always get the correct URL.
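A minimal sketch of the runtime config step could look like this. The apiUrl field, the AppConfig type, and the function names are assumptions for illustration; the post does not show the actual shape of config.json.

```typescript
// Hypothetical shape of config.json; the real file may hold more fields.
interface AppConfig {
  apiUrl: string;
}

// Minimal fetch-like signature so the loader can be exercised without a browser.
type FetchLike = (
  url: string,
  init?: { cache?: "no-store" }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Validate the parsed JSON so a broken config fails loudly before rendering.
function parseConfig(data: unknown): AppConfig {
  if (typeof data !== "object" || data === null) {
    throw new Error("config.json must contain a JSON object");
  }
  const apiUrl = (data as { apiUrl?: unknown }).apiUrl;
  if (typeof apiUrl !== "string" || apiUrl.length === 0) {
    throw new Error("config.json is missing a non-empty string apiUrl");
  }
  return { apiUrl };
}

// Fetch config.json at startup, bypassing the browser's HTTP cache.
async function loadConfig(
  fetchFn: FetchLike,
  url: string = "/config.json"
): Promise<AppConfig> {
  const res = await fetchFn(url, { cache: "no-store" });
  if (!res.ok) {
    throw new Error(`Config request failed with status ${res.status}`);
  }
  return parseConfig(await res.json());
}
```

In the browser, the entry file would call loadConfig(fetch) and render the app only after the promise resolves, passing the returned apiUrl to whatever code builds API requests.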
𝟮. 𝗪𝗲𝗯𝗦𝗼𝗰𝗸𝗲𝘁 𝗡𝗼𝘁𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀 A runtime config does not help users with an open tab. We tied our deployment process to our WebSocket server. When Amplify finishes a build, it calls a webhook on our API. The server then pushes a message to all connected clients. If a user is on an old version, a banner appears asking them to refresh.
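The client side of that notification can be sketched as a small, pure decision function. The message shape ({ type: "deployed", version }) and the idea of comparing it with a version string stored in the running bundle are assumptions; the post does not specify the wire format.

```typescript
// Hypothetical message the server pushes after a deployment finishes.
interface DeployMessage {
  type: "deployed";
  version: string;
}

// Parse a raw WebSocket payload; return null for anything that is not
// a well-formed deployment message.
function parseDeployMessage(raw: string): DeployMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const msg = data as { type?: unknown; version?: unknown };
  return msg.type === "deployed" && typeof msg.version === "string"
    ? { type: "deployed", version: msg.version }
    : null;
}

// Show the refresh banner only when the deployed version differs from the
// version this tab is running.
function shouldPromptRefresh(raw: string, runningVersion: string): boolean {
  const msg = parseDeployMessage(raw);
  return msg !== null && msg.version !== runningVersion;
}
```

In the app, the WebSocket message handler would pass event.data and the bundle's own version to shouldPromptRefresh and show the banner when it returns true.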
𝟯. 𝗖𝗮𝗰𝗵𝗲 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁 We updated our CloudFront settings. Entry points like index.html and config.json are now set to no-cache. This ensures users always fetch the latest files instead of stale versions from a CDN edge node.
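The caching policy can be written down as a small lookup. This is a sketch of the rule, not our actual CloudFront configuration; the long-lived value for other files is an assumption that only makes sense if those files have content-hashed names.

```typescript
// Entry points that must always be revalidated (from the list above).
const ENTRY_POINTS = new Set<string>(["/", "/index.html", "/config.json"]);

// Return the Cache-Control value to serve for a given request path.
function cacheControlFor(path: string): string {
  if (ENTRY_POINTS.has(path)) {
    // no-cache: a cached copy may be stored but must be revalidated before use.
    return "no-cache";
  }
  // Assumption: every other asset has a content hash in its file name,
  // so a given URL never changes content and can be cached for a year.
  return "public, max-age=31536000, immutable";
}

console.log(cacheControlFor("/index.html"));
console.log(cacheControlFor("/static/js/main.3f2a1b.js"));
```

The same mapping would normally be expressed in the hosting layer's header rules instead of application code.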
𝗧𝗵𝗲 𝗟𝗲𝘀𝘀𝗼𝗻𝘀
• Build-time configuration is a trap for values that change between deployments.
• Silence is more dangerous than noise. Make old systems fail loudly with a 410 Gone status.
• Deadline pressure undermines manual steps. Automate your decommissioning process.
• Monitor what you turn off, not just what you turn on.
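As a sketch of the "fail loudly" lesson, a retired API could answer every request with 410 Gone instead of processing it. The response shape and message text below are hypothetical.

```typescript
// Minimal, framework-independent description of an HTTP response.
interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Build the response a retired API returns for any request.
// It never touches the database and tells the client what to do next.
function retiredApiResponse(method: string, path: string): SimpleResponse {
  return {
    status: 410, // Gone: the resource is intentionally and permanently unavailable.
    headers: {
      "Content-Type": "application/json",
      "Cache-Control": "no-store",
    },
    body: JSON.stringify({
      error: "gone",
      message: `${method} ${path} is retired. Reload the app to get the current version.`,
    }),
  };
}

console.log(retiredApiResponse("POST", "/orders").status);
```

Wired into the old server's request handler, this turns a silent 200 OK into an error that stale clients and monitoring can both see.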
Deployment is not just about pushing code. It is about making sure every client ends up running the right code.
Source: https://dev.to/sugan_dev/how-a-cached-react-bundle-sent-production-data-to-the-wrong-database-55n9