The Most Dangerous Line of AI Code
I pushed a breaking change to production. All tests passed. CI was green. The system did what I told it to do. It still broke.
I asked an AI agent to clean up an API response. The agent removed a phone number field because its value was null. The payload looked cleaner. An old Android app crashed. It needed the field to exist.
Here is the problem. AI often writes the code and the test in one go. The test no longer guards the code. The test mirrors the code.
If the agent changes a field, it updates the test to match. The test passes because it changed along with the behavior. I call these yes-man tests. They give you a green checkmark but no safety.
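Here is a minimal sketch of a yes-man test next to a test that still guards the code. It reuses the null phone field from the incident above. The function `build_user_response` and the field names are hypothetical, made up for illustration.

```python
def build_user_response(user: dict) -> dict:
    """Hypothetical serializer after the agent's cleanup: null fields are dropped."""
    return {key: value for key, value in user.items() if value is not None}


def yes_man_test() -> bool:
    """Written in the same change as the code, so it asserts the new behavior."""
    response = build_user_response({"id": 7, "name": "Ada", "phone": None})
    return response == {"id": 7, "name": "Ada"}


def guarding_test() -> bool:
    """Written before the change: every key old clients read must still exist."""
    response = build_user_response({"id": 7, "name": "Ada", "phone": None})
    return {"id", "name", "phone"}.issubset(response)


print("yes-man test passes:", yes_man_test())    # True
print("guarding test passes:", guarding_test())  # False
```

The first test is green because it was rewritten to agree with the change. The second fails because nobody let the change rewrite it.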
Breaking changes happen at the boundary between systems. The AI only sees your repo. It does not see the customer app. It does not see the partner API.
Stop relying on PR reviews for this. Humans overlook a missing key in a large diff. Computers do not.
Change your safety layer:
- Use a frozen contract. Diff output against a fixed spec.
- Separate test writing from code changes.
- Move breaking-change detection to CI.
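The frozen-contract idea can be sketched in a few lines. This is one possible shape, not a standard tool. `FROZEN_CONTRACT`, `contract_violations`, and the field names are assumptions for illustration. In practice the spec might be a JSON Schema or OpenAPI file that the agent is not allowed to edit.

```python
# Hypothetical frozen spec: checked in once, changed only by a deliberate human decision.
FROZEN_CONTRACT = {
    "id": int,
    "name": str,
    "phone": (str, type(None)),  # value may be null, but the key is required
}


def contract_violations(payload: dict, contract: dict) -> list[str]:
    """Return one message per missing key or wrong type; an empty list means compatible."""
    problems = []
    for key, expected in contract.items():
        if key not in payload:
            problems.append(f"missing key: {key}")
        elif not isinstance(payload[key], expected):
            problems.append(f"wrong type for {key}: {type(payload[key]).__name__}")
    return problems


before = {"id": 7, "name": "Ada", "phone": None}
after = {"id": 7, "name": "Ada"}  # the "cleaned up" payload

print(contract_violations(before, FROZEN_CONTRACT))  # []
print(contract_violations(after, FROZEN_CONTRACT))   # ['missing key: phone']
```

A CI step could fail the build whenever the list is not empty. Extra keys pass, since adding a field does not break old clients in this sketch.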
Writing code is no longer the bottleneck. Knowing if you broke a user is the bottleneck. Focus on verification.