My AI Coding Agent Kept Breaking — What I Changed
Six weeks ago, my AI coding agent produced garbage.
It wrote functions that compiled but did nothing. It passed tests for the wrong reasons. It fixed one bug but created three new ones.
I thought the agent was the problem. I was wrong. The problem was my own lack of discipline.
I use an AI agent for 40% of my engineering work. It handles refactors, test generation, and bug investigations. When my codebase was messy, the AI made that mess 3x worse.
AI does not replace discipline. It amplifies whatever you already have.
Here is how I changed my workflow to fix the output:
Tests must assert behavior, not state. Stop writing tests like "assert user is not None." That assertion passes for almost any return value, so it proves very little. A test should check specific data, like "assert user.email == expected_email." If the test is weak, the AI will exploit it.
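To make the difference concrete, here is a minimal sketch of a weak check and a specific check run against the same bug. The `User` class, `create_user`, and `create_user_buggy` are hypothetical names invented for illustration; they are not from my real codebase.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class User:
    name: str
    email: str


def create_user(name: str, email: str) -> User:
    # Hypothetical correct implementation.
    return User(name=name, email=email)


def create_user_buggy(name: str, email: str) -> User:
    # Hypothetical bug of the kind an agent can introduce:
    # the two arguments are swapped.
    return User(name=email, email=name)


def weak_test(factory: Callable[[str, str], User]) -> bool:
    # Only checks that something came back.
    user = factory("Ada", "ada@example.com")
    return user is not None


def strong_test(factory: Callable[[str, str], User]) -> bool:
    # Checks the specific data the caller depends on.
    expected_email = "ada@example.com"
    user = factory("Ada", expected_email)
    return user.email == expected_email
```

The weak check accepts both implementations. The specific check accepts only the correct one, which is the kind of signal the agent cannot game.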
Read every single diff. I used to accept refactors without looking at the code. This led to circular dependencies and messy architecture. If you cannot explain why a change is better, reject it.
Make state explicit. Do not let the AI "figure out" how to handle caches or sessions. Define these in your prompts or schemas. Inferred state leads to silent bugs that only surface in production.
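One way to do this, sketched under my own assumptions, is to put the cache into a small typed object and pass it (and the current time) into every function that touches it. `CacheState`, `cache_put`, and `cache_get` are hypothetical names for illustration, not a specific library's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CacheState:
    # The expiry rule is written down instead of left for the agent to infer.
    ttl_seconds: float
    # key -> (time stored, value)
    entries: Dict[str, Tuple[float, str]] = field(default_factory=dict)


def cache_put(state: CacheState, key: str, value: str, now: float) -> None:
    # The state and the clock are explicit arguments, not module globals.
    state.entries[key] = (now, value)


def cache_get(state: CacheState, key: str, now: float) -> Optional[str]:
    entry = state.entries.get(key)
    if entry is None:
        return None
    stored_at, value = entry
    if now - stored_at > state.ttl_seconds:
        # Expired entries are removed rather than silently served.
        del state.entries[key]
        return None
    return value
```

Because nothing is hidden in globals, the agent sees the whole contract in the function signatures, and a test can control time by passing `now` directly.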
Write your own tests for agent changes. Every time the agent changes code, I write a test myself before accepting the change. This costs 15 minutes but saves hours of debugging.
Demand loud failures. If a change passes tests but breaks logic, the system must flag it loudly instead of failing silently. Never accept "tests pass, ship it" as a sufficient standard.
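A minimal sketch of what a loud failure can look like, assuming a simple pricing function: the code raises on broken invariants instead of clamping or returning a default. `InvariantError` and `apply_discount` are hypothetical names for illustration.

```python
class InvariantError(Exception):
    """Raised when a value violates a rule the code depends on."""


def apply_discount(price_cents: int, percent: int) -> int:
    # Reject bad input loudly instead of quietly clamping it.
    if price_cents < 0:
        raise InvariantError(f"negative price: {price_cents}")
    if not 0 <= percent <= 100:
        raise InvariantError(f"percent out of range: {percent}")

    discounted = price_cents * (100 - percent) // 100

    # Check the result too, so a later edit that breaks the formula
    # fails here rather than downstream.
    if not 0 <= discounted <= price_cents:
        raise InvariantError(f"bad discounted price: {discounted}")
    return discounted
```

If a change breaks the arithmetic, the result check stops the run at the point of the mistake, even when the existing tests happen not to cover that input.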
The results:
- Bugs per week dropped from 5 to fewer than 1.
- Debugging time dropped from 6 hours to 1 hour per week.
The agent did not change. I changed.
If you use AI agents, fix your codebase first. Improve your tests, clarify your state, and tighten your reviews.
The agent is a mirror. Make sure you have something worth amplifying.
Source: https://dev.to/susiloharjo/my-ai-coding-agent-kept-breaking-what-i-changed-4l5f
