The Agent Fixing Bugs By Running Code
Most AI agents guess when they debug. They read source code and pick a line to change. They are confident but often wrong.
Source code is a hypothesis. The running program is the truth.
The fix is Harness Engineering. A harness is the system around the model. It includes tools and feedback loops.
A great harness uses an evidence loop:
- List hypotheses.
- Add logging.
- Reproduce the bug.
- Read logs.
- Write the fix.
- Remove logging.
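The loop above can be sketched as a small state machine that refuses to skip steps. This is a minimal sketch, not the author's implementation: the names `Phase` and `EvidenceLoop` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Phase(Enum):
    """The six steps of the evidence loop, in order (names are illustrative)."""
    LIST_HYPOTHESES = 1
    ADD_LOGGING = 2
    REPRODUCE = 3
    READ_LOGS = 4
    WRITE_FIX = 5
    REMOVE_LOGGING = 6


class EvidenceLoop:
    """Hypothetical helper: tracks which step is next and rejects out-of-order work."""

    def __init__(self):
        self._remaining = list(Phase)  # Enum iteration follows definition order
        self.notes = []

    @property
    def current(self):
        # The step that must be completed next, or None when the loop is done.
        return self._remaining[0] if self._remaining else None

    def complete(self, phase, note):
        # Refuse any step other than the one that is due.
        if phase is not self.current:
            expected = self.current.name if self.current else "nothing (loop finished)"
            raise RuntimeError(f"expected {expected}, got {phase.name}")
        self.notes.append((phase, note))
        self._remaining.pop(0)
```

With this in place, an agent that tries to jump from listing hypotheses straight to `WRITE_FIX` gets a `RuntimeError` instead of a chance to guess.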
One rule is vital. Never log and edit in one pass.
If you mix them, you lose the signal. You will not know whether the logging or the fix changed the result.
Treat logging as a probe. Treat code changes as interventions. Keep them separate.
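One way to keep probes and interventions separate is to check each patch before applying it. The sketch below assumes every logging line the agent adds carries a marker comment. The marker `# AGENT-LOG` and the function `classify_patch` are hypothetical, chosen only for this example.

```python
LOG_MARKER = "# AGENT-LOG"  # hypothetical tag the agent puts on every probe line


def classify_patch(added_lines):
    """Label a patch by the lines it adds.

    Returns "probe" if every added line is tagged logging,
    "intervention" if none are, "mixed" if both appear,
    and "empty" if the patch adds nothing but blank lines.
    """
    meaningful = [line for line in added_lines if line.strip()]
    if not meaningful:
        return "empty"
    tagged = [line for line in meaningful if LOG_MARKER in line]
    if len(tagged) == len(meaningful):
        return "probe"
    if not tagged:
        return "intervention"
    return "mixed"


def accept_patch(added_lines):
    # A mixed patch logs and edits in one pass, so it is rejected.
    return classify_patch(added_lines) in ("probe", "intervention")
```

A harness could call `accept_patch` on each proposed change and send mixed patches back to the agent to be split in two.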
You are the key. You trigger the bug. You click the button. The agent does the tedious work. You provide the reality.
Build your own evidence loop with these steps:
- Pick one log format the agent knows.
- Force the agent to log first and fix second.
- Make your reproduction loop fast.
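For the first step, a single log format could be one JSON object per line, tagged so the agent can find its own output among everything else the program prints. This is a sketch under assumptions: the field names, the `AGENT-LOG` tag, and both function names are hypothetical, not a format the source prescribes.

```python
import json


def format_probe(hypothesis, location, **values):
    """Build one log line that ties an observed value to a hypothesis."""
    record = {
        "tag": "AGENT-LOG",        # hypothetical tag used to find probe lines later
        "hypothesis": hypothesis,  # e.g. "H1"
        "location": location,      # e.g. "cart.py:42"
        "values": values,          # whatever was observed at that point
    }
    return json.dumps(record, sort_keys=True)


def read_probes(log_text):
    """Pull the agent's probe records out of mixed program output."""
    probes = []
    for line in log_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ordinary program output, not a probe line
        if isinstance(record, dict) and record.get("tag") == "AGENT-LOG":
            probes.append(record)
    return probes
```

Because every probe names its hypothesis, reading the logs becomes a lookup: each record either supports or rules out one item on the hypothesis list.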
Stop trying to make the model smarter. Build a better harness.
Source: https://dev.to/shreyshah/the-agent-that-fixes-bugs-by-running-the-code-4lbg