𝗟𝗲𝘀𝘀𝗼𝗻𝘀 𝗳𝗿𝗼𝗺 𝗮 𝟭𝟬𝟵-𝗮𝗴𝗲𝗻𝘁 𝗰𝗼𝗱𝗲 𝗮𝘂𝗱𝗶𝘁 𝘄𝗼𝗿𝗸𝗳𝗹𝗼𝘄

I spent 9.3M tokens on a code audit so you do not have to.

I used a swarm of AI agents to find bugs in a 5,000-line codebase. The system used mappers, finder lenses, deduplication, and adversarial verification.

It worked. I got 32 verified findings and a clean top-10 list. But it cost $46 in API fees. Most of that money was wasted.

Here is what went wrong:

• Verification was too expensive. 86 of the 109 agents were verifiers, and they caught only 2 errors. I paid to re-read code 86 times for a rejection rate of about 6% (2 of the 34 findings checked).

• Mapping was redundant. The finders re-read the code anyway. The map phase was an extra tax.

• Finders overlapped. There was 30% overlap between the 8 lenses.

• Formatting wasted money. Pretty-printed JSON bloated every prompt by 40%.

• Cache reads were high. Every agent re-read the same files from scratch.
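
To get a feel for the formatting overhead, here is a minimal sketch that compares a pretty-printed payload with a compact one. The finding record and its field names are made up for illustration, and the comparison counts characters, not tokens, so the exact saving in a real prompt will differ.

```python
import json

# Hypothetical finding record; the field names are illustrative only.
finding = {
    "id": "F-001",
    "file": "src/auth.py",
    "line": 42,
    "severity": "high",
    "summary": "Token compared with == instead of a constant-time check",
}
findings = [finding] * 20

# Pretty-printed: indentation and a newline for every field.
pretty = json.dumps(findings, indent=2)

# Compact: no spaces after separators, no newlines.
compact = json.dumps(findings, separators=(",", ":"))

saved = 1 - len(compact) / len(pretty)
print(f"pretty: {len(pretty)} chars, compact: {len(compact)} chars, saved: {saved:.0%}")
```

Both strings decode to the same data, so the compact form costs nothing in meaning. Measure on your own payloads before assuming a specific percentage.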

How to fix your AI workflows:

• Rank before you verify. Collect findings, deduplicate them, and rank them. Verify only the top 15. This uses 70% fewer agents.

• Match paranoia to stakes. Use one verifier for internal audits. Use a full panel only for findings that require real action.

• Batch verification by file. If 34 findings live in 10 files, assign one verifier per file so each file is read once. Do not make several verifiers re-read the same file.

• Skip mappers for small repos. If the code is under 10,000 lines, one agent can read it all.

• Limit your lenses. Use six lenses max. Give each lens clear boundaries so they do not repeat work.

• Compact your JSON. Do not add extra spaces or newlines when you serialize JSON into prompts.

• Use cheaper models for chores. Use frontier models for logic. Use cheap models for deduplication and evidence checking.

• Set a token budget. Have your orchestrator check the remaining budget before starting new tasks.
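
The token budget check from the last point could look roughly like this. It is a sketch under assumptions: `TokenBudget`, `run_tasks`, and the per-task estimates are hypothetical names and numbers, and a real orchestrator would record actual usage reported by the API instead of the estimate.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Tracks how many tokens the whole run may still spend."""
    limit: int
    spent: int = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.spent

    def can_afford(self, estimate: int) -> bool:
        return estimate <= self.remaining

    def record(self, used: int) -> None:
        self.spent += used


def run_tasks(tasks, budget):
    """tasks is a list of (name, estimated_tokens); returns (started, skipped)."""
    started, skipped = [], []
    for name, estimate in tasks:
        if budget.can_afford(estimate):
            # A real orchestrator would dispatch the agent here and
            # record the tokens it actually used.
            budget.record(estimate)
            started.append(name)
        else:
            skipped.append(name)
    return started, skipped


budget = TokenBudget(limit=1_000_000)
tasks = [
    ("find", 400_000),
    ("dedupe", 100_000),
    ("verify", 600_000),
    ("report", 50_000),
]
started, skipped = run_tasks(tasks, budget)
print(started, skipped, budget.remaining)
```

In this run the verify task is skipped because it would exceed what is left, while the cheaper report task still fits. Whether to skip, shrink, or abort at that point is a policy choice the sketch leaves open.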

The lesson is simple: Fan out to find, but converge before you verify. Breadth is for discovery. Rigor is for the survivors.
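
As a rough illustration of "converge before you verify", the sketch below deduplicates raw findings, ranks them, keeps the top N, and groups the survivors by file so each file needs one verifier pass. The record shape (`file`, `line`, `kind`, `score`), the dedup key, and the `converge` function are assumptions for the example, not the workflow from the audit.

```python
from collections import defaultdict


def converge(findings, top_n=15):
    """Deduplicate, rank, keep the top_n findings, and group them by file."""
    # Deduplicate on (file, line, kind), keeping the highest-scored copy.
    best = {}
    for f in findings:
        key = (f["file"], f["line"], f["kind"])
        if key not in best or f["score"] > best[key]["score"]:
            best[key] = f

    # Rank by score, highest first, and keep only the survivors.
    ranked = sorted(best.values(), key=lambda f: f["score"], reverse=True)
    survivors = ranked[:top_n]

    # One batch per file: a single verifier reads the file once
    # and checks every surviving finding in it.
    batches = defaultdict(list)
    for f in survivors:
        batches[f["file"]].append(f)
    return dict(batches)


raw = [
    {"file": "a.py", "line": 10, "kind": "sqli", "score": 9},
    {"file": "a.py", "line": 10, "kind": "sqli", "score": 7},  # same bug, second lens
    {"file": "a.py", "line": 55, "kind": "race", "score": 6},
    {"file": "b.py", "line": 3, "kind": "leak", "score": 8},
    {"file": "c.py", "line": 1, "kind": "style", "score": 1},
]
batches = converge(raw, top_n=3)
for path, items in batches.items():
    print(path, [(f["line"], f["kind"]) for f in items])
```

Five raw findings become three survivors in two files, so two verifier calls replace five. How findings are scored is left out here; in practice that ranking step is where a cheaper model or a simple severity heuristic could be used.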

Source: https://dev.to/ayoubzulfiqar/lessons-from-a-109-agent-code-audit-workflow-4a5m
