𝗜 𝗚𝗮𝘃𝗲 𝗖𝗹𝗮𝘂𝗱𝗲 𝗖𝗼𝗱𝗲 𝘁𝗵𝗲 𝗞𝗲𝘆𝘀. 𝗦𝗼 𝗗𝗶𝗱 𝗮 𝗪𝗼𝗿𝗺.
AI coding agents are not being jailbroken. They are doing exactly what you built them to do. They use your credentials to run commands. The problem is that attackers can supply the input.
Recent vulnerabilities show three different ways this happens.
- The Supply Chain Worm: A worm called Mini Shai-Hulud hit over 170 packages. It does not just steal keys and leave. It writes itself into your config files. It hides in .vscode/tasks.json or .claude/settings.json. These files can run code automatically when you open a folder or start a session. Even if you delete the bad package, the malicious hook stays on your disk.
- The Allowlist Bypass: The Cursor editor uses an allowlist to make auto-run safe. Attackers found a way around it using shell built-ins like export. By using prompt injection, an attacker makes the agent set a poisoned environment variable. This makes an approved command behave in a way you never intended. The security control failed because it was built for humans, not machines.
- The Protocol Flaw: The mcp-remote proxy has a critical command injection flaw. If you connect to a malicious MCP server, it can execute commands on your machine during the handshake. This happens because the client trusts the server it reaches out to.
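To make the allowlist bypass concrete, here is a minimal Python sketch. It is not Cursor's actual implementation. The allowlist, the set of "trusted" built-ins, and both check functions are hypothetical names made up for illustration. The naive policy only looks at the first word and waves shell built-ins through. The stricter one refuses anything that changes the environment or chains commands.

```python
import shlex

# Hypothetical allowlist of commands an agent may auto-run.
ALLOWLIST = {"git", "ls", "npm"}

# Assumption for illustration: a naive policy treats shell built-ins as harmless.
IMPLICITLY_TRUSTED_BUILTINS = {"export", "cd", "echo"}

# Built-ins that alter the environment or execution context (not exhaustive).
ENV_MUTATING_BUILTINS = {"export", "declare", "typeset", "source", ".",
                         "eval", "alias", "unset"}

# Characters that chain, redirect, or substitute commands in a POSIX shell.
SHELL_METACHARACTERS = set(";|&<>`$(){}\n")


def naive_check(command: str) -> bool:
    """Approve when the first word is allowlisted or a 'harmless' built-in."""
    words = shlex.split(command)
    return bool(words) and (
        words[0] in ALLOWLIST or words[0] in IMPLICITLY_TRUSTED_BUILTINS
    )


def is_env_assignment(word: str) -> bool:
    """True for a leading NAME=value prefix such as GIT_PAGER=./payload.sh."""
    name, sep, _ = word.partition("=")
    return bool(sep) and name.isidentifier()


def stricter_check(command: str) -> bool:
    """Reject chaining, substitution, and any change to the environment."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        words = shlex.split(command)
    except ValueError:  # unbalanced quotes: refuse rather than guess
        return False
    if not words:
        return False
    if words[0] in ENV_MUTATING_BUILTINS or is_env_assignment(words[0]):
        return False
    return words[0] in ALLOWLIST


if __name__ == "__main__":
    # Step 1 poisons the environment, step 2 is the "approved" command.
    for cmd in ("export GIT_PAGER=./payload.sh", "git log"):
        print(f"{cmd!r}: naive={naive_check(cmd)} stricter={stricter_check(cmd)}")
```

Under the naive policy both steps are approved, so the later git log would run whatever GIT_PAGER points to. The stricter check blocks the first step. It is still only a sketch: a real control would also need to handle the environment the agent inherits at startup.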
The core issue is simple. A coding agent erases the line between data and commands. An LLM sees instructions and outside data as the same thing. There is no boundary between what you say and what the world says to the agent.
How to protect yourself:
- Use short-lived tokens instead of long-lived keys in your environment variables.
- Turn off auto-run for any task that touches secrets or production.
- Watch your config files like .claude/settings.json for unexpected changes.
- Treat provenance attestations as proof of origin, not proof of safety.
- Pin your dependencies to specific hashes.
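For the config-file tip, a rough Python sketch of what "watching" could look like: record a SHA-256 baseline of the files named above, compare later, and list any VS Code tasks set to run on folder open. The function names and the watched list are my own assumptions, and the parser assumes tasks.json is strict JSON (VS Code also accepts comments, which json.loads rejects).

```python
import hashlib
import json
from pathlib import Path

# Files named in this post; extend the tuple for your own setup.
WATCHED = (".claude/settings.json", ".vscode/tasks.json")


def snapshot(root, watched=WATCHED):
    """Map each watched relative path to its SHA-256 digest, or None if absent."""
    digests = {}
    for rel in watched:
        path = Path(root) / rel
        digests[rel] = (
            hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
        )
    return digests


def changed_files(baseline, current):
    """Return watched paths whose digest differs from the baseline.

    A file that appears or disappears counts as a change.
    """
    return sorted(rel for rel in current if baseline.get(rel) != current[rel])


def folder_open_tasks(tasks_json_text):
    """Return labels of VS Code tasks configured with runOn: folderOpen."""
    data = json.loads(tasks_json_text)
    labels = []
    for task in data.get("tasks", []):
        if task.get("runOptions", {}).get("runOn") == "folderOpen":
            labels.append(task.get("label", "<unlabeled>"))
    return labels


if __name__ == "__main__":
    # Take a baseline in a known-good state, then re-run after installs.
    baseline = snapshot(".")
    print("changed since baseline:", changed_files(baseline, snapshot(".")))
```

In practice you would store the baseline somewhere the agent cannot write to, and review any reported change before opening the folder or starting a session again.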
Treat your AI agent like any other high-privilege process. It needs strict boundaries.
If you run agents in auto-run mode, how do you decide when to let them work and when to stop them?
Source: https://dev.to/kkierii/i-gave-claude-code-the-keys-so-did-a-worm-34a4