Blocked Is Not Failed: Agents Need Boundary Feedback
Most agent setups treat a blocked action as a tool failure.
An agent calls a tool. The request breaks a rule. The system returns a generic error. The tool call fails.
This seems fine at first. The unsafe action was stopped. But this only solves half the problem.
A generic error does not help an agent work within its limits. It turns a policy decision into noise. The agent might try to guess a fix. It might repeat the same mistake or try a different payload. This creates a loop of useless retries.
A blocked action should be a structured decision, not an unexpected crash.
When a request is blocked, the external system must not change. However, the response must tell the agent how to proceed safely.
Instead of a simple error, use a structured response.
Imagine an agent tries to write to a file that changed while it was planning. A generic error says "failed." A structured response says:
- Decision status: conflict
- Outcome status: no impact
- Reason: stale state
- Next action: re-read target state
Now the agent knows the goal is not impossible. It only needs to update its information. It stops guessing and takes the correct next step.
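The stale-file case above can be sketched as a small data type. This is a minimal illustration, not a reference implementation: the names `BoundaryResponse`, `Decision`, and `guarded_write`, the version-number comparison, and the exact status strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(str, Enum):
    # Hypothetical decision statuses for this sketch.
    ALLOWED = "allowed"
    CONFLICT = "conflict"


@dataclass(frozen=True)
class BoundaryResponse:
    decision: Decision  # what the boundary decided
    outcome: str        # what happened to the external system
    reason: str         # machine-readable reason code
    next_action: str    # the safe step the agent can take next


def guarded_write(current_version: int, planned_version: int) -> BoundaryResponse:
    """Decide whether a write may proceed (assumes targets carry a version number).

    The function only returns a decision; it never touches the target itself.
    """
    if current_version != planned_version:
        # The target changed while the agent was planning: block without impact.
        return BoundaryResponse(
            decision=Decision.CONFLICT,
            outcome="no_impact",
            reason="stale_state",
            next_action="re_read_target_state",
        )
    return BoundaryResponse(
        decision=Decision.ALLOWED,
        outcome="applied",
        reason="ok",
        next_action="continue",
    )
```

Calling `guarded_write(current_version=3, planned_version=2)` returns a conflict with `next_action` set to `re_read_target_state`, so the agent can branch on a field instead of parsing an error string.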
This works for many scenarios:
- If a path is out of scope, suggest an allowed path.
- If an effect already exists, suggest reusing the outcome.
- If the impact is too high, suggest waiting for human review.
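One way to picture the scenarios above is a lookup from reason code to suggested next step. The reason codes, the `blocked` helper, and the fallback `stop_and_report` are hypothetical names chosen for this sketch.

```python
# Hypothetical reason codes mapped to the safe next step suggested to the agent.
NEXT_ACTION = {
    "path_out_of_scope": "use_allowed_path",
    "effect_already_exists": "reuse_existing_outcome",
    "impact_too_high": "wait_for_human_review",
}


def blocked(reason: str, **hints: str) -> dict:
    """Build a blocked response. Nothing is executed here; the action stays blocked."""
    return {
        "decision": "blocked",
        "outcome": "no_impact",
        "reason": reason,
        # Unknown reasons fall back to a conservative default (an assumption).
        "next_action": NEXT_ACTION.get(reason, "stop_and_report"),
        "hints": hints,
    }
```

For example, `blocked("path_out_of_scope", allowed_path="/workspace/project")` keeps the write blocked while pointing the agent at a path it is allowed to use; the path shown is only a placeholder.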
This does not make the boundary soft. The action stays blocked. The system stays safe. You are simply turning a dead end into a guided path.
You must balance this with security. Precise feedback can also help a misbehaving agent probe your limits.
Use clear reason codes for operational friction like stale data or malformed inputs. If the agent shows suspicious behavior or ignores hints, switch to generic rejections or human reviews.
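A rough sketch of that switch, assuming the system counts how often each agent ignores a hint. The `FeedbackPolicy` class, the threshold of three, and the response fields are illustrative assumptions, not a recommended setting.

```python
from collections import Counter

MAX_IGNORED_HINTS = 3  # assumed threshold; tune for your own system


class FeedbackPolicy:
    """Give precise feedback by default, generic rejections once hints are ignored."""

    def __init__(self) -> None:
        self._ignored = Counter()  # agent_id -> number of ignored hints

    def record_ignored_hint(self, agent_id: str) -> None:
        self._ignored[agent_id] += 1

    def respond(self, agent_id: str, reason: str, next_action: str) -> dict:
        if self._ignored[agent_id] >= MAX_IGNORED_HINTS:
            # Suspicious pattern: stop revealing details and escalate to a human.
            return {
                "decision": "blocked",
                "outcome": "no_impact",
                "reason": "rejected",
                "next_action": "await_human_review",
            }
        # Ordinary operational friction: return the precise reason code.
        return {
            "decision": "blocked",
            "outcome": "no_impact",
            "reason": reason,
            "next_action": next_action,
        }
```

In both branches the action stays blocked with no impact; only the amount of detail handed back to the agent changes.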
Keep agent feedback separate from audit scores. The agent needs to know how to be compliant. The system needs to know if the agent is behaving poorly. Do not mix these two jobs.
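The separation can be made explicit in the types. In this sketch, `AgentFeedback`, `AuditRecord`, `evaluate_block`, and the `risk_score` field are hypothetical; the point is only that the agent-facing object has no field that carries the internal score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFeedback:
    # Sent to the agent: how to become compliant.
    reason: str
    next_action: str


@dataclass(frozen=True)
class AuditRecord:
    # Kept by the system: how the agent is behaving. Never sent to the agent.
    agent_id: str
    reason: str
    risk_score: float


def evaluate_block(
    agent_id: str, reason: str, next_action: str, risk_score: float
) -> tuple[AgentFeedback, AuditRecord]:
    """Produce the two outputs separately so they cannot be mixed by accident."""
    feedback = AgentFeedback(reason=reason, next_action=next_action)
    audit = AuditRecord(agent_id=agent_id, reason=reason, risk_score=risk_score)
    return feedback, audit
```

The caller returns `feedback` to the agent and writes `audit` to its own log, so a change to the scoring logic never alters what the agent sees.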
Boundaries exist because agents are becoming useful enough to act on real systems. Real work has rules and limits.
A boundary that only returns a failure is a wall. A boundary that provides guidance is a tool.
Blocked should mean:
- The requested impact did not occur.
- The reason is known.
- The next safe action is clear.
Source: https://dev.to/davidloibner/blocked-is-not-failed-agents-need-boundary-feedback-bbg
Optional learning community: https://t.me/GyaanSetuAi