Schema Validation Is Not Intent Validation
You use Pydantic for tool calls. You think the agent is safe. It is not.
Pydantic checks the shape. It does not check the intent.
We tracked 40 tool call failures.
- 9 were schema errors. The validator caught these.
- 18 used the wrong tool. These passed.
- 13 used the right tool with wrong values. These passed.
31 of the 40 failures passed validation. They looked correct but were wrong.
A call to cancel an order is structurally perfect. But the user wanted to cancel a subscription. The validator sees a string ID. It passes the call. The user stays angry.
Shape is not intent.
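A minimal sketch of the cancel-order case, to make the gap concrete. The `CancelOrderArgs` model, the `passes_schema` helper, and the `sub_8841` ID are hypothetical names made up for illustration; only `BaseModel`, `Field`, and `ValidationError` come from Pydantic.

```python
from pydantic import BaseModel, Field, ValidationError


class CancelOrderArgs(BaseModel):
    """Hypothetical argument schema for a cancel_order tool."""
    order_id: str = Field(min_length=1)


def passes_schema(payload: dict) -> bool:
    """Return True when the payload has the shape the tool expects."""
    try:
        CancelOrderArgs(**payload)
        return True
    except ValidationError:
        return False


# The user asked to cancel a subscription, but the agent picked cancel_order
# and passed the subscription's ID. The shape is fine, so validation passes.
wrong_intent = {"order_id": "sub_8841"}

# A genuinely malformed call: the required field is missing.
malformed = {}

print(passes_schema(wrong_intent))  # True
print(passes_schema(malformed))     # False
```

The validator rejects only the second call. Nothing in the schema knows what the user asked for.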
Fix this with a deterministic pre-check.
Check the state before the tool runs.
- Does the ID exist?
- Does the user own it?
- Is the status correct for this action?
This stops wrong-argument errors that conflict with stored state, such as an ID that does not exist or belongs to someone else.
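The three checks above could be sketched like this. Everything here is an assumption for illustration: the `Order` record, the in-memory `ORDERS` store standing in for a real lookup, and the rule that only open orders can be cancelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    order_id: str
    owner_id: str
    status: str  # e.g. "open", "shipped", "cancelled"


# Hypothetical in-memory store standing in for a real database lookup.
ORDERS = {
    "ord_1": Order("ord_1", owner_id="user_a", status="open"),
    "ord_2": Order("ord_2", owner_id="user_b", status="open"),
    "ord_3": Order("ord_3", owner_id="user_a", status="shipped"),
}

# Assumed business rule: only open orders can be cancelled.
CANCELLABLE_STATUSES = {"open"}


def precheck_cancel_order(user_id: str, order_id: str) -> Optional[str]:
    """Return None if the call may proceed, else a reason to reject it."""
    order = ORDERS.get(order_id)
    if order is None:
        return "order does not exist"
    if order.owner_id != user_id:
        return "order belongs to another user"
    if order.status not in CANCELLABLE_STATUSES:
        return f"order is {order.status}, cannot cancel"
    return None


print(precheck_cancel_order("user_a", "ord_1"))     # None
print(precheck_cancel_order("user_a", "sub_8841"))  # order does not exist
print(precheck_cancel_order("user_a", "ord_2"))     # order belongs to another user
print(precheck_cancel_order("user_a", "ord_3"))     # order is shipped, cannot cancel
```

The check is plain code with no model in the loop, so it gives the same answer every time. Returning a reason string lets the agent see why the call was refused.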
Wrong tool selection is harder. An LLM judge often makes the same mistakes as the agent.
For destructive tools, use a human confirmation step. Ask the user to agree before the action happens.
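One way to sketch the confirmation step is a small gate in front of tool execution. The tool names, the `run_tool` wrapper, and the `confirm` and `execute` callbacks are all hypothetical. In a real app `confirm` would show the prompt to the user and wait for an answer.

```python
from typing import Callable

# Assumed list of tools that change or delete user data.
DESTRUCTIVE_TOOLS = {"cancel_order", "cancel_subscription", "delete_account"}


def run_tool(
    name: str,
    args: dict,
    execute: Callable[[str, dict], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run a tool, asking the user first if the tool is destructive."""
    if name in DESTRUCTIVE_TOOLS:
        prompt = f"About to run {name} with {args}. Proceed?"
        if not confirm(prompt):
            return "cancelled by user"
    return execute(name, args)


def fake_execute(name: str, args: dict) -> str:
    """Stand-in for the real tool dispatcher."""
    return f"executed {name}"


# The user declines: the destructive tool never runs.
print(run_tool("cancel_order", {"order_id": "ord_1"}, fake_execute, lambda p: False))

# The user agrees: the tool runs.
print(run_tool("cancel_order", {"order_id": "ord_1"}, fake_execute, lambda p: True))

# A read-only tool skips the confirmation entirely.
print(run_tool("get_order", {"order_id": "ord_1"}, fake_execute, lambda p: False))
```

The prompt names the tool and its arguments. A user who wanted to cancel a subscription can see "cancel_order" and say no.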
Optional learning community: https://t.me/GyaanSetuAi