Trust Isn't A Scalar: Typed Provenance for Agent Chains
I was wrong.
In my last post, I suggested a simple true/false tag to track whether an AI agent's output was degraded. A commenter pointed out why this fails: a boolean is not enough, and neither is any single number.
Collapse trust into one score and you throw away the information each downstream step needs to make its own decision.
Imagine two different tasks using the same data:
- A summarizer needs a strong model but can handle old data.
- A price calculator needs fresh data but can handle a weaker model.
Say one output is stale but came from a strong model, and another is fresh but came from a weak model. A single trust score cannot tell them apart, so it forces a bad choice: you either reject everything or you let dangerous errors through.
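To make the failure concrete, here is a minimal Python sketch. The 0-to-1 scores, the averaging rule, and the name `scalar_trust` are illustrative assumptions, not something from the original post.

```python
def scalar_trust(freshness: float, capability: float) -> float:
    """Collapse two hypothetical 0..1 quality scores into one number by averaging."""
    return (freshness + capability) / 2


# Stale data, strong model: acceptable for the summarizer, not for pricing.
stale_but_strong = scalar_trust(freshness=0.2, capability=0.9)

# Fresh data, weak model: acceptable for pricing, not for the summarizer.
fresh_but_weak = scalar_trust(freshness=0.9, capability=0.2)

# Both cases collapse to the same score, so no single threshold
# can accept one and reject the other.
same_score = stale_but_strong == fresh_but_weak
```

Whatever threshold you pick, the summarizer and the price calculator receive the same verdict for two outputs they should treat differently.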
The fix is typed provenance.
Instead of a single score, carry a vector with one typed entry per way an output can be degraded, so it records what went wrong and how. For example:
- Freshness: How current is the data?
- Capability: How strong is the model?
- Tool: Did the tools work?
- Verification: Was it checked against facts?
Each step in your chain then applies its own rules. The summarizer looks at the vector and says "this is fine." The price calculator looks at the same vector and says "this is too old, do not act."
This moves trust from a property of the data to a judgment made by each consumer of that data.
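A minimal sketch of the idea in Python follows. The three-step `Level` scale, the `Provenance` class, and the two policy functions are hypothetical names chosen for illustration; the post only names the four axes, and the thresholds here are assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale shared by every axis; higher is better."""
    BAD = 0
    DEGRADED = 1
    GOOD = 2


@dataclass(frozen=True)
class Provenance:
    freshness: Level      # how current the data is
    capability: Level     # how strong the producing model was
    tool: Level           # whether tool calls succeeded
    verification: Level   # whether the output was checked against facts


def summarizer_accepts(p: Provenance) -> bool:
    # Needs a strong model, tolerates stale data.
    return p.capability == Level.GOOD


def price_calculator_accepts(p: Provenance) -> bool:
    # Needs fresh data, tolerates a somewhat weaker model.
    return p.freshness == Level.GOOD and p.capability >= Level.DEGRADED


# The same upstream output, judged differently by each consumer.
stale_output = Provenance(
    freshness=Level.BAD,
    capability=Level.GOOD,
    tool=Level.GOOD,
    verification=Level.DEGRADED,
)
```

Here `summarizer_accepts(stale_output)` is true while `price_calculator_accepts(stale_output)` is false: the vector is the same, only the policy differs.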
How to build this without making it too complex:
- When combining provenance from several steps, keep the minimum (worst) value on each axis. Do not average: averaging lets a good upstream score mask a bad one.
- Only add an axis if it changes your recovery action.
  - If a freshness error means you must refetch, that is an axis.
  - If a capability error means you must re-run on a better model, that is an axis.
  - If two errors lead to the same fix, combine them.
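These rules can be sketched in a few lines of Python. This is one possible reading, not the author's implementation: the two-axis `Provenance` class, the 0-to-2 integer scale, the `merge` helper, and the action names in `RECOVERY` are all assumptions made for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Provenance:
    # Hypothetical 0..2 scale per axis; higher is better.
    freshness: int
    capability: int


def merge(a: Provenance, b: Provenance) -> Provenance:
    """Combine two upstream vectors by keeping the worst value on each axis."""
    return Provenance(
        **{f.name: min(getattr(a, f.name), getattr(b, f.name))
           for f in fields(Provenance)}
    )


# One recovery action per axis. An axis that would map to the same
# action as another one is a candidate for merging into it.
RECOVERY = {
    "freshness": "refetch",
    "capability": "rerun_on_stronger_model",
}


def recovery_actions(p: Provenance, required: int = 2) -> list[str]:
    """List the fix for every axis that falls below `required`."""
    return [RECOVERY[f.name] for f in fields(p) if getattr(p, f.name) < required]


fetch_step = Provenance(freshness=0, capability=2)
draft_step = Provenance(freshness=2, capability=1)
combined = merge(fetch_step, draft_step)
```

With these inputs, `combined` has freshness 0 and capability 1, so `recovery_actions(combined)` asks for both a refetch and a re-run. Averaging would have reported a freshness of 1 and hidden the fully stale fetch.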
Agent reliability is a provenance problem. You must track the lineage of every decision.
Source: https://dev.to/p0rt/trust-isnt-a-scalar-typed-provenance-for-agent-chains-229p
