The Null Input That Broke My Production Agent


AI-assisted draft.

GyaanSetu Editorial, 2 weeks ago, 2 min read


The Null Input That Broke My Production Agent

The demo ran perfectly for three weeks. Every test input worked. Every output went to the right place. I thought the system was reliable.

Then a supplier sent an email with an empty subject line.

The agent expected a string to extract an order reference. Instead, it received a null value. It did not crash. That would have been better. It generated a fake order reference that looked real. The downstream system processed it. Nobody noticed for four hours.

Demos use inputs you expect. Production uses inputs you do not expect.

I run the agent operation at aienterprise.dk. I saw the full trace. The prompt told the agent to extract the order reference from the subject line. This works if the subject line exists.

If the subject line is missing, a large language model fills the gap. It invents something that looks correct. This is not random noise. It is structured noise. It is dangerous because it looks right. You can catch a failure. You cannot easily catch a confident, wrong answer.

I did not retrain the model. I did not change the prompt. I added a guard before the model call.

Now, a simple check runs first. It asks: is the subject field present and non-empty? If the answer is no, the message goes to a hold queue for a human. The agent never sees the bad input.

This guard is twelve lines of code. It is the most important thing I built this year.
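
The post does not show the guard itself, so the following is only a minimal sketch of the idea, not the author's code. The Email shape, the Router class, the in-memory hold_queue list, and the extract_order_reference placeholder are all assumptions made for illustration; treating a whitespace-only subject as empty is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Email:
    # Hypothetical message shape; the real one is not shown in the post.
    sender: str
    subject: Optional[str]
    body: str = ""


@dataclass
class Router:
    # In-memory stand-ins for the real hold queue and the real agent.
    hold_queue: List[Email] = field(default_factory=list)
    agent_calls: List[Email] = field(default_factory=list)

    def extract_order_reference(self, email: Email) -> str:
        # Placeholder for the model call; it only records that it ran.
        self.agent_calls.append(email)
        return f"extracted-from:{email.subject}"

    def handle(self, email: Email) -> Optional[str]:
        # Guard: the subject must be present and non-empty after trimming.
        if email.subject is None or not email.subject.strip():
            # Bad input goes to a human; the agent never sees it.
            self.hold_queue.append(email)
            return None
        return self.extract_order_reference(email)


router = Router()
router.handle(Email(sender="supplier@example.com", subject=None))
router.handle(Email(sender="supplier@example.com", subject="PO 1234"))
print(len(router.hold_queue), len(router.agent_calls))  # 1 1
```

The point of the design is that the check sits outside the model: the decision to hold a message is deterministic and does not depend on how the prompt is worded.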

The pattern is simple. If an agent assumes structure, production will eventually send unstructured data. The fix is not a smarter model. The fix is a boundary. You need a check that routes bad input to a human instead of letting the model guess.

Reliability is the only feature. A demo shows an agent can do a task. Production shows an agent does the task at 3 AM on bad input. Only the second part matters to your customers.

My agent now processes 200 operations per day without issues. The hold queue triggers twice a week. A human reviews the odd data. I learn what production looks like.

If you build agents for high-risk categories under the EU AI Act, the deadline is December 2, 2027. This includes employment, biometrics, and education. A system that guesses on bad input will fail an audit. This guard is a compliance minimum.

Reliability is not a feature you add later.

The null input that broke my production agent, and how I fixed it

I have been building LLM-powered agents for a while, and most of the time everything works perfectly in my local environment. Production is a different animal.

A few weeks ago, my production agent failed. It was not a complex logic error or an infrastructure problem. It was something much simpler: a null input.

The incident

It happened on a Tuesday. My monitoring system started raising alerts about unexpected errors in my agent's execution loop. When I dug into the logs, I saw a burst of errors of the form AttributeError: 'NoneType' object has no attribute '...'.

The culprit? A user had submitted an empty message, which my frontend passed along as a null value.

The root cause

My agent's logic assumed it would always receive a valid string. I had some basic checks, but they were not robust enough.

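def run_agent(user_input: str):
    # This was my original logic.
    # `llm` is the model client, defined elsewhere in the application (not shown).
    # The type hint is not enforced at runtime, so None can still arrive here
    # and is formatted into the prompt as the literal text "None".
    prompt = f"User says: {user_input}"
    response = llm.invoke(prompt)
    return response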

When user_input was None, the f-string turned it into the string "None". The LLM, seeing the word "None", tried to be helpful and answered something like "I don't understand" or, worse, returned an empty JSON object that broke my downstream processing logic.
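
The conversion of None into the text "None" is standard Python formatting behavior and is easy to confirm. A minimal sketch, where build_prompt is a hypothetical helper that isolates the prompt line from run_agent:

```python
from typing import Optional


def build_prompt(user_input: Optional[str]) -> str:
    # Same formatting as the original run_agent; build_prompt itself is a
    # hypothetical helper introduced only for this illustration.
    return f"User says: {user_input}"


# None is formatted via str(None), so no exception is raised here.
print(repr(build_prompt(None)))  # 'User says: None'
print(repr(build_prompt("")))    # 'User says: '
```

Because nothing fails at this line, the bad value only becomes visible later, in whatever the model decides to reply.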

The fix

The fix was not simply to add an if user_input is None check. That is only a patch. The real fix was to enforce strict schema validation at the entry point using Pydantic.

I created a schema for the input:

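# Written for Pydantic v2; on Pydantic v1 the decorator is `validator` instead.
from pydantic import BaseModel, Field, field_validator

class AgentInput(BaseModel):
    # A required string with at least one character; None is rejected by the type.
    query: str = Field(..., min_length=1)

    @field_validator('query')
    @classmethod
    def query_must_not_be_empty(cls, v: str) -> str:
        # min_length alone would accept a whitespace-only string, so reject it here.
        if not v.strip():
            raise ValueError('Query cannot be empty')
        return v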

Now the input is validated before it ever touches the LLM. If the input is invalid, the agent returns a clear error message instead of failing.
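
The article does not show how the schema is wired into the agent, so here is one possible sketch, assuming Pydantic v2. The fake_llm function is a hypothetical stand-in for the real llm.invoke call, and the shape of the returned dictionary is an assumption made for illustration.

```python
from typing import Any, Dict

from pydantic import BaseModel, Field, ValidationError, field_validator


class AgentInput(BaseModel):
    query: str = Field(..., min_length=1)

    @field_validator("query")
    @classmethod
    def query_must_not_be_blank(cls, v: str) -> str:
        if not v.strip():
            raise ValueError("Query cannot be empty")
        return v


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for the real model client.
    return f"echo: {prompt}"


def run_agent(raw_input: Any) -> Dict[str, Any]:
    try:
        validated = AgentInput(query=raw_input)
    except ValidationError as exc:
        # Invalid input: report the first validation message and skip the model.
        return {"ok": False, "error": exc.errors()[0]["msg"]}
    return {"ok": True, "response": fake_llm(f"User says: {validated.query}")}


print(run_agent(None)["ok"])     # False
print(run_agent("   ")["ok"])    # False
print(run_agent("hello")["ok"])  # True
```

Returning a structured result rather than raising keeps the caller's handling explicit, though raising a dedicated exception would be an equally reasonable choice.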

Lessons learned

Validate everything: never assume the input is what you expect.
Use Pydantic for LLM applications: it puts a strict schema around data that would otherwise flow unchecked into and out of a probabilistic model.
Fail fast and gracefully: it is better to catch an error early than to let it propagate through your whole system.

Building AI agents is fun, but making them production-ready requires a focus on robustness and validation.
