𝗚𝗶𝘃𝗶𝗻𝗴 𝗔𝗴𝗲𝗻𝘁𝗚𝗮𝘁𝗲𝘄𝗮𝘆 𝗮 𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰 𝗕𝗿𝗮𝗶𝗻

AI-assisted draft.


My AI agent routing used to be a mess.

I built a personal AI agent named Pi. It runs 24/7 from my living room. To save money, I used three different models:

Ollama (Local) for coding.
OpenAI for deep reasoning.
Gemini for fast tasks.

To choose the right model, I used a Python script with keyword lists. It was a simple if-else chain.

It failed constantly. If a prompt asked about Rust patterns without using one of my specific keywords, the router sent it to the wrong model. If the prompt was in Hindi, routing broke.
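To make the failure mode concrete, here is a minimal sketch of a keyword router of this kind. It is not my original script: the keyword lists, the model labels, and the function name route_by_keyword are illustrative assumptions.

```python
# Hypothetical reconstruction of a keyword-based router.
# The keyword lists and model labels are illustrative assumptions.
KEYWORDS = {
    "ollama": ["code", "python", "function", "bug"],           # coding
    "openai": ["analyze", "prove", "explain why", "reason"],   # deep reasoning
}
DEFAULT_MODEL = "gemini"  # fast tasks, and everything that matches nothing


def route_by_keyword(prompt: str) -> str:
    """Return the first model whose keyword list matches the prompt."""
    text = prompt.lower()
    for model, words in KEYWORDS.items():
        if any(word in text for word in words):
            return model
    # No keyword matched, so the prompt falls through to the default.
    return DEFAULT_MODEL


if __name__ == "__main__":
    print(route_by_keyword("Fix this Python bug"))
    # A coding question that avoids every listed keyword is misrouted:
    print(route_by_keyword("What are idiomatic ownership patterns in Rust?"))
```

In this sketch the Rust question never reaches the coding model because none of the listed words appear in it. A prompt written in another language falls through the same way.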

The results were bad:

18% of requests went to the wrong model.
I wasted money on expensive APIs for simple tasks.
I had to manually update keywords every week.

I needed a system that understood meaning, not just keywords.

I switched to the vLLM Semantic Router with AgentGateway. This changed everything.

The Semantic Router replaces the Python script and runs as an Envoy sidecar. It uses a small 130 MB embedding model to infer the intent of every prompt. You do not write keywords. You write a description of what each model does in a YAML file.

The results after two weeks:

Misrouted requests dropped from 18% to 3%.
Routing latency dropped from 45ms to 1ms.
Monthly API costs dropped from $24 to $14.
Keyword maintenance dropped to zero.

The router uses embeddings to compare your prompt against your model descriptions. If you describe a model as a coding specialist, the router sends coding prompts there automatically. It even works across different languages.
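The comparison step can be sketched in a few lines of Python. This is only an illustration of the idea, not the router's actual code. It assumes cosine similarity as the metric, and the three-dimensional vectors, the threshold value, and the name route_by_similarity are all made up for the example. Real embeddings come from the embedding model and have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_by_similarity(prompt_vec, description_vecs, threshold=0.5, default="gemini"):
    """Pick the model whose description vector is closest to the prompt vector.

    Falls back to `default` when no description scores above `threshold`.
    Both the threshold and the default are assumptions for this sketch.
    """
    best_model, best_score = default, threshold
    for model, vec in description_vecs.items():
        score = cosine(prompt_vec, vec)
        if score > best_score:
            best_model, best_score = model, score
    return best_model


# Made-up vectors standing in for embeddings of each model description.
DESCRIPTION_VECS = {
    "ollama": [0.9, 0.1, 0.0],  # "coding specialist"
    "openai": [0.1, 0.9, 0.1],  # "deep reasoning"
    "gemini": [0.1, 0.1, 0.9],  # "fast, simple tasks"
}

if __name__ == "__main__":
    coding_prompt_vec = [0.8, 0.2, 0.1]  # stand-in for an embedded coding prompt
    print(route_by_similarity(coding_prompt_vec, DESCRIPTION_VECS))
```

Because the decision depends on where the prompt lands in embedding space rather than on which words it contains, a multilingual embedding model can place a Hindi coding question near the same description as an English one.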

The system stays online even if the router fails. I configured a fail-open policy: if the router crashes, requests go to Gemini automatically, so the agent never stops working.
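The logic of fail-open is simple enough to show in Python. In my setup the policy lives in the gateway configuration, not in application code, so treat this as a sketch of the behavior only. The names route_fail_open and broken_router are hypothetical.

```python
from typing import Callable

FALLBACK_MODEL = "gemini"  # the fallback named in this post


def route_fail_open(prompt: str,
                    router: Callable[[str], str],
                    fallback: str = FALLBACK_MODEL) -> str:
    """Ask the router for a model; on any router error, use the fallback."""
    try:
        return router(prompt)
    except Exception:
        # Fail open: a broken router must not block the request.
        return fallback


def broken_router(prompt: str) -> str:
    """Simulates a crashed or unreachable router."""
    raise ConnectionError("router unavailable")


if __name__ == "__main__":
    print(route_fail_open("Summarize this email", broken_router))
```

The trade-off is that requests sent during an outage all go to one model regardless of intent, which costs some routing quality but keeps the agent answering.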

I even found and helped fix two bugs in the source code related to ARM64 support on Apple Silicon. This is how open source should work. You find an issue, contribute a fix, and the whole community gets better.

If you build AI agents, stop using keyword matching. Use semantic routing to control your costs and improve your answers.

Source: https://dev.to/anup_sharma_86fa94612fe3c/giving-agentgateway-a-semantic-brain-with-vllm-semantic-router-inside-my-homelab-542f

