Почему мы отказались от экономии 96% токенов

Translated for your language. Читать оригинал.

AI-assisted draft.

GyaanSetu Editorialна прошлой неделе2мин чтения

Почему мы отказались от экономии 96% токенов

Why We Rejected a 96% Token Saving

We found an MCP server that saves 96% on tokens. It uses one tool: execute_code. Instead of calling specific functions, the agent writes JavaScript to get data.

On paper, it wins. For complex tasks, code execution beats tool-calling for efficiency.

But we did not adopt it. We kept our discrete, named tools instead.

Here is why the obvious choice was the wrong choice for our agent.

The Target Determines the Design

Most people build for frontier models in a chat window. Those models have huge token budgets. For them, code execution is king.

We build for a voice-first agent on a small local model (Hermes 3 8B) on a boat.

For a small model, the constraint is not tokens. The constraint is reliability.

If a small model struggles to call a simple tool, asking it to write correct JavaScript is a much harder task. execute_code trades reliability for tokens. We cannot afford that trade.

The Last-Mile Problem

Code execution pushes the "last mile" of work onto the agent. The agent must:

Filter the data
Sort the results
Format the output

Our tools do this work inside the server. For example, when asking about battery state, our tool returns a string ready for text-to-speech. It says "68 percent, 12.8 volts" instead of raw numbers.

If we use execute_code, the agent must write the logic to format that speech. Small models fail at this constantly.

The Absence Rule

On a boat, sensors go offline. In our system, a missing sensor returns a clean null. This is a successful call.

In a code execution model, a missing sensor often throws an error. If a small model guesses a few wrong paths, it triggers error limits and breaks the agent. Named tools allow us to make absence a success, not a fault.

Adopt-vs-Build Checklist

Before you adopt or build an MCP server, ask these questions:

• Who is the target agent? (Frontier model vs. Small local model) • What is the binding constraint? (Tokens vs. Reliability) • Who does the last mile? (Does the tool format data or does the agent?) • How does it handle absence? (Is a missing value an error or a null?) • What is the maintenance cost? (Are you inheriting a dormant codebase?)

We did not ignore the other project. We harvested its ideas. We took their alarm handling and path discovery logic and added them to our roadmap.

Efficiency is great, but reliability is required when you are on the water.

Source: https://dev.to/clarkbw--/why-we-kept-named-mcp-tools-despite-a-96-token-saving-40ae

Optional learning community: https://t.me/GyaanSetuAi

Почему мы отказались от экономии 96% токенов

Продолжить чтение

Грязный секрет MCP: ваш агент сжигает токены

Инструмент MCP изменил свою схему. Ваш агент этого не заметил.

ИИ-агентам нужны границы, а не мастер-ключи

Your MCP Servers Are Burning Tokens Before You Type a Word