Why We Rejected a 96% Token Saving
We found an MCP server that saves 96% on tokens. It uses one tool: execute_code. Instead of calling specific functions, the agent writes JavaScript to get data.
On paper, it wins. For complex tasks, code execution beats tool-calling for efficiency.
But we did not adopt it. We kept our discrete, named tools instead.
Here is why the obvious choice was the wrong choice for our agent.
The Target Determines the Design
Most people build for frontier models in a chat window. Those models have huge token budgets. For them, code execution is king.
We build for a voice-first agent on a small local model (Hermes 3 8B) on a boat.
For a small model, the constraint is not tokens. The constraint is reliability.
If a small model struggles to call a simple tool, asking it to write correct JavaScript is a much harder task. execute_code trades reliability for tokens. We cannot afford that trade.
The Last-Mile Problem
Code execution pushes the "last mile" of work onto the agent. The agent must:
- Filter the data
- Sort the results
- Format the output
Our tools do this work inside the server. For example, when asking about battery state, our tool returns a string ready for text-to-speech. It says "68 percent, 12.8 volts" instead of raw numbers.
If we use execute_code, the agent must write the logic to format that speech. Small models fail at this constantly.
The Absence Rule
On a boat, sensors go offline. In our system, a missing sensor returns a clean null. This is a successful call.
In a code execution model, a missing sensor often throws an error. If a small model guesses a few wrong paths, it triggers error limits and breaks the agent. Named tools allow us to make absence a success, not a fault.
Adopt-vs-Build Checklist
Before you adopt or build an MCP server, ask these questions:
• Who is the target agent? (Frontier model vs. Small local model) • What is the binding constraint? (Tokens vs. Reliability) • Who does the last mile? (Does the tool format data or does the agent?) • How does it handle absence? (Is a missing value an error or a null?) • What is the maintenance cost? (Are you inheriting a dormant codebase?)
We did not ignore the other project. We harvested its ideas. We took their alarm handling and path discovery logic and added them to our roadmap.
Efficiency is great, but reliability is required when you are on the water.
Source: https://dev.to/clarkbw--/why-we-kept-named-mcp-tools-despite-a-96-token-saving-40ae
Optional learning community: https://t.me/GyaanSetuAi
