Your MCP Servers Are Burning Tokens Before You Type a Word


AI-assisted draft.

You are paying for data you never use.

I tracked one agent session last week. It had 47 MCP tools loaded. Every tool sent its full JSON schema into the system prompt. This happened before I typed a single word.

Each tool schema uses 150 to 400 tokens. Across 47 tools, that came to about 11,000 tokens of overhead. The model reads these tokens on every single turn. You pay for this context even if you only use two tools.
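The arithmetic is easy to check. Here is a minimal sketch using the figures from my session. The turn count is a hypothetical number I picked for illustration, not something I measured.

```python
# Figures from the session described above.
TOOL_COUNT = 47
MIN_TOKENS_PER_SCHEMA = 150
MAX_TOKENS_PER_SCHEMA = 400
OBSERVED_OVERHEAD = 11_000

# Assumption: a hypothetical session length, used only for illustration.
HYPOTHETICAL_TURNS = 10


def overhead_range(tool_count, low, high):
    """Return the smallest and largest plausible schema overhead in tokens."""
    return tool_count * low, tool_count * high


def session_overhead(per_turn_overhead, turns):
    """Tokens spent on tool schemas if the full menu is re-read on every turn."""
    return per_turn_overhead * turns


low, high = overhead_range(TOOL_COUNT, MIN_TOKENS_PER_SCHEMA, MAX_TOKENS_PER_SCHEMA)
average_per_tool = OBSERVED_OVERHEAD / TOOL_COUNT
total = session_overhead(OBSERVED_OVERHEAD, HYPOTHETICAL_TURNS)

print(f"Expected range: {low:,} to {high:,} tokens")            # 7,050 to 18,800
print(f"Observed average: {average_per_tool:.0f} tokens/tool")  # 234
print(f"Over {HYPOTHETICAL_TURNS} turns: {total:,} tokens")     # 110,000
```

The observed 11,000 tokens sits inside the expected range, at roughly 234 tokens per tool.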

Most people worry about large file uploads. They forget the cost of the tool menu itself.

When you stack servers like GitHub, Slack, and databases, you end up with 60 to 100 tools. I have seen sessions where tool definitions took up 20% of the entire context budget.

Stop loading everything at once. Use deferred loading instead.

Here is the pattern:
• List tools by name and a short description only.
• Use a search tool to fetch full schemas on demand.

Instead of injecting a massive JSON object for every tool, you provide only a name and a one-line description. When the model needs a specific tool, it calls a search function. That function returns the full schema for only the matching tools.
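The pattern can be sketched in a few lines. This is a minimal illustration, not how any particular client implements it. The three tool definitions, `build_index`, and `search_tools` are hypothetical names I made up, and the matching is a plain keyword check standing in for whatever search a real client uses.

```python
import json

# Hypothetical tool definitions, shaped loosely like MCP tool entries
# (name, description, inputSchema). Real servers supply their own.
FULL_SCHEMAS = {
    "github_create_issue": {
        "name": "github_create_issue",
        "description": "Create an issue in a GitHub repository.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "repo": {"type": "string"},
                "title": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["repo", "title"],
        },
    },
    "slack_post_message": {
        "name": "slack_post_message",
        "description": "Post a message to a Slack channel.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "channel": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["channel", "text"],
        },
    },
    "db_run_query": {
        "name": "db_run_query",
        "description": "Run a read-only SQL query against a database.",
        "inputSchema": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}


def build_index(schemas):
    """Return the lightweight menu: name and short description only."""
    return [
        {"name": s["name"], "description": s["description"]}
        for s in schemas.values()
    ]


def search_tools(query, schemas, limit=3):
    """Return full schemas whose name or description contains every query word."""
    words = query.lower().split()
    matches = []
    for schema in schemas.values():
        haystack = f"{schema['name']} {schema['description']}".lower()
        if all(word in haystack for word in words):
            matches.append(schema)
    return matches[:limit]


# The index goes into the prompt up front; full schemas arrive only on demand.
index = build_index(FULL_SCHEMAS)
found = search_tools("slack message", FULL_SCHEMAS)

print(json.dumps(index, indent=2))
print([tool["name"] for tool in found])
```

Only the index is paid for on every turn. The full schema for the Slack tool enters the context only after the model asks for it.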

The results are massive:
• Eager loading 80 tools: ~18,000 tokens.
• Deferred loading 80 tools: ~1,000 tokens.

This turns tool definitions from a major expense into a rounding error.

This strategy works because most sessions use only a small fraction of the available tools. If you use every tool in a session, you end up fetching every schema anyway, so the cost is roughly the same as eager loading. But for most users, this saves a large share of the context.
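A rough cost model makes the trade-off concrete. It uses the 80-tool figures quoted above and assumes every schema costs the same, a fetched schema stays in context once loaded, and the search calls themselves are free. Those are simplifications of mine, not measurements.

```python
# Figures quoted above for an 80-tool catalog.
TOOL_COUNT = 80
EAGER_TOTAL = 18_000   # every full schema loaded up front
INDEX_TOTAL = 1_000    # names and short descriptions only

# Assumption: every schema costs the same number of tokens.
SCHEMA_TOKENS = EAGER_TOTAL / TOOL_COUNT  # 225 tokens per schema


def eager_cost():
    """Context cost when every schema is loaded before the first turn."""
    return EAGER_TOTAL


def deferred_cost(tools_used):
    """Context cost of the index plus the schemas actually fetched."""
    return INDEX_TOTAL + tools_used * SCHEMA_TOKENS


def break_even_tools():
    """Largest number of fetched tools for which deferred loading is still cheaper."""
    return int((EAGER_TOTAL - INDEX_TOTAL) // SCHEMA_TOKENS)


for used in (3, 20, 80):
    print(f"{used:>2} tools used: deferred {deferred_cost(used):,.0f} vs eager {eager_cost():,}")

print(f"Deferred loading stays cheaper up to {break_even_tools()} tools")
```

Under these assumptions, three tools cost 1,675 tokens instead of 18,000. Using all 80 costs 19,000, slightly more than eager loading because the index is still paid for.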

Do not ask which tools the model might need to call. Ask which tools the model needs to know exist by default.

Most catalogs provide everything at once because it is easy. It is also the fastest way to burn your budget on a menu nobody reads.

Keep it simple. Provide a name, a description, and a search function. Pay for the three tools you use, not the eighty tools you ignore.

Source: https://dev.to/enjoy_kumawat/your-mcp-servers-are-burning-tokens-before-you-type-a-word-3076

