𝗧𝗵𝗶𝗻𝗸𝗶𝗻𝗴 𝗧𝗼𝗸𝗲𝗻𝘀 𝗗𝗿𝗶𝘃𝗲 𝗛𝗶𝗱𝗱𝗲𝗻 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗖𝗼𝘀𝘁𝘀

Translated for your language. Original lesen.

AI-assisted draft.

GyaanSetu Editorialvor 8 Stunden1Min. Lesezeit

Thinking tokens create a hidden tax for AI developers.

OpenAI, Anthropic, and Google charge for thinking tokens at output rates. This increases costs by 5x to 10x in agentic pipelines. Most developers assume these tokens are free or cheap. They are not.

Agentic pipelines make this problem worse. Agents often retry failed steps. Each retry generates hundreds of new thinking tokens. A single loop of perceive, reason, act, and observe can lead to multiple retries.

The math is dangerous for your margins: • A task with 3 to 5 retries costs $0.10 to $0.50 in hidden tokens. • A pipeline with 10,000 tasks per day costs $5,000 to $25,000 in extra fees. • A startup spending $10,000 on APIs might pay $5,000 for thinking tokens alone.

A massive price war is starting. Google plans to cut Gemini reasoning model prices by 80%. This shows a gap between tech giants and startups. Google can afford to lose money on tokens because they commit billions to compute. Startups cannot.

This asymmetry favors large providers. Smaller companies struggle to absorb these costs. Even Microsoft is shifting to usage-based pricing and looking at cheaper alternatives like DeepSeek V4 to manage costs.

Watch for two things: • Google's official Gemini pricing in Q3 2026. • OpenAI's response to tiered pricing for thinking tokens.

Manage your token usage now or watch your margins disappear.

Source: https://pub.towardsai.net

Optional learning community: https://t.me/GyaanSetuAi

𝗧𝗵𝗶𝗻𝗸𝗶𝗻𝗴 𝗧𝗼𝗸𝗲𝗻𝘀 𝗗𝗿𝗶𝘃𝗲 𝗛𝗶𝗱𝗱𝗲𝗻 𝗜𝗻𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗖𝗼𝘀𝘁𝘀

Weiterlesen

𝗜 𝗖𝘂𝘁 𝗠𝘆 𝗔𝗜 𝗔𝗣𝗜 𝗖𝗼𝘀𝘁𝘀 𝗕𝘆 𝟳𝟬%

𝗠𝗖𝗣 𝗗𝗶𝗿𝘁𝘆 𝗦𝗲𝗰𝗿𝗲𝘁: 𝗬𝗼𝘂𝗿 𝗔𝗴𝗲𝗻𝘁 𝗜𝘀 𝗕𝘂𝗿𝗻𝗶𝗻𝗴 𝗧𝗼𝗸𝗲𝗻𝘀

𝗧𝗵𝗲 𝗠𝗖𝗣 𝗖𝗼𝗻𝘁𝗲𝘅𝘁 𝗧𝗮𝘅

Die Halluzinationssteuer: Die versteckten Kosten von 67 Milliarden Dollar

Die wahren Kosten von KI-APIs