The Messages Array And Its Costs
This is part 3 of Building TinyAgent. We build an agent in Node.js. We use no frameworks. We use only API calls.
LLM chat APIs have no memory. They are stateless. You must keep a messages array yourself. You send this entire array on every call.
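Here is a minimal sketch of that loop. It assumes a hypothetical `callModel` function standing in for the real API call. The stub only echoes, so the example runs offline. The names `sendTurn` and `sentCount` are made up for illustration.

```javascript
// Hypothetical stand-in for a real chat API call; swap in your provider's client.
// It receives the full history every time, because the API keeps none of it.
async function callModel(messages) {
  const last = messages[messages.length - 1];
  return { role: "assistant", content: `echo: ${last.content}` };
}

// The conversation state lives here, in your process, not on the server.
const messages = [];

async function sendTurn(userText) {
  messages.push({ role: "user", content: userText });
  const sentCount = messages.length;        // size of the payload for this call
  const reply = await callModel(messages);  // the entire array goes out
  messages.push(reply);                     // keep the reply for the next turn
  return { reply, sentCount };
}
```

Each call to `sendTurn` grows the array by two messages. The payload for the next call is always larger than the last.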
This creates a cost trap.
- Turn 1: You send 1 message.
- Turn 10: You send 19 messages.
- Turn 30: You send 59 messages.
Turn 30 sends 59 times as many messages as turn 1. Every call costs more than the one before it. So the total bill does not move in a straight line. It curves upward.
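A quick sketch of the arithmetic. It assumes each turn adds one user message and one assistant reply, which matches the counts listed above. It counts messages, not tokens. The function names are made up for illustration.

```javascript
// Messages sent in a single call at a given turn:
// all earlier user/assistant pairs plus the new user message.
function messagesSentAtTurn(turn) {
  return 2 * turn - 1;
}

// Messages sent across the whole conversation up to and including `turns`.
function totalMessagesSent(turns) {
  let total = 0;
  for (let t = 1; t <= turns; t++) total += messagesSentAtTurn(t);
  return total; // sum of the first n odd numbers, which is n * n
}

for (const turn of [1, 10, 30]) {
  console.log(turn, messagesSentAtTurn(turn), totalMessagesSent(turn));
}
// prints:
// 1 1 1
// 10 19 100
// 30 59 900
```

The per-call count grows linearly. The running total grows with the square of the turn number. That is the upward curve.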
Dev sessions are short. Real users have long chats. So production bills run higher than dev testing suggests.
Pick one of these three patterns:
- Full history. Send everything. Simple but expensive. Use for short chats.
- Sliding window. Keep the last 10 messages. Costs stay flat. Agent forgets early turns.
- Summarization. Use a cheap model to compress old turns. Context survives. Costs stay low.
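The last two patterns can be sketched like this. It is one possible shape, not a definitive implementation. `slidingWindow`, `compactHistory`, and `keepRecent` are hypothetical names. `summarize` is an assumed async function you supply, for example a call to a cheaper model.

```javascript
// Sliding window: keep only the most recent messages.
function slidingWindow(messages, maxMessages = 10) {
  if (maxMessages <= 0) return [];
  const recent = messages.slice(-maxMessages);
  // Some APIs expect the history to open with a user message,
  // so drop any leading assistant reply left by the cut.
  while (recent.length > 0 && recent[0].role !== "user") recent.shift();
  return recent;
}

// Summarization: compress older turns into one message, keep recent ones verbatim.
// `summarize(oldMessages)` is hypothetical and must resolve to a string.
async function compactHistory(messages, summarize, keepRecent = 6) {
  if (messages.length <= keepRecent) return messages;
  const cut = messages.length - keepRecent;
  const old = messages.slice(0, cut);
  const recent = messages.slice(cut);
  const summary = await summarize(old);
  // Placing the summary in a user message is one choice among several;
  // it could also go into the system prompt.
  return [
    { role: "user", content: `Summary of the earlier conversation: ${summary}` },
    ...recent,
  ];
}
```

Pass the result of either function to the API instead of the full array. Keep the full array around if you want to log it or rebuild the summary later.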
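Anthropic offers a bonus: prompt caching. Mark a prefix as cacheable. When that prefix is read back from the cache, you pay 10 percent of the normal input price for it. Writing it to the cache the first time costs a little more than normal input. Long system prompts become cheap.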
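A sketch of what marking a prefix looks like. It only builds the request body for Anthropic's Messages API and does not send it. The model name is a placeholder, not a real model ID. `buildCachedRequest` and `estimateInputCost` are hypothetical helpers. The estimate assumes the 10 percent figure above and ignores the cache-write surcharge.

```javascript
// Builds a Messages API request body with a cacheable system prompt.
// "your-model-name" is a placeholder; use a real model ID from the docs.
function buildCachedRequest(systemPrompt, messages, model = "your-model-name") {
  return {
    model,
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: systemPrompt,
        // Marks the prompt up to and including this block as a cacheable prefix.
        cache_control: { type: "ephemeral" },
      },
    ],
    messages,
  };
}

// Rough input-cost estimate: cached tokens billed at 10 percent of the
// normal input price, fresh tokens at full price.
function estimateInputCost(cachedTokens, freshTokens, pricePerToken) {
  return (cachedTokens * 0.1 + freshTokens) * pricePerToken;
}
```

Very short prompts may fall under the minimum cacheable length, so check the provider docs before counting on the discount.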
Measure your real conversations. Then pick your strategy.
Next: We give the agent functions. It will do things instead of only talking.
Source: https://dev.to/jasmin/the-messages-array-in-4-gifs-1k1j