𝟰𝟮/𝟲𝟬 𝗗𝗮𝘆𝘀 𝗦𝘆𝘀𝘁𝗲𝗺 𝗗𝗲𝘀𝗶𝗴𝗻 𝗤𝘂𝗲𝘀𝘁𝗶𝗼𝗻𝘀
Your AI agent remembers a user's name.
A user asks an agent to book a cheap flight to NYC. Next they ask for hotels under $150 per night. Then they ask for a total trip cost comparison.
By step three, the agent is sending 8,000 tokens of history to the LLM, yet it answers as if it were the first turn of the chat.
You need a memory architecture before you ship this.
Pick one:
A) In-context window: Keep the full history in the prompt. It is simple. It fails after 15 turns or 8,000 tokens.
B) Vector memory store: Embed past turns. Retrieve the best matches by similarity. This fails when a search for "NYC flight" pulls a memory from an old trip instead of the current task.
C) Episodic memory with summarization: Compress old turns into structured summaries. Inject relevant summaries into each request. It is harder to build. It is harder to confuse.
D) Redis session state: Use a structured key-value store. The agent reads and writes explicitly. It is deterministic. The agent must know what to store and when.
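To make the explicit read/write pattern concrete, here is a minimal sketch of structured session state. It is an illustration under assumptions, not a reference design: a plain dict stands in for Redis, and `TripState`, `SessionStore`, `build_prompt`, the `session:` key prefix, and the prices are all hypothetical names and values invented for the example.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TripState:
    """Hypothetical task state for the trip-planning example."""
    destination: Optional[str] = None
    flight_price: Optional[float] = None
    hotel_price_per_night: Optional[float] = None
    nights: int = 1


class SessionStore:
    """Dict-backed stand-in for a key-value store such as Redis (assumption)."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        # Serialize to JSON, as one would before writing to a real store.
        self._data[f"session:{session_id}"] = json.dumps(asdict(state))

    def load(self, session_id):
        raw = self._data.get(f"session:{session_id}")
        # An unknown session starts from an empty state.
        return TripState(**json.loads(raw)) if raw else TripState()


def build_prompt(state, user_message):
    """Inject the compact task state instead of the full chat history."""
    facts = {k: v for k, v in asdict(state).items() if v is not None}
    return f"Known task state: {json.dumps(facts, sort_keys=True)}\nUser: {user_message}"


store = SessionStore()

# Turn 1: the agent records the chosen flight (made-up price).
state = store.load("u1")
state.destination = "NYC"
state.flight_price = 220.0
store.save("u1", state)

# Turn 2: the agent records the chosen hotel (made-up price under $150).
state = store.load("u1")
state.hotel_price_per_night = 140.0
state.nights = 3
store.save("u1", state)

# Turn 3: the cost comparison reads stored facts, not 8,000 tokens of history.
state = store.load("u1")
total = state.flight_price + state.hotel_price_per_night * state.nights
prompt = build_prompt(state, "Compare the total trip cost.")
```

The prompt for turn three carries a few dozen tokens of state. The trade-off named above still applies: the agent, or the code around it, has to decide which fields to write on each turn.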
One option fails after 15 turns. One retrieves the wrong context at the wrong time. One is the correct choice for task-oriented agents.
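The "wrong context at the wrong time" failure can be reproduced with a toy example. This sketch assumes a bag-of-words vector in place of a real embedding model, and the `memories` list, the `task` tags, and the `retrieve` helper are hypothetical. It only shows the shape of the problem: similarity alone does not know which task is current.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical stored memories, each tagged with the task it belongs to.
memories = [
    {"task": "trip-old", "text": "nyc flight booked for last year trip"},
    {"task": "trip-now", "text": "user wants a cheap flight to nyc this week"},
    {"task": "trip-now", "text": "hotels must be under 150 per night"},
]


def retrieve(query, task=None):
    """Return the best-matching memory, optionally restricted to one task."""
    pool = [m for m in memories if task is None or m["task"] == task]
    q = embed(query)
    return max(pool, key=lambda m: cosine(q, embed(m["text"])))


# Similarity alone picks the shorter, older memory about an NYC flight.
unscoped = retrieve("nyc flight")

# Filtering by the current task before ranking returns the relevant memory.
scoped = retrieve("nyc flight", task="trip-now")
```

Here `unscoped` comes from the old trip and `scoped` from the current one. Scoping retrieval by task is one possible mitigation, and it already pushes the design toward explicit, structured state.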
Pick A, B, C, or D. Tell me if you faced this in production.
I share the full breakdown in the comments.
Source: https://dev.to/thejoud1997/4260-days-system-design-questions-4018
Optional learning community: https://t.me/GyaanSetuAi