The Model Doesn't Remember. You Do.
Large Language Models (LLMs) do not have memory between requests.
I used to think every chat session stored its own context. I was wrong. When you talk to an LLM, it does not remember your last question unless you send it back.
The model is stateless. This means every request is a fresh start.
To create a conversation, you must manage the history yourself. You do this by sending an array of all previous messages with every new request.
The "memory" is just a list of messages:
- User: Hello.
- Assistant: Hi there!
- User: How are you?
If you do not include the first two lines in your next request, the model will not know you already said hello.
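As a minimal sketch, the list above could be held in an array of role/content pairs. The Message type and the withUserTurn helper are hypothetical names chosen for illustration; the role/content shape follows the convention most chat APIs use.

```typescript
// Hypothetical types for illustration; role/content mirrors common chat APIs.
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// The entire "memory" of the conversation so far.
const history: Message[] = [
  { role: "user", content: "Hello." },
  { role: "assistant", content: "Hi there!" },
];

// Returns a new array with the next user turn appended; the input is not mutated.
function withUserTurn(messages: Message[], content: string): Message[] {
  return [...messages, { role: "user", content }];
}

// What the model sees when the history is sent along.
const withContext = withUserTurn(history, "How are you?");

// What the model sees when the history is dropped: no trace of the greeting.
const withoutContext = withUserTurn([], "How are you?");

console.log(withContext.length); // 3
console.log(withoutContext.length); // 1
```

The only difference between a model that "remembers" the greeting and one that does not is which of these two arrays goes into the request.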
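I learned this by avoiding SDKs. Most developers use tools like the Anthropic SDK to hide this complexity. The SDK handles the headers, authentication, and request formatting for you, which makes it easy to overlook that you are still the one passing the message history on every call.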
If you want to learn how LLM APIs work, use raw fetch calls instead of an abstraction. When you manage the request and response cycle manually, you see every decision.
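Here is a rough sketch of what one turn might look like without an SDK. The endpoint and header names follow Anthropic's Messages API as I understand it; the function names (buildRequest, sendTurn), the FetchLike type, and the max_tokens value are my own assumptions for illustration. The API key and model name are left to the caller.

```typescript
// Sketch only: one conversational turn over raw HTTP, with no SDK.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface HttpRequest {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal structural type so a test double can stand in for the global fetch.
type FetchLike = (url: string, init: HttpRequest) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

const API_URL = "https://api.anthropic.com/v1/messages";

// Builds the request by hand, so every header and field is visible.
function buildRequest(apiKey: string, model: string, messages: Message[]): HttpRequest {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    // max_tokens is an arbitrary illustrative value.
    body: JSON.stringify({ model, max_tokens: 1024, messages }),
  };
}

// Sends the full history plus the new user message and returns the updated history.
async function sendTurn(
  fetchFn: FetchLike,
  apiKey: string,
  model: string,
  history: Message[],
  userText: string,
): Promise<Message[]> {
  const messages: Message[] = [...history, { role: "user", content: userText }];
  const response = await fetchFn(API_URL, buildRequest(apiKey, model, messages));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { content: { type: string; text?: string }[] };
  // Join the text blocks of the reply into one string.
  const reply = data.content
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");
  // Storing the reply is the caller's job; the API keeps nothing between calls.
  return [...messages, { role: "assistant", content: reply }];
}
```

In a runtime with a global fetch, you would pass it as the first argument and feed the returned array back in as the history for the next turn. That loop is the whole conversation.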
This manual control allows you to build advanced strategies later, such as:
- Sliding windows to manage long chats.
- Retrieval Augmented Generation (RAG).
- Semantic search.
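Of the strategies above, the sliding window is the easiest to sketch. This version keeps only the most recent messages and is a simplification: it counts messages, not tokens, and the slidingWindow name and its parameters are hypothetical. It also drops any leading assistant message, on the assumption that the API expects the conversation to open with a user turn.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Keeps at most maxMessages of the most recent messages.
// Leading assistant messages are dropped so the window starts with a user turn.
function slidingWindow(messages: Message[], maxMessages: number): Message[] {
  if (maxMessages <= 0) return [];
  const recent = messages.slice(-maxMessages);
  const firstUser = recent.findIndex((m) => m.role === "user");
  return firstUser === -1 ? [] : recent.slice(firstUser);
}

const longChat: Message[] = [
  { role: "user", content: "Hello." },
  { role: "assistant", content: "Hi there!" },
  { role: "user", content: "How are you?" },
  { role: "assistant", content: "Doing well." },
  { role: "user", content: "What did I say first?" },
];

// Only the last three messages are sent; the greeting falls out of the window.
const windowed = slidingWindow(longChat, 3);
console.log(windowed.map((m) => m.content));
```

With this window the model can no longer answer the final question, because the first message was never sent. That trade-off is yours to make, not the model's.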
Understanding this array is the foundation of AI development. You are the one providing the context. The model only knows what you send.
Source: https://dev.to/marcochavezco/the-model-doesnt-remember-you-do-3mmk