𝗧𝗵𝗲 𝗠𝗼𝗱𝗲𝗹 𝗗𝗼𝗲𝘀𝗻'𝘁 𝗥𝗲𝗺𝗲𝗺𝗯𝗲𝗿. 𝗬𝗼𝘂 𝗗𝗼.
I used to think Large Language Models (LLMs) had memory.
I thought each chat session stored its own context. I was wrong.
LLMs are stateless. The model remembers nothing from one request to the next.
When you see a chat history, you are looking at an array of messages. To continue a conversation, you must send that entire array back to the model with every new prompt.
If you use an SDK, this process often stays hidden. The SDK handles the complexity for you.
If you use raw fetch, you see everything. You manage the headers, the body, and the message array yourself.
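Here is a rough sketch of what "managing it yourself" can look like. The endpoint, the model name, the Bearer-token header, and the buildChatRequest helper are all placeholders I made up for illustration; a real provider will have its own URL, auth scheme, and body shape.

```typescript
// Shape of one chat message. The role/content pair is a common convention,
// assumed here rather than taken from a specific provider.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Everything a raw HTTP call needs, kept as plain data so it is easy to inspect.
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical endpoint and model name, for illustration only.
const API_URL = "https://api.example.com/v1/chat";
const MODEL = "example-model";

// Build the headers, the body, and the message array by hand.
function buildChatRequest(messages: ChatMessage[], apiKey: string): ChatRequest {
  return {
    url: API_URL,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
    },
    // The full history travels in the body on every call.
    body: JSON.stringify({ model: MODEL, messages }),
  };
}

// Sending it would look roughly like this (not run here, the URL is a placeholder):
// const req = buildChatRequest(history, apiKey);
// const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```

Nothing in the request points to an earlier call. If a message is not in the array, it is not in the request.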
This is how context works:
• You send a message.
• The model responds.
• You save both messages in an array.
• You send the whole array back for the next question.
The model only knows what you send in the current request. Everything else is gone.
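The loop above can be sketched in a few lines. This is a minimal sketch, not a real client: ask, SendFn, and fakeModel are hypothetical names, and fakeModel stands in for an actual HTTP call so the example runs without a network.

```typescript
// Message shape assumed for illustration, not tied to one provider.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical stand-in for a model call: it receives the full history
// and resolves to the assistant's reply text.
type SendFn = (history: Message[]) => Promise<string>;

// One turn of the conversation: append the question, send everything,
// then append the reply. Returns a new array and leaves the input untouched.
async function ask(history: Message[], question: string, send: SendFn): Promise<Message[]> {
  const withQuestion: Message[] = [...history, { role: "user", content: question }];
  const reply = await send(withQuestion);
  return [...withQuestion, { role: "assistant", content: reply }];
}

// Fake model for demonstration: it can only report on the array it was handed.
const fakeModel: SendFn = async (history) =>
  `I can see ${history.length} message(s) in this request.`;

// Usage: the caller owns the history and passes it back in on every turn.
async function demo(): Promise<Message[]> {
  let history: Message[] = [];
  history = await ask(history, "My name is Ana.", fakeModel);
  history = await ask(history, "What is my name?", fakeModel);
  return history;
}
```

If you pass an empty array on the second turn instead of history, the fake model sees one message, not three. A real model behaves the same way: it cannot answer "What is my name?" unless the earlier turn is in the request.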
Understanding this array is the foundation of AI development. It is the starting point for advanced methods like RAG, sliding windows, and semantic search.
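To make one of those concrete, here is a minimal sketch of a sliding window, assuming the simplest possible policy: keep every system message and only the last few other messages. The slidingWindow name and the maxRecent parameter are my own; real implementations often count tokens rather than messages.

```typescript
// Message shape assumed for illustration.
interface WindowMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keep all system messages plus the most recent `maxRecent` other messages.
function slidingWindow(history: WindowMessage[], maxRecent: number): WindowMessage[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  // slice(-0) would return the whole array, so handle zero explicitly.
  const recent = maxRecent > 0 ? rest.slice(-maxRecent) : [];
  return [...system, ...recent];
}
```

You would call this on the array right before building each request. Anything the window drops is something the model no longer sees.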
If you want to build reliable AI tools, stop relying on abstractions. Look at the raw requests. Control the history yourself.
Full post: https://dev.to/marcochavezco/the-model-doesnt-remember-you-do-38jk