𝗜 𝗧𝗿𝗶𝗲𝗱 𝗧𝗼 𝗔𝗱𝗱 𝗔𝗜 𝗖𝗵𝗮𝘁 𝗧𝗼 𝗠𝘆 𝗔𝗽𝗽 𝗔𝗻𝗱 𝗛𝗶𝘁 𝗔 𝗪𝗮𝗹𝗹

Translated for your language. Read the original.

AI-assisted draft.

நேற்று முன் தினம்2min read

I tried to add an AI chat assistant to my project management tool. I wanted users to ask questions about overdue tasks or meeting notes. It seemed easy. I thought I would just call an API and finish. I was wrong.

After 15 messages, the AI became slow and incoherent. The API started throwing errors because the conversation was too long. I used GPT-4 with an 8k token limit. Every message included long descriptions and notes. The history grew too fast.

I tried three different fixes:

Truncating history: I kept only the last few messages. This saved speed but the AI forgot everything else.
Summarization: I asked an AI to summarize the chat every 5 messages. This helped memory but increased my costs and latency.
Relevance scoring: I tried to keep only the most relevant messages. This required a vector store and added too much complexity.

I realized I needed a better strategy. I settled on two methods: streaming and a fixed context window.

Streaming makes the app feel fast. Users see text appear instantly instead of waiting for the full reply. I used Server-Sent Events to send chunks of text as they arrive.

I also split my context into three parts:

System prompt: A fixed set of instructions.
Dynamic context: Recent project updates and task states.
Conversation history: A sliding window of recent messages.

I do not send the whole history every time. I only send enough to answer the current question. This reduced my payload size by 40%. It saved me money and improved speed.

If you build AI features, remember: Streaming buys you speed. A good context strategy buys you intelligence.

How do you manage conversation memory in your apps? Do you use sliding windows or summarization?

Source: https://dev.to/__c1b9e06dc90a7e0a676b/i-tried-to-add-ai-chat-to-my-app-and-hit-a-wall-with-context-tokens-459b

𝗜 𝗧𝗿𝗶𝗲𝗱 𝗧𝗼 𝗔𝗱𝗱 𝗔𝗜 𝗖𝗵𝗮𝘁 𝗧𝗼 𝗠𝘆 𝗔𝗽𝗽 𝗔𝗻𝗱 𝗛𝗶𝘁 𝗔 𝗪𝗮𝗹𝗹

Continue reading

AI-க்கான உரையாடல் சூழல் மேலாண்மை

𝗛𝗼𝘄 𝗜 𝗦𝘁𝗼𝗽𝗽𝗲𝗱 𝗠𝘆 𝗔𝗜 𝗙𝗲𝗮𝘁𝘂𝗿𝗲 𝗳𝗿𝗼𝗺 𝗗𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝗠𝘆 𝗪𝗮𝗹𝗹𝗲𝘁

ஸ்ட்ரீமிங் மற்றும் கேச்சிங் மூலம் AI லேட்டன்சியை (Latency) நான் எவ்வாறு சரி செய்தேன்

மனநிலை மாறாமல் ஒரு ஸ்ட்ரீமிங் AI சாட் கிளையன்ட்டை நான் உருவாக்கினேன்

எனது செயலியில் AI சாட்டைச் சேர்க்க முயன்றேன், ஆனால் ஒரு பெரிய தடையைச் சந்தித்தேன்