๐ฆ๐๐ผ๐ฝ ๐๐ถ๐ด๐ต๐๐ถ๐ป๐ด ๐ฅ๐ฒ๐ด๐ฒ๐ ๐ช๐ถ๐๐ต ๐๐๐ ๐
I spent three days building a regex monster. It had 47 patterns. One missing space broke everything. I wanted to throw my laptop.
I tried a large language model. I used JSON mode. I told the model what fields I wanted.
I needed three things from emails:
- Order ID
- Intent
- SKU
Regex worked for 20% of emails. Real data is messy. Some orders had letters. Some SKUs were written differently.
I used gpt-4o-mini. It is fast. It is cheap. A few lines of code replaced my 47 patterns.
Follow these steps for reliability:
- Validate the JSON schema.
- Add 3 examples to your prompt.
- Log every failure.
LLMs are not for every task. Use regex for CSV files. Be aware of latency. Check your privacy rules.
Prompting is the new regex. It is easier to maintain. You change a prompt in seconds. Changing regex often breaks other things.
Start with the cheapest model. Mix regex for simple parts. Use LLMs for messy text.
Optional learning community: https://t.me/GyaanSetuAi