๐ช๐ต๐ ๐ฅ๐ฒ๐ด๐ฒ๐ ๐๐ฎ๐ถ๐น๐ฒ๐ฑ ๐ฎ๐ป๐ฑ ๐๐๐ ๐๐๐ป๐ฐ๐๐ถ๐ผ๐ป ๐๐ฎ๐น๐น๐ถ๐ป๐ด ๐ช๐ผ๐ฟ๐ธ๐ฒ๐ฑ
I spent three days writing regex to parse doctor emails. I wrote 400 lines of code. It worked for two formats. A third clinic joined. The code broke.
I needed structured data. I wanted dates, times, and names. The sources were messy. I had emails and Slack messages.
Old methods failed:
- Regex is brittle.
- Keywords miss context.
- NLP models need too much labeled data.
I switched to OpenAI function calling. I define a JSON schema. The model fills the schema. It returns a structured object.
I tested 50 emails. It handled these:
- Different date formats.
- Varied time formats.
- Missing fields.
- Different address styles.
Accuracy was 92%. The errors were relative dates like next Monday. I added the current date to the prompt. This fixed the issue.
There are trade-offs:
- Each call costs about 0.1 cents.
- Response time is 1 to 2 seconds.
- It is not for real-time typing.
But it saves hours of debugging.
Tips for your setup:
- Use a strict schema.
- Add examples to your system prompt.
- Validate the output.
- Provide today's date for context.
This works for invoices or meeting notes too. Use it for any messy text. Stop fighting with regex.
Source: https://dev.to/__c1b9e06dc90a7e0a676b/why-my-regex-based-parser-failed-and-how-llm-function-calling-saved-me-e01 Optional learning community: https://t.me/GyaanSetuAi