Email Triage Taxonomies for LLM Classification
The most important part of an email classifier is not the model. It is the label set.
Most people focus on prompt phrasing. They forget that the labels do the heavy lifting. If you get the taxonomy right, a cheap model works well. If you get it wrong, no model can save you.
A successful email taxonomy should follow these rules:
- Use four categories. Three tend to lose detail that changes how you respond. Five or more tend to blur the boundaries between labels.
- Map labels to actions. Define each label by the response it obligates, not by the topic of the email.
- Define labels with examples. Use concrete instances instead of adjectives.
- Keep input small. Use the sender, subject, and a short snippet.
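As a rough sketch of the last two rules, the prompt below defines each label by a concrete instance and passes only the sender, subject, and a truncated snippet. The label names are borrowed from the four-part structure described next. The `Email` class, `build_prompt`, the example subjects, and the 300-character cap are all assumptions made for illustration, not part of the original article.

```python
from dataclasses import dataclass

MAX_SNIPPET_CHARS = 300  # assumed cap; tune it for your own mail

# Hypothetical example instances; replace them with real ones from your inbox.
LABEL_EXAMPLES = {
    "URGENT": ["Subject: checkout API returning 500s in production"],
    "ACTION": ["Subject: please review my pull request today"],
    "FYI": ["Subject: minutes from Tuesday's planning meeting"],
    "NOISE": ["Subject: 20% off all plans this week"],
}


@dataclass
class Email:
    sender: str
    subject: str
    body: str


def build_prompt(email: Email) -> str:
    """Build a compact classification prompt from sender, subject, and snippet."""
    lines = ["Classify the email into exactly one label.", ""]
    for label, examples in LABEL_EXAMPLES.items():
        lines.append(f"{label}:")
        lines.extend(f"  e.g. {example}" for example in examples)
    # Collapse whitespace, then keep only the start of the body.
    snippet = " ".join(email.body.split())[:MAX_SNIPPET_CHARS]
    lines += [
        "",
        f"From: {email.sender}",
        f"Subject: {email.subject}",
        f"Snippet: {snippet}",
        "",
        "Label:",
    ]
    return "\n".join(lines)
```

Calling `build_prompt(Email("ops@example.com", "Disk alert", long_body))` returns a prompt whose size stays bounded no matter how long the original message is.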
Consider this four-part structure:
- URGENT: Production incidents or executive requests. Reply within 1 hour.
- ACTION: Code reviews or follow-ups. Reply the same day.
- FYI: Information only. No response needed.
- NOISE: Newsletters or marketing. Archive it.
Each label maps to one specific behavior. If two labels lead to the same action, merge them. If one label leads to two different actions, split it.
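One way to make the one-label, one-action rule checkable is to keep the mapping in code. This is a minimal sketch: the action strings are hypothetical handler names, and `merge_candidates` is an illustrative helper rather than anything from the original article.

```python
from collections import defaultdict
from enum import Enum


class Label(Enum):
    URGENT = "URGENT"
    ACTION = "ACTION"
    FYI = "FYI"
    NOISE = "NOISE"


# One action per label. The action names are hypothetical identifiers.
ACTIONS = {
    Label.URGENT: "reply_within_1_hour",
    Label.ACTION: "reply_same_day",
    Label.FYI: "no_response",
    Label.NOISE: "archive",
}


def merge_candidates(actions):
    """Return groups of labels that share an action and could be merged."""
    by_action = defaultdict(list)
    for label, action in actions.items():
        by_action[action].append(label)
    return [labels for labels in by_action.values() if len(labels) > 1]


assert set(ACTIONS) == set(Label)       # every label has an action
assert merge_candidates(ACTIONS) == []  # no two labels share an action
```

The dictionary can only detect the merge case. A label that needs two different actions cannot be expressed in this structure at all, which is itself the signal to split it.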
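This approach makes agents predictable. You can run them on a schedule without constant supervision. Use a temperature of 0 for classification so the output is as repeatable as possible. Many hosted models still show small run-to-run variation at 0, so treat the result as near-deterministic rather than guaranteed. Use a higher temperature for drafting to get natural prose.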
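The two temperature settings can be kept apart with a small per-task configuration. In this sketch only the temperature of 0 for classification comes from the text. The drafting temperature of 0.7, the token limits, and the names `SamplingConfig` and `config_for` are assumptions, and the values would be passed to whichever model client you actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingConfig:
    temperature: float
    max_tokens: int


# 0.0 follows the text; 0.7 and both token limits are illustrative guesses.
CLASSIFY = SamplingConfig(temperature=0.0, max_tokens=5)
DRAFT = SamplingConfig(temperature=0.7, max_tokens=400)


def config_for(task: str) -> SamplingConfig:
    """Look up sampling settings by task name, failing loudly on unknown tasks."""
    configs = {"classify": CLASSIFY, "draft": DRAFT}
    if task not in configs:
        raise ValueError(f"unknown task: {task!r}")
    return configs[task]
```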
Do not use free-form tags. Every new tag creates a new code path you must test. A closed vocabulary makes your system easy to audit and scale.
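A closed vocabulary is easy to enforce at the point where the model's reply enters your code. The sketch below assumes the model is asked to reply with a single label. `parse_label` is a hypothetical helper, and the normalization it applies (trimming whitespace, quotes, and a trailing period) is one reasonable choice among several.

```python
ALLOWED_LABELS = frozenset({"URGENT", "ACTION", "FYI", "NOISE"})


def parse_label(raw_output: str) -> str:
    """Normalize a model reply and reject anything outside the closed vocabulary."""
    candidate = raw_output.strip().strip(".\"'").upper()
    if candidate not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {raw_output!r}")
    return candidate
```

Because every accepted value is one of four strings, the downstream code has exactly four paths to test. What to do with a rejected reply, such as retrying or routing the email to manual review, is left to the caller.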
Try this exercise: Take your last 50 emails. Label them using these four buckets. Note where you feel hesitation. Those gaps show where your definitions need more examples.
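If you record the exercise as data, the hesitation points can be tallied instead of remembered. This is a small sketch: the records are made-up placeholders standing in for your own notes, and `hesitation_pairs` is an illustrative helper.

```python
from collections import Counter

# Made-up placeholder records: (label you chose, label you nearly chose or None).
hand_labels = [
    ("ACTION", "URGENT"),
    ("FYI", None),
    ("ACTION", "URGENT"),
    ("NOISE", "FYI"),
    ("URGENT", "ACTION"),
]


def hesitation_pairs(records):
    """Count unordered label pairs where labeling involved hesitation."""
    pairs = Counter()
    for chosen, runner_up in records:
        if runner_up is not None and runner_up != chosen:
            pairs[frozenset((chosen, runner_up))] += 1
    return pairs


for pair, count in hesitation_pairs(hand_labels).most_common():
    print(sorted(pair), count)
```

The pair with the highest count marks the boundary whose definitions most need additional examples.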
Source: https://dev.to/qasim157/email-triage-taxonomies-for-llm-classification-3o1j