𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝗱 𝗢𝘂𝘁𝗽𝘂𝘁 𝗳𝗿𝗼𝗺 𝗟𝗟𝗠𝘀

📅1 day ago⏱1 min read

LLMs generate tokens, not data structures.

You ask for JSON. The model gives you valid JSON. Then, it adds a conversational sentence at the end. Your parser fails. Your pipeline crashes. Your system breaks at 2 AM.

To build production systems, you must enforce schemas at the token level.

There are three ways to do this:

Prompt-only JSON You tell the model to output JSON. This works about 85% to 95% of the time. It fails because the prompt is just a suggestion. It does not stop the model from adding extra text or missing braces. Use this only for prototyping.
API-level JSON mode and Function Calling Providers like OpenAI, Anthropic, and Gemini use this. They validate tokens during generation. This ensures the output matches your schema. It is the standard for most production apps. It has very low latency.
Grammar-constrained decoding This is for local or self-hosted models. Tools like Outlines or llama.cpp modify the probability of every token. If a token violates your schema, the system masks it out. The model cannot pick an invalid character. This is the most reliable method.

When to use each:

• Prompt-only: Quick scripts and testing. • API-level: Production apps using cloud models. • Grammar-constrained: Self-hosted models and sensitive data.

Key takeaways:

Token masking is better than resampling. Masking prevents errors before they happen.
Grammar compilation adds latency. Cache your schemas to save time.
Avoid constraints for creative writing. Constraints reduce diversity in text.

If you need 99.9% reliability, do not rely on prompts alone. Move your enforcement to the token level.

Source: https://dev.to/tech_nuggets/structured-output-from-llms-json-mode-function-calling-and-grammar-constrained-decoding-355d

Optional learning community: https://t.me/GyaanSetuAi

𝗦𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲𝗱 𝗢𝘂𝘁𝗽𝘂𝘁 𝗳𝗿𝗼𝗺 𝗟𝗟𝗠𝘀

Continue reading

𝗟𝗮𝗿𝗴𝗲 𝗟𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗠𝗼𝗱𝗲𝗹𝘀 𝗘𝘅𝗽𝗹𝗮𝗶𝗻𝗲𝗱 𝗦𝗶𝗺𝗽𝗹𝘆

𝗙𝗶𝘅𝗶𝗻𝗴 𝗝𝗦𝗢𝗡 𝗢𝘂𝘁𝗽𝘂𝘁 𝗳𝗿𝗼𝗺 𝗚𝗣𝗧

𝗙𝗜𝗫𝗜𝗡𝗚 𝗝𝗦𝗢𝗡 𝗢𝗨𝗧𝗣𝗨𝗧 𝗙𝗥𝗢𝗠 𝗚𝗣𝗧

𝗩𝗮𝗹𝗶𝗱𝗮𝘁𝗲 𝗣𝘆𝗱𝗮𝗻𝘁𝗶𝗰 𝗦𝗰𝗵𝗲𝗺𝗮𝘀 𝗕𝗲𝗳𝗼𝗿𝗲 𝗟𝗟𝗠 𝗖𝗮𝗹𝗹𝘀

𝗜 𝗙𝗶𝘅𝗲𝗱 𝗟𝗟𝗠 𝗠𝗮𝗿𝗸𝗱𝗼𝘄𝗻 𝗘𝗿𝗿𝗼𝗿𝘀 𝘄𝗶𝘁𝗵 𝗝𝗶𝗻𝗷𝗮𝟮 𝗮𝗻𝗱 𝗔𝗦𝗧 𝗣𝗮𝗿𝘀𝗶𝗻𝗴