Ollama Structured Outputs In Practice

AI-assisted draft.

Running local LLMs often leads to JSON parsing errors. You tell the model to return JSON only. It still adds markdown fences or extra text. This breaks your code.

Since Ollama 0.3.0, you can fix this using the format parameter. This forces the model to follow a JSON schema. It makes it physically impossible for the model to add extra text or markdown.

I tested this with Gemma4 and Ollama 0.30.7. Here are the results.

The Problem with Natural Text Models are trained for conversation. They want to say "Here is your JSON." Even with strict prompts, they often wrap responses in code blocks. Python's json.loads() fails when it hits those blocks.

The Speed Advantage Using the format parameter is much faster.

Without structured output: 32 seconds
With structured output: 5 seconds

This is a 6.4x speed improvement. The model does not waste time deciding how to format the text. It only generates tokens that fit your schema.

Using Pydantic for Type Safety You do not need to write JSON schemas by hand. Use Pydantic models to generate them automatically.

Define your Pydantic model.
Use model_json_schema() to create the schema.
Pass that schema to Ollama.
Use model_validate_json() to parse and validate the data at once.

This approach is perfect for AI agents. You can use it to decide which tool an agent should call next. If the model tries to invent a tool name that does not exist, Pydantic catches it immediately.

Current Limitations

Deeply nested schemas can sometimes return empty arrays in smaller models.
Optional fields might return empty strings instead of null.
Large schemas use more of your context window.

Best Practices

Use simple extraction for small models.
Use Pydantic for validation and agent tool selection.
Use larger models for complex, nested data.
Add retry logic when Pydantic throws a validation error.

Stop hoping your prompts work. Use structured outputs to make your local LLM pipelines reliable.

Source: https://dev.to/jangwook_kim_e31e7291ad98/ollama-structured-outputs-in-practice-getting-type-safe-json-from-local-llms-with-pydantic-m38

Optional learning community: https://t.me/GyaanSetuAi

Ollama Structured Outputs In Practice

Continue reading

Building a Safe Local AI Coding Agent with Node.js