Crazyrouter Team
April 8, 2026

AI Structured Output Guide 2026: JSON Mode Across OpenAI, Claude, and Gemini#

Getting consistent, parseable structured output is one of the most common developer pain points with LLMs. JSON mode, structured outputs, and schema enforcement have all matured significantly; this guide covers how each works across OpenAI, Claude, and Gemini in 2026.

Why Structured Output Matters#

Without reliable JSON output, every LLM integration needs brittle regex parsing, retry logic, and constant prompt tweaking. With proper structured output:

  • Parse responses directly without text cleaning
  • Integrate LLM outputs into databases and APIs reliably
  • Build deterministic workflows on top of non-deterministic models
  • Reduce hallucinated or malformed data

The Three Approaches to Structured Output#

| Approach | Reliability | Flexibility | Support |
|---|---|---|---|
| 1. JSON Mode (hint only) | ⭐⭐⭐ | High | OpenAI, Gemini, most models |
| 2. Structured Outputs (schema-enforced) | ⭐⭐⭐⭐⭐ | Medium | OpenAI GPT-5, Gemini 3 |
| 3. Prompt Engineering (no enforcement) | ⭐⭐ | Highest | All models |

Approach 1: JSON Mode#

JSON mode tells the model to output valid JSON, but doesn't enforce a specific schema. It's widely supported and reliable for well-defined prompts.
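
Because JSON mode only promises syntactically valid JSON, the reply can still be missing fields you expected. The sketch below shows one way to check for that after parsing. The helper name `missing_keys` and the `REQUIRED_KEYS` set are illustrative assumptions, not part of any SDK.

```python
import json

# Assumed field list, taken from the extraction example in this guide
REQUIRED_KEYS = {"name", "title", "company", "email", "phone"}

def missing_keys(raw: str, required: set[str] = REQUIRED_KEYS) -> set[str]:
    """Return the required keys that are absent from a JSON-mode reply."""
    data = json.loads(raw)  # JSON mode should make this step succeed
    if not isinstance(data, dict):
        return set(required)  # valid JSON, but not an object
    return set(required) - set(data)

# Valid JSON that still misses part of the expected shape
raw = '{"name": "John Smith", "email": "john@acme.com"}'
print(sorted(missing_keys(raw)))
```

A non-empty result is a reasonable trigger for a retry or a fallback to a schema-enforced call.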

OpenAI JSON Mode#

python
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

response = client.chat.completions.create(
    model="gpt-5-mini",
    response_format={"type": "json_object"},  # Enable JSON mode
    messages=[
        {
            "role": "system",
            "content": "You are a data extractor. Always respond with valid JSON."
        },
        {
            "role": "user",
            "content": """Extract the following from this text and return as JSON:
            
Text: "John Smith, senior engineer at Acme Corp, can be reached at john@acme.com or +1-555-0123."

Return: {"name": "...", "title": "...", "company": "...", "email": "...", "phone": "..."}"""
        }
    ]
)

import json
data = json.loads(response.choices[0].message.content)
print(data)
# {"name": "John Smith", "title": "senior engineer", "company": "Acme Corp", ...}

Gemini JSON Mode#

python
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "user",
            "content": "List 3 Python web frameworks as JSON: [{name, stars, use_case}]"
        }
    ]
)

data = json.loads(response.choices[0].message.content)

Claude JSON Mode (via prompt)#

Claude doesn't have a native json_object response format in the API, but responds reliably with prompt engineering:

python
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {
            "role": "user",
            "content": """Analyze the sentiment of this review and return ONLY valid JSON:

Review: "The product arrived quickly but the build quality is disappointing."

Required JSON format:
{
  "sentiment": "positive|negative|mixed",
  "score": 0.0-1.0,
  "aspects": [{"aspect": "...", "sentiment": "..."}],
  "summary": "..."
}"""
        }
    ]
)

# Claude respects JSON-only instructions reliably
content = response.choices[0].message.content
# May need to strip markdown code blocks:
if content.startswith("```"):
    content = content.split("```")[1]
    if content.startswith("json"):
        content = content[4:]

data = json.loads(content.strip())
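
The fence-stripping above assumes the reply begins with a code fence. If the model adds a sentence before the fence, a regular expression is a slightly more forgiving option. This is a minimal sketch; `parse_json_reply` and `FENCE_RE` are hypothetical names, and it only handles fences whose content starts on a new line.

```python
import json
import re

# Matches a fenced block with an optional language tag, e.g. ```json ... ```
FENCE_RE = re.compile(r"```(?:[a-zA-Z0-9_-]+)?\s*\n(.*?)```", re.DOTALL)

def parse_json_reply(content: str):
    """Parse JSON from a reply that may be wrapped in a markdown code fence."""
    match = FENCE_RE.search(content)
    payload = match.group(1) if match else content  # fall back to the raw text
    return json.loads(payload.strip())

fenced = 'Here you go:\n```json\n{"sentiment": "mixed", "score": 0.4}\n```'
plain = '{"sentiment": "mixed", "score": 0.4}'
print(parse_json_reply(fenced) == parse_json_reply(plain))
```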

Approach 2: Structured Outputs (Schema-Enforced)#

OpenAI's Structured Outputs (available with GPT-5 series) constrain generation to match a JSON Schema exactly. This is the gold standard for reliability.

OpenAI Structured Outputs with Pydantic#

python
from openai import OpenAI
from pydantic import BaseModel
from typing import List, Optional

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

# Define your expected schema
class JobCandidate(BaseModel):
    name: str
    years_experience: int
    skills: List[str]
    education: str
    salary_expectation: Optional[int] = None
    available: bool

class ResumeAnalysis(BaseModel):
    candidates: List[JobCandidate]
    top_pick: str
    reasoning: str

# Placeholder input; replace with your own resume text
resume_text = "..."

# Parse resumes with a schema-constrained response
response = client.beta.chat.completions.parse(
    model="gpt-5-2",  # Use a model that supports Structured Outputs
    messages=[
        {
            "role": "system",
            "content": "Extract candidate information from resumes."
        },
        {
            "role": "user",
            "content": f"Analyze these resumes and rank the candidates:\n{resume_text}"
        }
    ],
    response_format=ResumeAnalysis,
)

# Fully typed, validated output
analysis = response.choices[0].message.parsed
print(f"Top pick: {analysis.top_pick}")
for candidate in analysis.candidates:
    print(f"- {candidate.name}: {candidate.years_experience} years, {candidate.skills}")
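
Even with schema enforcement, the parsed object can be absent, for instance when the model refuses the request. In the OpenAI Python SDK the parsed message carries a `refusal` field next to `parsed`; the sketch below assumes the gateway passes both through unchanged. `unwrap_parsed` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the message shape so the example runs offline.

```python
from types import SimpleNamespace

def unwrap_parsed(message):
    """Return the parsed object, or raise if the model refused or nothing was parsed."""
    refusal = getattr(message, "refusal", None)
    if refusal:
        raise RuntimeError(f"Model refused: {refusal}")
    parsed = getattr(message, "parsed", None)
    if parsed is None:
        raise ValueError("No parsed output (for example, a truncated response)")
    return parsed

# Stand-in objects that mimic a parsed message (illustration only)
ok = SimpleNamespace(refusal=None, parsed={"top_pick": "Ada"})
refused = SimpleNamespace(refusal="I can't help with that.", parsed=None)

print(unwrap_parsed(ok)["top_pick"])
```

In the example above, this would be called as `unwrap_parsed(response.choices[0].message)` before reading any fields.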

Structured Outputs with Raw JSON Schema#

python
response = client.chat.completions.create(
    model="gpt-5-2",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "food", "other"]
                    },
                    "price": {"type": "number"},
                    "features": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "in_stock": {"type": "boolean"}
                },
                "required": ["product_name", "category", "price", "features", "in_stock"],
                "additionalProperties": False
            }
        }
    },
    messages=[
        {"role": "user", "content": "Analyze this product: iPhone 16 Pro, $999, available now, 48MP camera, titanium design"}
    ]
)
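
With `strict` enabled, OpenAI expects every object in the schema to list all of its properties under `required` and to set `additionalProperties` to false, as the schema above does. A small pre-flight check can catch violations before the request is sent. This is a sketch covering only those two rules; `strict_mode_problems` is a hypothetical helper name.

```python
def strict_mode_problems(schema: dict, path: str = "$") -> list[str]:
    """List places where an object schema is not closed and fully required."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        for name, sub in props.items():
            problems.extend(strict_mode_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_mode_problems(schema["items"], f"{path}[]"))
    return problems

# A schema that breaks both rules at the top level
schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["product_name"],
}
for problem in strict_mode_problems(schema):
    print(problem)
```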

Provider Comparison for Reliability#

Let's be practical about which providers are most reliable for structured output:

| Provider | JSON Mode | Schema Enforcement | Reliability | Notes |
|---|---|---|---|---|
| OpenAI GPT-5.2 | ✅ | ✅ (Structured Outputs) | ⭐⭐⭐⭐⭐ | Best-in-class |
| OpenAI GPT-5 Mini | ✅ | ✅ (Structured Outputs) | ⭐⭐⭐⭐⭐ | Fast + reliable |
| Gemini 2.5 Flash | ✅ | ✅ (responseSchema) | ⭐⭐⭐⭐ | Good for Google formats |
| Claude Sonnet 4.5 | Prompt-only | ❌ native | ⭐⭐⭐⭐ | Reliable with prompting |
| Claude Opus 4.6 | Prompt-only | ❌ native | ⭐⭐⭐⭐ | Best with complex schemas |
| DeepSeek V3.2 | | Limited | ⭐⭐⭐ | Good for simple schemas |
| Grok 4.1 Fast | | Limited | ⭐⭐⭐ | Improving with updates |

Gemini Structured Output with Response Schema#

python
# Placeholder input; replace with the article you want to analyze
article_text = "..."

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "news_article",
            "schema": {
                "type": "object",
                "properties": {
                    "headline": {"type": "string"},
                    "topics": {"type": "array", "items": {"type": "string"}},
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "key_entities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "type": {"type": "string"}
                            }
                        }
                    }
                }
            }
        }
    },
    messages=[
        {"role": "user", "content": f"Analyze this news article:\n{article_text}"}
    ]
)
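
When a schema is sent without strict enforcement, checking the reply on the client side is a cheap safeguard. The sketch below validates a small subset of JSON Schema (`type`, `enum`, `required`, `properties`, `items`) using only the standard library. It is not a full validator, the `validate` helper is hypothetical, and the `required` list in the sample schema is an assumption added for illustration.

```python
def validate(value, schema: dict, path: str = "$") -> list[str]:
    """Check a value against a small JSON Schema subset; return error strings."""
    checks = {
        "object": lambda v: isinstance(v, dict),
        "array": lambda v: isinstance(v, list),
        "string": lambda v: isinstance(v, str),
        "boolean": lambda v: isinstance(v, bool),
        # bool is a subclass of int in Python, so exclude it explicitly
        "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    if expected in checks and not checks[expected](value):
        return [f"{path}: expected {expected}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in {schema['enum']}")
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected == "array" and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors

schema = {
    "type": "object",
    "required": ["headline", "sentiment"],  # assumed for this example
    "properties": {
        "headline": {"type": "string"},
        "topics": {"type": "array", "items": {"type": "string"}},
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
    },
}
reply = {"topics": ["ai", 7], "sentiment": "angry"}
for error in validate(reply, schema):
    print(error)
```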

Production Patterns for Reliable JSON Output#

Pattern 1: The Validator-Retry Loop#

python
import json
from pydantic import BaseModel, ValidationError
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

class ExtractedData(BaseModel):
    title: str
    author: str
    date: str
    summary: str

def extract_with_retry(text: str, max_retries: int = 3) -> ExtractedData:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="claude-sonnet-4-5",
                messages=[
                    {
                        "role": "user",
                        "content": f"""Extract information and return ONLY valid JSON matching exactly:
{{
  "title": "article title",
  "author": "author name",  
  "date": "YYYY-MM-DD format",
  "summary": "one sentence summary"
}}

Text: {text}

Return only the JSON object, no other text."""
                    }
                ]
            )
            
            content = response.choices[0].message.content.strip()
            # Clean up potential markdown wrapping
            if "```" in content:
                content = content.split("```")[1]
                if content.startswith("json"):
                    content = content[4:]
            
            data = json.loads(content.strip())
            return ExtractedData(**data)
            
        except (json.JSONDecodeError, ValidationError) as e:
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed: {e}. Retrying...")
    
    raise ValueError("Failed to extract valid JSON after retries")
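
The loop above resends the same prompt on every attempt. One possible refinement is to show the model its failed reply and the parser's error so the next attempt can correct it. This is a sketch of that idea only; `build_retry_messages` is a hypothetical helper, and the prompt wording is an assumption.

```python
import json

def build_retry_messages(prompt: str, bad_reply: str, error: Exception) -> list[dict]:
    """Extend the conversation so the next attempt sees what went wrong."""
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": bad_reply},
        {
            "role": "user",
            "content": (
                f"That reply could not be used: {error}. "
                "Return only the corrected JSON object, no other text."
            ),
        },
    ]

bad_reply = '{"title": "Q3 report", "author": }'  # malformed on purpose
try:
    json.loads(bad_reply)
except json.JSONDecodeError as exc:
    messages = build_retry_messages("Extract title and author as JSON.", bad_reply, exc)
    print(messages[-1]["content"])
```

The returned list can be passed as `messages` on the next attempt in place of the original single-turn prompt.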

Pattern 2: Multi-Provider Fallback for Critical Data#

python
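import json
from openai import AsyncOpenAI

# The awaited calls below need the async client, not the sync OpenAI client
client = AsyncOpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

async def extract_structured_data(text: str, schema: dict) -> dict:
    """Try multiple providers for critical structured extraction."""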
    
    providers = [
        ("gpt-5-2", "openai_structured"),   # Best reliability
        ("gemini-2.5-flash", "gemini"),       # Good fallback
        ("claude-sonnet-4-5", "prompt"),      # Reliable with prompting
    ]
    
    for model, method in providers:
        try:
            if method == "openai_structured":
                response = await client.chat.completions.create(
                    model=model,
                    response_format={
                        "type": "json_schema",
                        "json_schema": {"name": "extraction", "schema": schema, "strict": True}
                    },
                    messages=[{"role": "user", "content": f"Extract data from:\n{text}"}]
                )
            else:
                response = await client.chat.completions.create(
                    model=model,
                    messages=[{
                        "role": "user",
                        "content": f"Extract data and return JSON matching schema {json.dumps(schema)}:\n{text}"
                    }]
                )
            
            content = response.choices[0].message.content
            return json.loads(content)
            
        except Exception as e:
            print(f"Provider {model} failed: {e}")
            continue
    
    raise RuntimeError("All providers failed for structured extraction")
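
The fallback order is easier to test when the provider calls are passed in as plain async callables. The sketch below shows that shape with fake extractors in place of real API calls. `first_valid`, `flaky`, and `steady` are hypothetical names used only for illustration.

```python
import asyncio
import json

async def first_valid(extractors, text: str) -> dict:
    """Try each (name, async callable) in order; return the first reply that parses to an object."""
    failures = []
    for name, extract in extractors:
        try:
            data = json.loads(await extract(text))
            if isinstance(data, dict):
                return {"provider": name, "data": data}
            failures.append(f"{name}: not a JSON object")
        except Exception as exc:  # broad on purpose: any failure moves to the next provider
            failures.append(f"{name}: {exc}")
    raise RuntimeError("All providers failed: " + "; ".join(failures))

# Fake extractors standing in for real API calls (illustration only)
async def flaky(text: str) -> str:
    raise TimeoutError("simulated timeout")

async def steady(text: str) -> str:
    return json.dumps({"length": len(text)})

result = asyncio.run(first_valid([("primary", flaky), ("backup", steady)], "hello"))
print(result)
```

Collecting the failure messages keeps the final error informative when every provider fails.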

Pattern 3: Node.js with Zod Validation#

javascript
import OpenAI from 'openai';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const client = new OpenAI({
  apiKey: process.env.CRAZYROUTER_API_KEY,
  baseURL: 'https://crazyrouter.com/v1',
});

// Define schema with Zod
const ProductSchema = z.object({
  name: z.string(),
  price: z.number().positive(),
  category: z.enum(['electronics', 'clothing', 'food', 'other']),
  inStock: z.boolean(),
  features: z.array(z.string()).min(1),
});

async function extractProduct(description) {
  const jsonSchema = zodToJsonSchema(ProductSchema, 'product');
  
  const response = await client.chat.completions.create({
    model: 'gpt-5-mini',
    response_format: {
      type: 'json_schema',
      json_schema: {
        name: 'product',
        strict: true,
        schema: jsonSchema.definitions.product,
      },
    },
    messages: [
      { role: 'user', content: `Extract product data from: "${description}"` }
    ],
  });
  
  const rawData = JSON.parse(response.choices[0].message.content);
  
  // Validate with Zod
  return ProductSchema.parse(rawData);
}

// Usage
const product = await extractProduct(
  'The Sony WH-1000XM6 headphones cost $299, feature noise cancellation and 30hr battery, in stock.'
);
console.log(product);

Best Practices for Structured Output Prompts#

1. Always show an example in the prompt#

python
# Bad: Vague instruction
"Return the data as JSON"

# Good: Show exact expected format
"""Return exactly this JSON structure (no other text):
{
  "status": "success|error|pending",
  "message": "human readable description",
  "data": {"key": "value"}
}"""
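
To keep the example in the prompt in sync with the shape your code expects, one option is to generate the prompt from a Python dict. This is a minimal sketch; `build_json_prompt` is a hypothetical helper and the task text is made up.

```python
import json

def build_json_prompt(task: str, example: dict) -> str:
    """Embed a concrete example object in the prompt so the format is unambiguous."""
    return (
        f"{task}\n\n"
        "Return exactly this JSON structure (no other text):\n"
        f"{json.dumps(example, indent=2)}"
    )

example = {"status": "success|error|pending", "message": "human readable description"}
prompt = build_json_prompt("Summarize the deployment log.", example)
print(prompt)
```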

2. For Claude: Request JSON inside the assistant turn#

python
messages = [
    {"role": "user", "content": "Classify this email: " + email_text},
    {"role": "assistant", "content": "{"}  # Prime the response
]
# Claude continues the JSON you started. The reply typically omits the
# primed "{", so prepend it again before calling json.loads().
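
When the assistant turn is primed this way, the primed text usually has to be re-attached before parsing. The sketch below handles both cases, whether or not the reply already includes the prefix. `parse_prefilled` is a hypothetical helper and the sample continuation is made up.

```python
import json

PREFILL = "{"  # the text placed in the assistant turn

def parse_prefilled(continuation: str, prefill: str = PREFILL) -> dict:
    """Re-attach the primed prefix before parsing, unless the reply already includes it."""
    text = continuation.strip()
    if not text.startswith(prefill):
        text = prefill + text
    return json.loads(text)

print(parse_prefilled('"category": "billing", "urgent": true}'))
```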

3. Keep schemas simple for less capable models#

Complex nested schemas work well with GPT-5 series and Claude Opus. For faster/cheaper models, flatten the schema:

python
# For faster models (Haiku, Flash Lite)
simple_schema = {
    "sentiment": "positive|negative|neutral",
    "confidence": 0.95,
    "reason": "brief explanation"
}

# For powerful models (Opus, GPT-5.2)
complex_schema = {
    "sentiment": {
        "overall": "positive|negative|neutral",
        "aspects": [{"name": "...", "sentiment": "...", "keywords": []}],
        "confidence": 0.95
    },
    "entities": [{"name": "...", "type": "PERSON|ORG|PRODUCT"}],
    "topics": [],
    "actionItems": []
}
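
If your application already uses a nested shape, one way to derive the flat version for smaller models is to collapse nested objects into prefixed keys. This is a sketch; the `flatten` helper and the underscore naming are assumptions, not a convention any provider requires.

```python
def flatten(obj: dict, prefix: str = "") -> dict:
    """Collapse nested objects into a single level with underscore-joined keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value  # lists and scalars are kept as-is
    return flat

nested = {"sentiment": {"overall": "positive", "confidence": 0.95}, "topics": []}
print(flatten(nested))
```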

Frequently Asked Questions#

Q: Which provider is most reliable for JSON output? A: OpenAI's Structured Outputs (GPT-5 series) offer the highest reliability with schema enforcement. Claude Opus and Sonnet are highly reliable with prompt engineering. All are accessible via Crazyrouter.

Q: Does Claude support structured outputs natively? A: As of April 2026, Claude does not support json_schema response format natively. However, Claude is highly reliable with well-crafted prompts and the "prime the response" technique.

Q: What's the difference between JSON mode and structured outputs? A: JSON mode steers the model toward syntactically valid JSON but doesn't enforce a schema. Structured outputs constrain generation so the response conforms to your exact schema, though a refusal or a response cut off by the token limit still needs handling.

Q: Can I use structured outputs with streaming? A: Yes, with OpenAI's API. You stream chunks and assemble the JSON at the end. Partial JSON parsing is also possible for progressive UI updates.

Q: What models support structured outputs via Crazyrouter? A: All OpenAI GPT-5 series models support full structured outputs. Gemini 2.5+ supports responseSchema. Claude uses prompt-based JSON. All available at crazyrouter.com.
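
For the streaming question above, the simplest approach is to buffer the text fragments and parse once the stream ends. The sketch below uses a plain list of strings so it runs offline. With the OpenAI SDK the fragments would come from `chunk.choices[0].delta.content`, and `assemble_stream` is a hypothetical helper name.

```python
import json

def assemble_stream(deltas) -> dict:
    """Concatenate streamed text fragments and parse once the stream ends."""
    buffer = []
    for delta in deltas:
        if delta:  # some chunks carry no text (for example, role or finish markers)
            buffer.append(delta)
    return json.loads("".join(buffer))

# Fragments as they might arrive; None stands in for a chunk with no text
chunks = ['{"product_na', 'me": "iPhone', None, ' 16 Pro", "in_stock": tr', 'ue}']
print(assemble_stream(chunks))
```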

Summary#

In 2026, structured output reliability has dramatically improved:

  • Best reliability: OpenAI Structured Outputs (GPT-5 series) → guaranteed schema compliance
  • Good balance: Gemini 2.5 Flash with responseSchema
  • Reliable with prompting: Claude Sonnet/Opus with clear examples
  • For production: Use validator-retry patterns as a safety net

Access all these models through a single API at Crazyrouter — no need to manage multiple API keys for different providers.

Start building with structured AI outputs at Crazyrouter
