"AI Structured Output Guide 2026: JSON Mode Across OpenAI, Claude, and Gemini"

Getting consistent, parseable structured output is one of the most common developer pain points with LLMs. JSON mode, structured outputs, and schema enforcement have evolved significantly; this guide covers how each works across OpenAI, Claude, and Gemini in 2026.

Why Structured Output Matters#

Without reliable JSON output, every LLM integration needs brittle regex parsing, retry logic, and constant prompt tweaking. With proper structured output:

Parse responses directly without text cleaning
Integrate LLM outputs into databases and APIs reliably
Build deterministic workflows on top of non-deterministic models
Reduce malformed data (schema enforcement constrains the shape of a response, not whether its values are true)

The Three Approaches to Structured Output#

Approach	Reliability	Flexibility	Support
1. JSON Mode (hint only)	⭐⭐⭐	High	OpenAI, Gemini, most models
2. Structured Outputs (schema-enforced)	⭐⭐⭐⭐⭐	Medium	OpenAI (GPT-4o and later), Gemini 2.5+
3. Prompt Engineering (no enforcement)	⭐⭐	Highest	All models

Approach 1: JSON Mode#

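JSON mode constrains the model to emit syntactically valid JSON but does not enforce a specific schema. It is widely supported and works well when the prompt spells out the expected fields. With OpenAI, the messages must also mention JSON explicitly, or the request is rejected.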
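Because JSON mode only covers syntax, a reply can parse cleanly and still lack the fields your code expects. A minimal sketch of a post-parse check using only the standard library; `missing_keys` and the field names are illustrative assumptions, not part of any provider SDK:

```python
import json

def missing_keys(raw: str, required: set[str]) -> set[str]:
    """Parse a JSON-mode reply and report which required keys are absent."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        # A top-level array or scalar cannot carry the keys at all
        return set(required)
    return required - set(data)

# Valid JSON, but the model wrote "full_name" instead of "name"
raw = '{"full_name": "John Smith", "email": "john@acme.com"}'
print(missing_keys(raw, {"name", "email"}))  # {'name'}
```

An empty result means the reply at least has the right keys; type checks would be a separate step.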

OpenAI JSON Mode#

python

import json

from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

response = client.chat.completions.create(
    model="gpt-5-mini",
    response_format={"type": "json_object"},  # Enable JSON mode
    messages=[
        {
            "role": "system",
            "content": "You are a data extractor. Always respond with valid JSON."
        },
        {
            "role": "user",
            "content": """Extract the following from this text and return as JSON:

Text: "John Smith, senior engineer at Acme Corp, can be reached at john@acme.com or +1-555-0123."

Return: {"name": "...", "title": "...", "company": "...", "email": "...", "phone": "..."}"""
        }
    ]
)

# The message content is a JSON string; parse it into a dict
data = json.loads(response.choices[0].message.content)
print(data)
# Example output:
# {"name": "John Smith", "title": "senior engineer", "company": "Acme Corp", ...}

Gemini JSON Mode#

python

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "user",
            "content": "List 3 Python web frameworks as JSON: [{name, stars, use_case}]"
        }
    ]
)

data = json.loads(response.choices[0].message.content)

Claude JSON Mode (via prompt)#

Claude doesn't have a native json_object response format in the API, but responds reliably with prompt engineering:

python

response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[
        {
            "role": "user",
            "content": """Analyze the sentiment of this review and return ONLY valid JSON:

Review: "The product arrived quickly but the build quality is disappointing."

Required JSON format:
{
  "sentiment": "positive|negative|mixed",
  "score": 0.0-1.0,
  "aspects": [{"aspect": "...", "sentiment": "..."}],
  "summary": "..."
}"""
        }
    ]
)

# Claude usually follows JSON-only instructions, but it may still wrap
# the JSON in a markdown code block, so strip that wrapper if present
content = response.choices[0].message.content
if content.startswith("```"):
    content = content.split("```")[1]
    if content.startswith("json"):
        content = content[4:]

data = json.loads(content.strip())
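The stripping logic above assumes the reply starts with the fence. A slightly more tolerant sketch, using only the standard library, also handles a sentence of prose before or after the JSON; `extract_json` is a hypothetical helper name, and the approach (a regex for the fence plus `json.JSONDecoder.raw_decode`) is one option among several:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for readability
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)

def extract_json(reply: str):
    """Return the first JSON object or array found in a model reply."""
    match = FENCE_RE.search(reply)
    candidate = match.group(1) if match else reply
    # Skip any leading prose by starting at the first "{" or "["
    starts = [i for i in (candidate.find("{"), candidate.find("[")) if i != -1]
    if not starts:
        raise ValueError("no JSON object or array found in reply")
    # raw_decode parses one JSON value and ignores whatever follows it
    value, _end = json.JSONDecoder().raw_decode(candidate[min(starts):])
    return value

reply = "Here is the analysis:\n" + FENCE + 'json\n{"sentiment": "mixed", "score": 0.4}\n' + FENCE
print(extract_json(reply))  # {'sentiment': 'mixed', 'score': 0.4}
```

Malformed JSON still raises `json.JSONDecodeError`, so this pairs naturally with a retry loop.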

Approach 2: Structured Outputs (Schema-Enforced)#

OpenAI's Structured Outputs constrain generation to match a JSON Schema that you supply. The feature was introduced with GPT-4o and is available across the GPT-5 series. It is the most reliable of the three approaches.

OpenAI Structured Outputs with Pydantic#

python

from openai import OpenAI
from pydantic import BaseModel
from typing import List, Optional

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

# Define your expected schema
class JobCandidate(BaseModel):
    name: str
    years_experience: int
    skills: List[str]
    education: str
    salary_expectation: Optional[int] = None
    available: bool

class ResumeAnalysis(BaseModel):
    candidates: List[JobCandidate]
    top_pick: str
    reasoning: str

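# Placeholder input for illustration; replace with real resume text
resume_text = "Jane Doe, 8 years of backend experience, Python and Go, BSc in Computer Science."

# Parse resumes into the ResumeAnalysis schema
response = client.beta.chat.completions.parse(
    model="gpt-5-2",  # Any model that supports Structured Outputs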
    messages=[
        {
            "role": "system",
            "content": "Extract candidate information from resumes."
        },
        {
            "role": "user",
            "content": f"Analyze these resumes and rank the candidates:\n{resume_text}"
        }
    ],
    response_format=ResumeAnalysis,
)

# Fully typed, validated output
analysis = response.choices[0].message.parsed
print(f"Top pick: {analysis.top_pick}")
for candidate in analysis.candidates:
    print(f"- {candidate.name}: {candidate.years_experience} years, {candidate.skills}")

Structured Outputs with Raw JSON Schema#

python

response = client.chat.completions.create(
    model="gpt-5-2",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "product_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "product_name": {"type": "string"},
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "food", "other"]
                    },
                    "price": {"type": "number"},
                    "features": {
                        "type": "array",
                        "items": {"type": "string"}
                    },
                    "in_stock": {"type": "boolean"}
                },
                "required": ["product_name", "category", "price", "features", "in_stock"],
                "additionalProperties": False
            }
        }
    },
    messages=[
        {"role": "user", "content": "Analyze this product: iPhone 16 Pro, $999, available now, 48MP camera, titanium design"}
    ]
)
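The schema above lists every property under `required` and sets `additionalProperties` to `False`, which is the shape strict mode expects. A minimal sketch of a helper that applies the same two rules to nested objects; `make_strict` is a hypothetical name, and it assumes optional fields are expressed as nullable types rather than by omission:

```python
import copy

def make_strict(schema: dict) -> dict:
    """Return a copy in which every object requires all of its properties
    and forbids keys that the schema does not name."""
    result = copy.deepcopy(schema)  # leave the caller's schema untouched

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object" and "properties" in node:
                node["required"] = list(node["properties"])
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(result)
    return result

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}
strict = make_strict(schema)
print(strict["required"])  # ['name', 'address']
```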

Provider Comparison for Reliability#

Let's be practical about which providers are most reliable for structured output:

Provider	JSON Mode	Schema Enforcement	Reliability	Notes
OpenAI GPT-5.2	✅	✅ (Structured Outputs)	⭐⭐⭐⭐⭐	Best-in-class
OpenAI GPT-5 Mini	✅	✅ (Structured Outputs)	⭐⭐⭐⭐⭐	Fast + reliable
Gemini 2.5 Flash	✅	✅ (responseSchema)	⭐⭐⭐⭐	Good for Google formats
Claude Sonnet 4.5	Prompt-only	❌ native	⭐⭐⭐⭐	Reliable with prompting
Claude Opus 4.6	Prompt-only	❌ native	⭐⭐⭐⭐	Best with complex schemas
DeepSeek V3.2	✅	Limited	⭐⭐⭐	Good for simple schemas
Grok 4.1 Fast	✅	Limited	⭐⭐⭐	Improving with updates

Gemini Structured Output with Response Schema#

python

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "news_article",
            "schema": {
                "type": "object",
                "properties": {
                    "headline": {"type": "string"},
                    "topics": {"type": "array", "items": {"type": "string"}},
                    "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
                    "key_entities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "type": {"type": "string"}
                            }
                        }
                    }
                }
            }
        }
    },
    messages=[
        {"role": "user", "content": f"Analyze this news article:\n{article_text}"}
    ]
)
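The schema in this example has no `required` list and no `strict` flag, so depending on the provider a field may be omitted or an enum value may drift. A rough sketch of a post-hoc check for a small JSON Schema subset (`type`, `enum`, `properties`, `required`, `items`); `validate` is a hypothetical helper, and the `jsonschema` package is a fuller alternative:

```python
def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value passed."""
    errors = []
    types = {"object": dict, "array": list, "string": str,
             "number": (int, float), "boolean": bool}
    expected = schema.get("type")
    if expected in types:
        ok = isinstance(value, types[expected])
        # bool is a subclass of int in Python, so exclude it from "number"
        if expected == "number" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} not in enum")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors

article_schema = {
    "type": "object",
    "required": ["headline", "sentiment"],
    "properties": {
        "headline": {"type": "string"},
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "topics": {"type": "array", "items": {"type": "string"}},
    },
}
reply = {"headline": "Rates hold", "sentiment": "angry", "topics": ["economy", 3]}
print(validate(reply, article_schema))
```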

Production Patterns for Reliable JSON Output#

Pattern 1: The Validator-Retry Loop#

python

import json
from pydantic import BaseModel, ValidationError
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

class ExtractedData(BaseModel):
    title: str
    author: str
    date: str
    summary: str

def extract_with_retry(text: str, max_retries: int = 3) -> ExtractedData:
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="claude-sonnet-4-5",
                messages=[
                    {
                        "role": "user",
                        "content": f"""Extract information and return ONLY valid JSON matching exactly:
{{
  "title": "article title",
  "author": "author name",  
  "date": "YYYY-MM-DD format",
  "summary": "one sentence summary"
}}

Text: {text}

Return only the JSON object, no other text."""
                    }
                ]
            )
            
            content = response.choices[0].message.content.strip()
            # Clean up potential markdown wrapping
            if "```" in content:
                content = content.split("```")[1]
                if content.startswith("json"):
                    content = content[4:]
            
            data = json.loads(content.strip())
            return ExtractedData(**data)
            
        except (json.JSONDecodeError, ValidationError) as e:
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed: {e}. Retrying...")
    
    raise ValueError("Failed to extract valid JSON after retries")
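The loop above resends the same prompt on every attempt. One variation is to feed the validation error back so the model can correct itself. The sketch below takes the model call as a plain function so it runs without network access; `extract_with_feedback` and `call_model` are hypothetical names, and the key check stands in for Pydantic validation:

```python
import json
from typing import Callable

def extract_with_feedback(prompt: str, call_model: Callable[[str], str],
                          required: set[str], max_retries: int = 3) -> dict:
    """Retry, appending the previous error to the prompt on each failure."""
    current = prompt
    for _ in range(max_retries):
        raw = call_model(current)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level JSON value is not an object")
            missing = required - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            current = (f"{prompt}\n\nYour previous reply was rejected: {exc}\n"
                       "Return only a corrected JSON object.")
    raise RuntimeError(f"no valid JSON after {max_retries} attempts")
```

In practice `call_model` would wrap `client.chat.completions.create` and return the message content.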

Pattern 2: Multi-Provider Fallback for Critical Data#

python

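import json

from openai import AsyncOpenAI

# The calls below are awaited, so they need the async client
client = AsyncOpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://crazyrouter.com/v1"
)

async def extract_structured_data(text: str, schema: dict) -> dict: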
    """Try multiple providers for critical structured extraction."""
    
    providers = [
        ("gpt-5-2", "openai_structured"),   # Best reliability
        ("gemini-2.5-flash", "gemini"),       # Good fallback
        ("claude-sonnet-4-5", "prompt"),      # Reliable with prompting
    ]
    
    for model, method in providers:
        try:
            if method == "openai_structured":
                response = await client.chat.completions.create(
                    model=model,
                    response_format={
                        "type": "json_schema",
                        "json_schema": {"name": "extraction", "schema": schema, "strict": True}
                    },
                    messages=[{"role": "user", "content": f"Extract data from:\n{text}"}]
                )
            else:
                response = await client.chat.completions.create(
                    model=model,
                    messages=[{
                        "role": "user",
                        "content": f"Extract data and return JSON matching schema {json.dumps(schema)}:\n{text}"
                    }]
                )
            
            content = response.choices[0].message.content
            return json.loads(content)
            
        except Exception as e:
            print(f"Provider {model} failed: {e}")
            continue
    
    raise RuntimeError("All providers failed for structured extraction")
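The fallback above accepts the first reply that parses, even if its shape is wrong. A sketch of the same idea with a validation step before a result is accepted, written against stub coroutines so it runs offline; `first_valid` and the stub providers are hypothetical:

```python
import asyncio
import json

async def first_valid(text, providers, is_valid):
    """Try each (name, async_fn) in order; return the first validated result."""
    failures = []
    for name, fn in providers:
        try:
            data = json.loads(await fn(text))
        except Exception as exc:  # network errors, malformed JSON, and so on
            failures.append(f"{name}: {exc}")
            continue
        if is_valid(data):
            return name, data
        failures.append(f"{name}: failed validation")
    raise RuntimeError("all providers failed: " + "; ".join(failures))

# Stub providers standing in for real API calls
async def malformed(text):
    return "oops"

async def wrong_shape(text):
    return '{"x": 1}'

async def good(text):
    return '{"title": "Quarterly report"}'

providers = [("a", malformed), ("b", wrong_shape), ("c", good)]
name, data = asyncio.run(first_valid("some text", providers, lambda d: "title" in d))
print(name, data)  # c {'title': 'Quarterly report'}
```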

Pattern 3: Node.js with Zod Validation#

javascript

import OpenAI from 'openai';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const client = new OpenAI({
  apiKey: process.env.CRAZYROUTER_API_KEY,
  baseURL: 'https://crazyrouter.com/v1',
});

// Define schema with Zod
const ProductSchema = z.object({
  name: z.string(),
  price: z.number().positive(),
  category: z.enum(['electronics', 'clothing', 'food', 'other']),
  inStock: z.boolean(),
  features: z.array(z.string()).min(1),
});

async function extractProduct(description) {
  const jsonSchema = zodToJsonSchema(ProductSchema, 'product');
  
  const response = await client.chat.completions.create({
    model: 'gpt-5-mini',
    response_format: {
      type: 'json_schema',
      json_schema: {
        name: 'product',
        strict: true,
        schema: jsonSchema.definitions.product,
      },
    },
    messages: [
      { role: 'user', content: `Extract product data from: "${description}"` }
    ],
  });
  
  const rawData = JSON.parse(response.choices[0].message.content);
  
  // Validate with Zod
  return ProductSchema.parse(rawData);
}

// Usage
const product = await extractProduct(
  'The Sony WH-1000XM6 headphones cost $299, feature noise cancellation and 30hr battery, in stock.'
);
console.log(product);

Best Practices for Structured Output Prompts#

1. Always show an example in the prompt#

python

# Bad: Vague instruction
"Return the data as JSON"

# Good: Show exact expected format
"""Return exactly this JSON structure (no other text):
{
  "status": "success|error|pending",
  "message": "human readable description",
  "data": {"key": "value"}
}"""
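Hand-written examples in prompts can pick up small syntax slips such as a trailing comma. One way to avoid that is to render the example from a Python dict, so the JSON shown to the model always parses. A minimal sketch; `build_prompt` is a hypothetical helper:

```python
import json

def build_prompt(task: str, example: dict) -> str:
    """Embed a JSON example that is valid by construction (json.dumps wrote it)."""
    rendered = json.dumps(example, indent=2)
    return (f"{task}\n\n"
            "Return exactly this JSON structure (no other text):\n"
            f"{rendered}")

example = {
    "status": "success|error|pending",
    "message": "human readable description",
    "data": {"key": "value"},
}
prompt = build_prompt("Classify the support ticket below.", example)
print(prompt)
```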

2. For Claude: Request JSON inside the assistant turn#

python

messages = [
    {"role": "user", "content": "Classify this email: " + email_text},
    {"role": "assistant", "content": "{"}  # Prime the response
]
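# Claude continues from the "{" you supplied, so the reply omits it:
# prepend "{" to the returned text before parsing.
# Support for this prefill depends on the model and the endpoint.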
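When the assistant turn is primed this way, the reply contains only the continuation, so the primed text has to be re-attached before parsing. A small sketch of that step; `parse_prefilled` is a hypothetical helper, and whether an OpenAI-compatible endpoint passes the prefill through to the model is an assumption to verify:

```python
import json

def parse_prefilled(prefill: str, continuation: str) -> dict:
    """Re-attach the prefilled text, then parse the combined JSON."""
    return json.loads(prefill + continuation)

# The reply picks up right after the "{" sent in the assistant turn
continuation = '"category": "billing", "urgent": false}'
print(parse_prefilled("{", continuation))  # {'category': 'billing', 'urgent': False}
```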

3. Keep schemas simple for less capable models#

Complex nested schemas work well with GPT-5 series and Claude Opus. For faster/cheaper models, flatten the schema:

python

# For faster models (Haiku, Flash Lite)
simple_schema = {
    "sentiment": "positive|negative|neutral",
    "confidence": 0.95,
    "reason": "brief explanation"
}

# For powerful models (Opus, GPT-5.2)
complex_schema = {
    "sentiment": {
        "overall": "positive|negative|neutral",
        "aspects": [{"name": "...", "sentiment": "...", "keywords": []}],
        "confidence": 0.95
    },
    "entities": [{"name": "...", "type": "PERSON|ORG|PRODUCT"}],
    "topics": [],
    "actionItems": []
}
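If the nested shape already exists, a flat variant for smaller models can be derived mechanically instead of maintained by hand. A minimal sketch, assuming underscore-joined key names are acceptable downstream; `flatten` is a hypothetical helper:

```python
def flatten(nested: dict, parent: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in nested.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value  # lists and scalars are kept as they are
    return flat

shape = {"sentiment": {"overall": "positive", "confidence": 0.95}, "topics": []}
print(flatten(shape))
# {'sentiment_overall': 'positive', 'sentiment_confidence': 0.95, 'topics': []}
```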

Frequently Asked Questions#

Q: Which provider is most reliable for JSON output? A: OpenAI's Structured Outputs (GPT-5 series) offer the highest reliability with schema enforcement. Claude Opus and Sonnet are highly reliable with prompt engineering. All are accessible via Crazyrouter.

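Q: Does Claude support structured outputs natively? A: The examples in this guide treat Claude as prompt-only, because they call it through an OpenAI-compatible endpoint without a json_schema response format. Anthropic's own Messages API can also return schema-shaped JSON through tool use with a forced tool choice, and native support may differ by model and API version, so check the current Anthropic documentation. With well-crafted prompts and the "prime the response" technique, Claude is highly reliable.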

Q: What's the difference between JSON mode and structured outputs? A: JSON mode produces syntactically valid JSON but doesn't enforce a schema. Structured outputs constrain generation to your exact schema, so a completed response conforms to it; refusals and responses cut off by the token limit are the remaining cases to handle.

Q: Can I use structured outputs with streaming? A: Yes, with OpenAI's API. You stream chunks and assemble the JSON at the end. Partial JSON parsing is also possible for progressive UI updates.

Q: What models support structured outputs via Crazyrouter? A: All OpenAI GPT-5 series models support full structured outputs. Gemini 2.5+ supports responseSchema. Claude uses prompt-based JSON. All available at crazyrouter.com.
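For the streaming question above, partial JSON parsing can be approximated by closing whatever is still open in the buffer and trying to parse. This is a rough standard-library sketch, not how any SDK implements it; `try_parse_partial` is a hypothetical helper, and it returns None for prefixes that end mid-key or right after a colon or comma:

```python
import json

def try_parse_partial(buffer: str):
    """Close any open string, array, or object, then attempt to parse."""
    stack, in_string, escaped = [], False, False
    for ch in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    closing = ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(buffer + closing)
    except json.JSONDecodeError:
        return None  # keep showing the last good snapshot instead

buffer = ""
for chunk in ['{"name": "Ali', 'ce", "tags": ["a"', ', "b"]}']:
    buffer += chunk
    print(try_parse_partial(buffer))
```

Each printed snapshot is a usable dict, which is enough for progressive UI updates; the final parse should still use the complete buffer.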

Summary#

In 2026, structured output reliability has dramatically improved:

Best reliability: OpenAI Structured Outputs (GPT-5 series) → guaranteed schema compliance
Good balance: Gemini 2.5 Flash with responseSchema
Reliable with prompting: Claude Sonnet/Opus with clear examples
For production: Use validator-retry patterns as a safety net

Access all these models through a single API at Crazyrouter — no need to manage multiple API keys for different providers.
