
GPT-5 API Complete Guide: Features, Pricing, and Code Examples

Everything developers need to know about the GPT-5 API. Covers new features, pricing comparison, migration from GPT-4, and practical code examples in Python and Node.js.

Crazyrouter Team

February 23, 2026

GPT-5 is OpenAI's biggest leap since GPT-4: longer context, better reasoning, native tool use, and multimodal capabilities that hold up in production. If you're building with the OpenAI API, here's what you need to know.

What's New in GPT-5#

Key Improvements Over GPT-4#

Feature	GPT-4o	GPT-5	GPT-5.2
Context Window	128K	256K	256K
Output Tokens	16K	32K	64K
Reasoning	Good	Excellent	Best-in-class
Tool Use	Basic	Native	Advanced
Vision	✅	✅ Enhanced	✅ Enhanced
Audio	✅	✅ Native	✅ Native
Code Generation	Good	Very Good	Excellent
Instruction Following	Good	Excellent	Excellent
Latency (TTFT)	~300ms	~400ms	~350ms

What Makes GPT-5 Different#

256K Context Window — Process entire codebases, long documents, or extended conversations without truncation
Native Tool Use — Function calling is deeply integrated, not bolted on. Fewer hallucinated tool calls, better parameter extraction
Improved Reasoning — Chain-of-thought is built into the model, not just prompted. Complex multi-step problems see 30-40% accuracy improvement
Better Code — Significant improvements in code generation, debugging, and understanding large codebases
Multimodal Native — Vision and audio aren't separate models anymore; they're part of the core architecture
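To make the 256K figure concrete, here is a rough sketch of a pre-flight check that asks whether a prompt leaves room for the reply. The window and output sizes are the ones quoted in the table above; the four-characters-per-token ratio is only a rule of thumb for English text (a real tokenizer would be more accurate), and the function names are hypothetical.

```python
CONTEXT_WINDOW = 256_000   # context size quoted in the table above
MAX_OUTPUT = 32_000        # GPT-5 output limit quoted in the table above
CHARS_PER_TOKEN = 4        # rough rule of thumb for English text (assumption)


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character length, rounding up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def fits_in_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    return estimate_tokens(prompt) + reserved_output <= CONTEXT_WINDOW
```

A check like this is cheap enough to run before every request; anything close to the limit is worth measuring with an exact tokenizer instead of the estimate.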

Getting Started with GPT-5 API#

Python Setup#

python

from openai import OpenAI

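# Option A: direct via OpenAI
client = OpenAI(api_key="sk-your-openai-key")

# Option B: via Crazyrouter (recommended for cost savings).
# Keep only one of the two; as written, this client replaces the one above.
# In real code, load the key from an environment variable instead of hardcoding it.
client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://api.crazyrouter.com/v1"
)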

Basic Chat Completion#

python

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {
            "role": "system",
            "content": "You are a senior software engineer. Be concise and practical."
        },
        {
            "role": "user",
            "content": "Explain the difference between async/await and Promises in JavaScript. When should I use each?"
        }
    ],
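    # GPT-5 is a reasoning model: on the OpenAI endpoint it accepts only the
    # default temperature and takes max_completion_tokens instead of max_tokens.
    max_completion_tokens=1024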
)

print(response.choices[0].message.content)

Streaming Response#

python

stream = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Write a Python async web scraper with rate limiting"}
    ],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
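When the full text is needed after streaming (for logging or caching, say), the deltas can be collected as they arrive. The sketch below is one way to do that; `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks only mimic the shape of the SDK's streamed objects so the example runs without an API key.

```python
from types import SimpleNamespace


def collect_stream(stream) -> str:
    """Concatenate the text deltas from a chat-completions stream."""
    parts = []
    for chunk in stream:
        if not chunk.choices:          # some chunks carry no choices (e.g. usage-only)
            continue
        delta = chunk.choices[0].delta.content
        if delta:                      # role-only and final chunks have no content
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo = [fake_chunk("Hello"), SimpleNamespace(choices=[]), fake_chunk(None), fake_chunk(", world")]
print(collect_stream(demo))  # Hello, world
```

With a real client, pass the `stream` object from the request above in place of `demo`.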

Node.js Setup#

javascript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-crazyrouter-key',
  baseURL: 'https://api.crazyrouter.com/v1'
});

async function chat(prompt) {
  const response = await client.chat.completions.create({
    model: 'gpt-5',
    // No temperature here: GPT-5 accepts only the default on the OpenAI endpoint
    messages: [{ role: 'user', content: prompt }]
  });
  return response.choices[0].message.content;
}

const answer = await chat('Design a database schema for a multi-tenant SaaS app');
console.log(answer);

Function Calling (Tool Use)#

GPT-5's function calling is significantly more reliable:

python

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Search for products in the catalog",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "books", "home"],
                        "description": "Product category filter"
                    },
                    "max_price": {"type": "number", "description": "Maximum price in USD"},
                    "in_stock": {"type": "boolean", "description": "Only show in-stock items"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Check the status of an order",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The order ID"}
                },
                "required": ["order_id"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Find me wireless headphones under $100 that are in stock"}
    ],
    tools=tools,
    tool_choice="auto"
)

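message = response.choices[0].message
if message.tool_calls:  # None when the model answers in plain text instead
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)  # arguments is a JSON string
    print(f"Function: {tool_call.function.name}")
    print(f"Args: {args}")
# Example output:
# Function: search_products
# Args: {'query': 'wireless headphones', 'category': 'electronics', 'max_price': 100, 'in_stock': True}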
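The request above only yields the model's choice of tool; the application still has to run it and send the result back. A minimal sketch of that dispatch step follows. The two handler functions are hypothetical stubs standing in for real catalog and order lookups, and `run_tool_calls` is an illustrative name, not part of the SDK.

```python
import json
from types import SimpleNamespace


def search_products(query, category=None, max_price=None, in_stock=False):
    """Hypothetical stub; a real version would query the catalog."""
    return [{"name": f"{query} (demo)", "price": 79.0, "in_stock": True}]


def get_order_status(order_id):
    """Hypothetical stub; a real version would look up the order."""
    return {"order_id": order_id, "status": "shipped"}


HANDLERS = {"search_products": search_products, "get_order_status": get_order_status}


def run_tool_calls(tool_calls):
    """Execute each tool call and return the 'tool' messages to send back."""
    messages = []
    for call in tool_calls or []:      # tool_calls is None when the model answers in text
        handler = HANDLERS.get(call.function.name)
        if handler is None:
            result = {"error": f"unknown tool: {call.function.name}"}
        else:
            args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
            result = handler(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        })
    return messages


# Fake tool call with the same shape as the SDK object, so this runs offline
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_order_status", arguments='{"order_id": "A123"}'),
)
print(run_tool_calls([fake_call])[0]["content"])
```

In a real conversation, append the assistant message that contained the tool calls to `messages`, then the results from `run_tool_calls`, and call the API again so the model can write the final answer.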

Vision (Image Analysis)#

python

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this architecture diagram? List all services and their connections."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/architecture.png"}
                }
            ]
        }
    ],
    max_completion_tokens=2048  # GPT-5 models take this in place of max_tokens
)

print(response.choices[0].message.content)
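The example above points at a public URL. For images on local disk, the `image_url` field also accepts a base64 data URL. The helper below is a sketch of that encoding step; the function names are hypothetical, and the file path you pass is your own.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(path: str) -> dict:
    """Build the content part that takes the place of the remote URL entry."""
    return {"type": "image_url", "image_url": {"url": image_to_data_url(path)}}
```

Drop the result of `image_part("architecture.png")` into the `content` list in place of the remote `image_url` entry. Large images inflate the request body, so resizing before encoding is usually worthwhile.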

GPT-5 vs GPT-4o vs Claude Opus 4.5#

Performance Benchmarks#

Benchmark	GPT-5	GPT-5.2	GPT-4o	Claude Opus 4.5	Gemini 3 Pro
MMLU	90.2%	92.1%	87.5%	91.8%	89.5%
HumanEval	93.5%	95.2%	90.2%	94.1%	88.7%
MATH	78.3%	82.1%	72.6%	80.5%	76.2%
GPQA	65.8%	69.2%	58.3%	67.1%	62.4%

Practical Comparison#

Use Case	Best Model	Why
General chat	GPT-5	Best balance of quality and speed
Complex reasoning	GPT-5.2 or Claude Opus	Highest accuracy on hard problems
Code generation	GPT-5.2	Best HumanEval scores
Long documents	Claude Opus 4.5	200K context with better recall
Cost-sensitive	GPT-5-mini	About 90% of the quality at 10% of the cost
Real-time apps	Gemini 2.5 Flash	Lowest latency

Pricing#

Official OpenAI Pricing#

Model	Input (USD per 1M tokens)	Output (USD per 1M tokens)	Context
GPT-5	$5.00	$15.00	256K
GPT-5.2	$10.00	$30.00	256K
GPT-5-mini	$0.50	$1.50	128K
GPT-4o	$2.50	$10.00	128K

Crazyrouter Pricing (Save 30%)#

Model	Input (USD per 1M tokens)	Output (USD per 1M tokens)	Savings vs Official
GPT-5	$3.50	$10.50	30%
GPT-5.2	$7.00	$21.00	30%
GPT-5-mini	$0.35	$1.05	30%
GPT-4o	$1.75	$7.00	30%

Monthly Cost Estimates#

Usage Level	GPT-5 (Official)	GPT-5 (Crazyrouter)	Savings
Light (1M input + 1M output tokens/day)	~$600/mo	~$420/mo	$180/mo
Medium (10M input + 10M output tokens/day)	~$6,000/mo	~$4,200/mo	$1,800/mo
Heavy (100M input + 100M output tokens/day)	~$60,000/mo	~$42,000/mo	$18,000/mo

At scale, switching to Crazyrouter saves thousands per month with zero code changes — just swap the base URL.
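The estimates in the table are consistent with 1M input plus 1M output tokens per day over a 30-day month. The sketch below reproduces that arithmetic from the per-token prices listed above so you can plug in your own traffic mix; the function and dictionary names are hypothetical, and real workloads are rarely split evenly between input and output.

```python
# USD per 1M tokens as (input, output), copied from the pricing tables above
OFFICIAL = {
    "gpt-5": (5.00, 15.00),
    "gpt-5.2": (10.00, 30.00),
    "gpt-5-mini": (0.50, 1.50),
    "gpt-4o": (2.50, 10.00),
}
CRAZYROUTER = {
    "gpt-5": (3.50, 10.50),
    "gpt-5.2": (7.00, 21.00),
    "gpt-5-mini": (0.35, 1.05),
    "gpt-4o": (1.75, 7.00),
}


def monthly_cost(prices, model, input_tokens_per_day, output_tokens_per_day, days=30):
    """Estimated monthly spend in USD for a steady daily token volume."""
    in_price, out_price = prices[model]
    daily = (input_tokens_per_day * in_price + output_tokens_per_day * out_price) / 1_000_000
    return daily * days


official = monthly_cost(OFFICIAL, "gpt-5", 1_000_000, 1_000_000)
routed = monthly_cost(CRAZYROUTER, "gpt-5", 1_000_000, 1_000_000)
print(f"${official:,.0f} vs ${routed:,.0f}")  # $600 vs $420
```

An input-heavy workload (long prompts, short answers) comes out well below these figures, since input tokens cost a third of output tokens.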

Migration from GPT-4#

What Changes#

python

# Before (GPT-4o)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    max_tokens=4096
)

# After (GPT-5): change the model name and rename the token limit
response = client.chat.completions.create(
    model="gpt-5",
    messages=[...],
    max_completion_tokens=4096  # Replaces max_tokens; can now go up to 32K
)

The API is largely backward compatible. In most cases you change the model string and little else. But there are a few things to watch:

Output format may differ: GPT-5 tends to be more structured in its responses. If you're parsing output with regex, test thoroughly.
Tool calling is stricter: GPT-5 follows function schemas more precisely. Loose schemas that worked with GPT-4 might need tightening.
System prompts matter more: GPT-5 follows system instructions more faithfully. Vague prompts get vague results.
Request parameters change: on the OpenAI endpoint, GPT-5 takes max_completion_tokens instead of max_tokens and accepts only the default temperature.
Cost increase: per the pricing table, GPT-5 costs 2x as much as GPT-4o for input and 1.5x as much for output. Consider GPT-5-mini for cost-sensitive workloads.
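One way to test the schema and parsing points during migration is to check every tool call's arguments against the schema you declared. The validator below is a deliberately small sketch covering flat schemas only (required fields, basic types, enums); it is not a full JSON Schema implementation, and `validate_args` is a hypothetical name. The schema is the `search_products` definition from the function calling example.

```python
import json

SEARCH_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "books", "home"]},
        "max_price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["query"],
}

TYPE_MAP = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_args(schema: dict, raw_arguments: str) -> list:
    """Return a list of problems found in a tool call's JSON arguments."""
    args = json.loads(raw_arguments)
    props = schema.get("properties", {})
    problems = [f"missing required field: {name}"
                for name in schema.get("required", []) if name not in args]
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
            continue
        expected = TYPE_MAP.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for numeric fields
        is_stray_bool = isinstance(value, bool) and spec.get("type") != "boolean"
        if expected and (not isinstance(value, expected) or is_stray_bool):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems


print(validate_args(SEARCH_SCHEMA, '{"category": "toys", "max_price": "100"}'))
```

Running this over a sample of recorded GPT-4o and GPT-5 tool calls gives a quick before-and-after view of how well each model respects the schema.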

Migration Checklist#

Update model string to gpt-5
Test all function calling schemas
Review system prompts for clarity
Update max_tokens limits if needed
Run regression tests on output parsing
Monitor costs for the first week
Consider GPT-5-mini for non-critical paths
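The last checklist item can be sketched as a small routing function that sends simple, non-critical work to the cheaper model. The task labels and the `pick_model` name are hypothetical; a real router would use whatever categories your application already tracks.

```python
# Task types the article suggests do not need the full model
SIMPLE_TASKS = {"classification", "extraction", "summarization"}


def pick_model(task: str, critical: bool = False) -> str:
    """Choose a model name from the task type and how critical the path is."""
    if critical:
        return "gpt-5"                 # never downgrade critical paths
    return "gpt-5-mini" if task in SIMPLE_TASKS else "gpt-5"


print(pick_model("classification"))                 # gpt-5-mini
print(pick_model("classification", critical=True))  # gpt-5
```

Keeping the decision in one function makes it easy to adjust the split after watching costs and quality for the first week.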

Best Practices#

Use GPT-5-mini for simple tasks: classification, extraction, and summarization don't need the full model
Stream everything: GPT-5's TTFT is slightly higher; streaming masks the latency
Leverage the 256K context, but be strategic: put important info at the beginning and end
Use structured outputs: response_format: { type: "json_object" } guarantees syntactically valid JSON; use type: "json_schema" with a schema when your parser depends on specific fields
Cache aggressively: the model isn't guaranteed to return the same output for the same input, so store responses keyed on the full request instead of paying for repeats
Batch non-urgent requests: OpenAI's Batch API gives a 50% discount
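The caching advice can be sketched as a thin wrapper that hashes everything influencing the response and reuses stored results. This is an in-memory illustration only: `CompletionCache` and `cache_key` are hypothetical names, and `call_model` stands for whatever function actually sends the request.

```python
import hashlib
import json


def cache_key(model: str, messages: list, **params) -> str:
    """Stable hash of everything that influences the response."""
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CompletionCache:
    """In-memory cache; a shared store would be needed across processes."""

    def __init__(self, call_model):
        self._call_model = call_model   # hypothetical callable that hits the API
        self._store = {}
        self.hits = 0

    def complete(self, model, messages, **params):
        key = cache_key(model, messages, **params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._call_model(model=model, messages=messages, **params)
        self._store[key] = result
        return result
```

Sorting keys before hashing means two requests that differ only in parameter order share one cache entry, while any change to the model, messages, or parameters produces a new key.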

FAQ#

Is GPT-5 worth the upgrade from GPT-4o?#

For complex reasoning, code generation, and tool use — yes. For simple chat and classification, GPT-5-mini or even GPT-4o-mini is more cost-effective.

Can I use GPT-5 for free?#

ChatGPT Free tier includes limited GPT-5 access. For API usage, there's no free tier, but Crazyrouter offers pay-as-you-go with no minimum.

What's the difference between GPT-5 and GPT-5.2?#

GPT-5.2 is the latest iteration with improved reasoning and code generation. It costs twice as much as GPT-5 ($10/$30 vs. $5/$15 per 1M input/output tokens). Use it when accuracy on hard problems justifies the cost.

Does GPT-5 support fine-tuning?#

Not yet for GPT-5. Fine-tuning is available for GPT-4o and GPT-4o-mini. OpenAI has indicated GPT-5 fine-tuning is coming.

How does GPT-5 handle rate limits?#

Same tier system as GPT-4. Tier 1 starts at 500 RPM. Through Crazyrouter, rate limits are pooled across providers, giving you effectively higher throughput.
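Whatever the tier, requests that hit the limit need to be retried with increasing delays. The helper below is a generic sketch of exponential backoff with jitter; `with_backoff` and its parameters are hypothetical, and the delays are illustrative rather than tuned values.

```python
import random
import time


def with_backoff(fn, retry_on=(Exception,), max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff and jitter on retry_on errors."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise                  # out of retries: surface the last error
            # base, 2x base, 4x base, ... plus up to 50% random jitter
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.5)
            sleep(delay)
```

With the OpenAI Python SDK, passing `retry_on=(openai.RateLimitError,)` limits retries to rate-limit responses. The SDK client also has its own `max_retries` option, which may be enough for simple cases.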

Summary#

GPT-5 is a meaningful upgrade for developers building AI-powered applications. The improved reasoning, native tool use, and 256K context make it the go-to model for complex tasks. For cost-sensitive workloads, GPT-5-mini delivers most of the capability at a fraction of the price.

Get started with GPT-5 through Crazyrouter — same OpenAI SDK, 30% lower costs, and access to Claude, Gemini, and 300+ other models with the same API key.