GPT-5 API Complete Guide: Features, Pricing, and Code Examples

Crazyrouter Team
February 23, 2026

GPT-5 represents OpenAI's biggest leap since GPT-4: longer context, better reasoning, native tool use, and multimodal capabilities that hold up in production. If you're building with the OpenAI API, here's what you need to know.

What's New in GPT-5#

Key Improvements Over GPT-4#

| Feature | GPT-4o | GPT-5 | GPT-5.2 |
|---|---|---|---|
| Context Window | 128K | 256K | 256K |
| Output Tokens | 16K | 32K | 64K |
| Reasoning | Good | Excellent | Best-in-class |
| Tool Use | Basic | Native | Advanced |
| Vision | | ✅ Enhanced | ✅ Enhanced |
| Audio | | ✅ Native | ✅ Native |
| Code Generation | Good | Very Good | Excellent |
| Instruction Following | Good | Excellent | Excellent |
| Latency (TTFT) | ~300ms | ~400ms | ~350ms |

What Makes GPT-5 Different#

  1. 256K Context Window — Process entire codebases, long documents, or extended conversations without truncation
  2. Native Tool Use — Function calling is deeply integrated, not bolted on. Fewer hallucinated tool calls, better parameter extraction
  3. Improved Reasoning — Chain-of-thought is built into the model, not just prompted. Complex multi-step problems see 30-40% accuracy improvement
  4. Better Code — Significant improvements in code generation, debugging, and understanding large codebases
  5. Multimodal Native — Vision and audio aren't separate models anymore; they're part of the core architecture
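
As a rough illustration of point 1, the sketch below estimates whether a prompt fits in the 256K window before it is sent. The 4-characters-per-token ratio is only a rule of thumb for English text (a real tokenizer will give different counts), and `fits_in_context` and its 32K reserved-output default are illustrative names and values, not part of any SDK.

```python
# Rough pre-flight check against the context window quoted in this guide.
CONTEXT_WINDOW = 256_000   # tokens, from the comparison table above
CHARS_PER_TOKEN = 4        # rule-of-thumb assumption for English text


def fits_in_context(text: str, reserved_output_tokens: int = 32_000) -> bool:
    """Return True if the estimated prompt size leaves room for the reply."""
    estimated_prompt_tokens = len(text) // CHARS_PER_TOKEN + 1
    return estimated_prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW


print(fits_in_context("def main(): pass\n" * 1_000))  # True: about 4K estimated tokens
print(fits_in_context("x" * 2_000_000))                # False: about 500K estimated tokens
```

For anything close to the limit, count tokens with a real tokenizer instead of relying on the estimate.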

Getting Started with GPT-5 API#

Python Setup#

python
from openai import OpenAI

# Option 1: direct via OpenAI
client = OpenAI(api_key="sk-your-openai-key")

# Option 2: via Crazyrouter (recommended for cost savings).
# This reassigns `client`, so keep only the option you actually use.
client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://api.crazyrouter.com/v1"
)
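
Hardcoding keys as in the snippet above is fine for a first test, but a common pattern is to read them from the environment. A minimal sketch, assuming the variable names `OPENAI_API_KEY` and `OPENAI_BASE_URL` (the names the official Python SDK also reads by default); `client_settings` is a hypothetical helper, not an SDK function.

```python
import os


def client_settings(env=os.environ) -> dict:
    """Build keyword arguments for OpenAI(...) from environment variables."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before creating the client")
    settings = {"api_key": api_key}
    base_url = env.get("OPENAI_BASE_URL")  # e.g. the Crazyrouter URL shown above
    if base_url:
        settings["base_url"] = base_url
    return settings


# Usage (requires the openai package):
#   from openai import OpenAI
#   client = OpenAI(**client_settings())
```

Switching between OpenAI and Crazyrouter then becomes a deployment setting rather than a code edit.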

Basic Chat Completion#

python
response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {
            "role": "system",
            "content": "You are a senior software engineer. Be concise and practical."
        },
        {
            "role": "user",
            "content": "Explain the difference between async/await and Promises in JavaScript. When should I use each?"
        }
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)

Streaming Response#

python
stream = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Write a Python async web scraper with rate limiting"}
    ],
    stream=True
)

for chunk in stream:
    if not chunk.choices:  # skip chunks that carry no choices (e.g. usage-only chunks)
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

Node.js Setup#

javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-crazyrouter-key',
  baseURL: 'https://api.crazyrouter.com/v1'
});

async function chat(prompt) {
  const response = await client.chat.completions.create({
    model: 'gpt-5',
    messages: [{ role: 'user', content: prompt }],
    temperature: 0.7
  });
  return response.choices[0].message.content;
}

const answer = await chat('Design a database schema for a multi-tenant SaaS app');
console.log(answer);

Function Calling (Tool Use)#

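GPT-5's function calling is more reliable than GPT-4o's, with fewer hallucinated tool calls and better parameter extraction. The example defines two tools and lets the model pick one: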

python
import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_products",
            "description": "Search for products in the catalog",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "books", "home"],
                        "description": "Product category filter"
                    },
                    "max_price": {"type": "number", "description": "Maximum price in USD"},
                    "in_stock": {"type": "boolean", "description": "Only show in-stock items"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_order_status",
            "description": "Check the status of an order",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The order ID"}
                },
                "required": ["order_id"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "user", "content": "Find me wireless headphones under $100 that are in stock"}
    ],
    tools=tools,
    tool_choice="auto"
)

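message = response.choices[0].message
if message.tool_calls:  # None when the model answers in plain text instead
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    print(f"Function: {tool_call.function.name}")
    print(f"Args: {args}")
    # Example output (actual arguments may vary):
    # Function: search_products
    # Args: {'query': 'wireless headphones', 'category': 'electronics', 'max_price': 100, 'in_stock': True}
else:
    print(message.content)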
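
The snippet above stops at printing the tool call. In a full loop, your code runs the function and returns the result to the model as a `tool` message. A minimal sketch of that step; the two handler functions are hypothetical stand-ins for a real catalog and order system, and `run_tool_call` is an illustrative name.

```python
import json


def search_products(query, category=None, max_price=None, in_stock=False):
    # Hypothetical stand-in for a real catalog lookup.
    return [{"name": f"{query} (demo)", "price": 79.0, "in_stock": True}]


def get_order_status(order_id):
    # Hypothetical stand-in for a real order system.
    return {"order_id": order_id, "status": "shipped"}


HANDLERS = {
    "search_products": search_products,
    "get_order_status": get_order_status,
}


def run_tool_call(name: str, arguments: str, call_id: str) -> dict:
    """Execute one tool call and build the `tool` message to send back."""
    handler = HANDLERS.get(name)
    if handler is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = handler(**json.loads(arguments))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or arguments that do not match the handler.
            result = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}
```

Append the assistant message that contained the tool call, then this `tool` message, to `messages` and call `client.chat.completions.create` again so the model can phrase the final answer.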

Vision (Image Analysis)#

python
response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this architecture diagram? List all services and their connections."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/architecture.png"}
                }
            ]
        }
    ],
    max_tokens=2048
)

print(response.choices[0].message.content)
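
The example above uses a public URL. For local files, the Chat Completions API also accepts base64 data URLs in `image_url`. A sketch, with `image_part_from_bytes` as a hypothetical helper name.

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {filename}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


# Usage: read the file and place the part next to the text part, e.g.
#   from pathlib import Path
#   part = image_part_from_bytes(Path("architecture.png").read_bytes(), "architecture.png")
#   content = [{"type": "text", "text": "Describe this diagram."}, part]
```

Base64 inflates the payload by about a third, so large images are usually better served from a URL.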

GPT-5 vs GPT-4o vs Claude Opus 4.5#

Performance Benchmarks#

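| Benchmark | GPT-5 | GPT-5.2 | GPT-4o | Claude Opus 4.5 | Gemini 3 Pro |
|---|---|---|---|---|---|
| MMLU | 90.2% | 92.1% | 87.5% | 91.8% | 89.5% |
| HumanEval | 93.5% | 95.2% | 90.2% | 94.1% | 88.7% |
| MATH | 78.3% | 82.1% | 72.6% | 80.5% | 76.2% |
| GPQA | 65.8% | 69.2% | 58.3% | 67.1% | 62.4% |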

Practical Comparison#

| Use Case | Best Model | Why |
|---|---|---|
| General chat | GPT-5 | Best balance of quality and speed |
| Complex reasoning | GPT-5.2 or Claude Opus 4.5 | Highest accuracy on hard problems |
| Code generation | GPT-5.2 | Best HumanEval scores |
| Long documents | Claude Opus 4.5 | 200K context with better recall |
| Cost-sensitive | GPT-5-mini | 90% quality at about 10% of GPT-5's price |
| Real-time apps | Gemini 2.5 Flash | Lowest latency |

Pricing#

Official OpenAI Pricing#

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|
| GPT-5 | $5.00 | $15.00 | 256K |
| GPT-5.2 | $10.00 | $30.00 | 256K |
| GPT-5-mini | $0.50 | $1.50 | 128K |
| GPT-4o | $2.50 | $10.00 | 128K |

Crazyrouter Pricing (Save 20-40%)#

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Savings vs Official |
|---|---|---|---|
| GPT-5 | $3.50 | $10.50 | 30% |
| GPT-5.2 | $7.00 | $21.00 | 30% |
| GPT-5-mini | $0.35 | $1.05 | 30% |
| GPT-4o | $1.75 | $7.00 | 30% |

Monthly Cost Estimates#

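| Usage Level | GPT-5 (Official) | GPT-5 (Crazyrouter) | Savings |
|---|---|---|---|
| Light (1M input + 1M output tokens/day) | ~$600/mo | ~$420/mo | $180/mo |
| Medium (10M input + 10M output tokens/day) | ~$6,000/mo | ~$4,200/mo | $1,800/mo |
| Heavy (100M input + 100M output tokens/day) | ~$60,000/mo | ~$42,000/mo | $18,000/mo |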

At scale, switching to Crazyrouter saves thousands of dollars per month, and the only change is swapping the base URL and API key.
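
The monthly figures in the table above can be reproduced with a few lines of arithmetic. This sketch assumes each usage level means that many input tokens plus the same number of output tokens per day over a 30-day month, which is the reading that matches the table's numbers; the prices are copied from this guide's pricing tables and `monthly_cost` is an illustrative name.

```python
# USD per 1M tokens as (input, output), copied from the pricing tables in this guide.
PRICES_PER_MILLION = {
    ("gpt-5", "official"): (5.00, 15.00),
    ("gpt-5", "crazyrouter"): (3.50, 10.50),
}


def monthly_cost(model, provider, input_tokens_per_day, output_tokens_per_day, days=30):
    """Estimate monthly spend in USD for a steady daily token volume."""
    in_price, out_price = PRICES_PER_MILLION[(model, provider)]
    daily = (input_tokens_per_day * in_price + output_tokens_per_day * out_price) / 1_000_000
    return daily * days


official = monthly_cost("gpt-5", "official", 1_000_000, 1_000_000)
routed = monthly_cost("gpt-5", "crazyrouter", 1_000_000, 1_000_000)
print(f"${official:,.0f} vs ${routed:,.0f}, saving ${official - routed:,.0f}/mo")
# $600 vs $420, saving $180/mo
```

Real workloads rarely have equal input and output volumes, so plug in your own ratio before budgeting.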

Migration from GPT-4#

What Changes#

python
# Before (GPT-4o)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    max_tokens=4096
)

# After (GPT-5) — just change the model name
response = client.chat.completions.create(
    model="gpt-5",
    messages=[...],
    max_tokens=4096  # Can now go up to 32K
)

The API is backward compatible: in most cases you only change the model string. There are a few things to watch, though:

  1. Output format may differ — GPT-5 tends to be more structured in its responses. If you're parsing output with regex, test thoroughly.
  2. Tool calling is stricter — GPT-5 follows function schemas more precisely. Loose schemas that worked with GPT-4 might need tightening.
  3. System prompts matter more — GPT-5 follows system instructions more faithfully. Vague prompts get vague results.
  4. Cost increase — GPT-5 costs 2x GPT-4o on input and 1.5x on output ($5.00/$15.00 vs $2.50/$10.00 per 1M tokens). Consider GPT-5-mini for cost-sensitive workloads.

Migration Checklist#

  • Update model string to gpt-5
  • Test all function calling schemas
  • Review system prompts for clarity
  • Update max_tokens limits if needed
  • Run regression tests on output parsing
  • Monitor costs for the first week
  • Consider GPT-5-mini for non-critical paths
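
One way to make the first and last checklist items easy to roll back is to keep model names in configuration rather than scattered through the code. A sketch; the path names `critical` and `background` and the environment variables `MODEL_CRITICAL` and `MODEL_BACKGROUND` are hypothetical.

```python
import os

# Default model per code path; override per deployment via environment variables.
DEFAULT_MODELS = {"critical": "gpt-5", "background": "gpt-5-mini"}


def model_for(path: str, env=os.environ) -> str:
    """Return the model name for a code path, honoring MODEL_CRITICAL style overrides."""
    if path not in DEFAULT_MODELS:
        raise KeyError(f"unknown path: {path}")
    override = env.get(f"MODEL_{path.upper()}")
    return override or DEFAULT_MODELS[path]


print(model_for("critical", env={}))                             # gpt-5
print(model_for("critical", env={"MODEL_CRITICAL": "gpt-4o"}))   # gpt-4o
```

If regression tests fail after the switch, setting the override restores the old model without a code change.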

Best Practices#

  1. Use GPT-5-mini for simple tasks — classification, extraction, summarization don't need the full model
  2. Stream everything — GPT-5's TTFT is slightly higher; streaming masks the latency
  3. Leverage the 256K context — but be strategic. Put important info at the beginning and end
  4. Use structured outputs: set response_format to { type: "json_object" } (JSON mode) for reliable parsing
  5. Cache aggressively — at temperature 0 the same input gives near-identical output, so cache responses keyed on the full request
  6. Batch non-urgent requests — OpenAI's Batch API gives a 50% discount
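
Point 5 can be sketched as an in-memory cache keyed on a hash of the full request. `call_model` is any function you supply that performs the real API call (for example a thin wrapper around `client.chat.completions.create`); the names here are illustrative, and a production setup would more likely use a shared store with expiry.

```python
import hashlib
import json

_cache = {}


def cache_key(model, messages, **params):
    """Hash the whole request so any change in model, prompt, or params misses."""
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_completion(call_model, model, messages, **params):
    """Return a cached response when the identical request was seen before."""
    key = cache_key(model, messages, **params)
    if key not in _cache:
        _cache[key] = call_model(model=model, messages=messages, **params)
    return _cache[key]
```

Only cache requests where a repeated answer is acceptable; anything that depends on fresh data or sampling variety should bypass the cache.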

FAQ#

Is GPT-5 worth the upgrade from GPT-4o?#

For complex reasoning, code generation, and tool use — yes. For simple chat and classification, GPT-5-mini or even GPT-4o-mini is more cost-effective.

Can I use GPT-5 for free?#

ChatGPT Free tier includes limited GPT-5 access. For API usage, there's no free tier, but Crazyrouter offers pay-as-you-go with no minimum.

What's the difference between GPT-5 and GPT-5.2?#

GPT-5.2 is the latest iteration with improved reasoning and code generation. It costs twice as much as GPT-5 ($10.00/$30.00 vs $5.00/$15.00 per 1M tokens). Use it when accuracy on hard problems justifies the cost.

Does GPT-5 support fine-tuning?#

Not yet for GPT-5. Fine-tuning is available for GPT-4o and GPT-4o-mini. OpenAI has indicated GPT-5 fine-tuning is coming.

How does GPT-5 handle rate limits?#

GPT-5 uses the same tier system as GPT-4, and Tier 1 starts at 500 RPM. Through Crazyrouter, rate limits are pooled across providers, giving you effectively higher throughput.
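
Whatever your tier, requests that exceed the limit fail with HTTP 429, and the usual remedy is retrying with exponential backoff. A sketch, assuming you pass in the exception type to retry on (with the official Python SDK that would be `openai.RateLimitError`); `RateLimited` below is a local stand-in so the block runs without the SDK, and `with_backoff` is an illustrative name.

```python
import random
import time


class RateLimited(Exception):
    """Local stand-in for the SDK's rate-limit exception."""


def with_backoff(call, retry_on=(RateLimited,), max_attempts=5,
                 base_delay=1.0, sleep=time.sleep):
    """Run call(); on a rate-limit error wait 1s, 2s, 4s, ... plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Jitter spreads out retries from clients that failed at the same moment.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

The official Python SDK already retries some failures on its own (see its `max_retries` client option), so a wrapper like this is mainly useful when you want explicit control over the schedule.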

Summary#

GPT-5 is a meaningful upgrade for developers building AI-powered applications. The improved reasoning, native tool use, and 256K context make it the go-to model for complex tasks. For cost-sensitive workloads, GPT-5-mini delivers most of the capability at a fraction of the price.

Get started with GPT-5 through Crazyrouter — same OpenAI SDK, 30% lower costs, and access to Claude, Gemini, and 300+ other models with the same API key.
