AI API Token Cost Calculator: How to Estimate and Optimize Your AI Spending

Crazyrouter Team
February 26, 2026

AI API costs can spiral quickly if you're not tracking token usage carefully. Whether you're building a chatbot, coding assistant, or document processing pipeline, understanding how tokens translate to dollars is essential for budgeting and profitability.

This guide covers everything you need to know about calculating AI API costs — from token counting basics to advanced optimization strategies that can cut your bill by 50% or more.

What Are Tokens and How Are They Counted?#

Tokens are the fundamental unit of text that AI models process. They're not exactly words — they're subword units that the model's tokenizer produces.

Token Rules of Thumb#

| Language | Approximate Ratio |
| --- | --- |
| English | 1 token ≈ 0.75 words |
| Chinese | 1 token ≈ 0.5-1 character |
| Code | 1 token ≈ 3-4 characters |
| JSON | Higher token density (brackets, keys) |

Quick Estimates#

| Content Type | ~Words | ~Tokens |
| --- | --- | --- |
| Short prompt | 50 | 67 |
| Email | 200 | 267 |
| Blog post | 1,000 | 1,333 |
| Technical doc | 5,000 | 6,667 |
| Book chapter | 10,000 | 13,333 |
| Full codebase | 50,000 | 75,000+ |
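The table above follows the English rule of thumb (1 token ≈ 0.75 words), which is easy to turn into a small helper for quick estimates; treat the result as a ballpark figure, not an exact count:

```python
def words_to_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """English rule of thumb: tokens ≈ words / 0.75."""
    return round(word_count / words_per_token)

print(words_to_tokens(1_000))  # 1333 (matches the blog-post row above)
```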

AI API Pricing Comparison 2026#

Text Models (per 1M tokens)#

| Model | Input | Output | Cached Input |
| --- | --- | --- | --- |
| GPT-5.2 | $10.00 | $30.00 | $2.50 |
| GPT-5-mini | $0.40 | $1.60 | $0.10 |
| Claude Opus 4.6 | $15.00 | $75.00 | $3.75 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.75 |
| Claude Haiku 4.5 | $0.25 | $1.25 | $0.06 |
| Gemini 3 Pro | $7.00 | $21.00 | $1.75 |
| Gemini 2.5 Flash | $0.15 | $0.60 | $0.04 |
| DeepSeek V3.2 | $0.27 | $1.10 | $0.07 |
| Grok 4.1 Fast | $3.00 | $15.00 | |

Crazyrouter Pricing (20-30% Savings)#

| Model | Input | Output | Savings |
| --- | --- | --- | --- |
| GPT-5.2 | $7.00 | $21.00 | 30% |
| Claude Opus 4.6 | $10.50 | $52.50 | 30% |
| Claude Sonnet 4.5 | $2.10 | $10.50 | 30% |
| Gemini 3 Pro | $5.60 | $16.80 | 20% |
| DeepSeek V3.2 | $0.19 | $0.77 | 30% |

Access all models through Crazyrouter with a single API key.

How to Calculate Your API Costs#

The Basic Formula#

code
Cost = (Input Tokens × Input Price) + (Output Tokens × Output Price)
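Plugging in GPT-5-mini's prices from the table above ($0.40 input / $1.60 output per 1M tokens), a single request works out like this:

```python
# Worked example of the formula: GPT-5-mini at $0.40 in / $1.60 out per 1M tokens
input_tokens, output_tokens = 10_000, 2_000
cost = (input_tokens / 1_000_000) * 0.40 + (output_tokens / 1_000_000) * 1.60
print(f"${cost:.4f}")  # $0.0072
```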

Python Cost Calculator#

python
# AI API Cost Calculator
MODEL_PRICING = {
    "gpt-5.2": {"input": 10.0, "output": 30.0},
    "gpt-5-mini": {"input": 0.4, "output": 1.6},
    "claude-opus-4-6": {"input": 15.0, "output": 75.0},
    "claude-sonnet-4-5": {"input": 3.0, "output": 15.0},
    "claude-haiku-4-5": {"input": 0.25, "output": 1.25},
    "gemini-3-pro": {"input": 7.0, "output": 21.0},
    "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
    "deepseek-v3.2": {"input": 0.27, "output": 1.10},
}

# Crazyrouter discount rates
CRAZYROUTER_DISCOUNT = {
    "gpt-5.2": 0.30,
    "claude-opus-4-6": 0.30,
    "claude-sonnet-4-5": 0.30,
    "gemini-3-pro": 0.20,
    "deepseek-v3.2": 0.30,
}

def calculate_cost(model: str, input_tokens: int, output_tokens: int, 
                   use_crazyrouter: bool = False) -> dict:
    """Calculate API cost for a given model and token usage."""
    pricing = MODEL_PRICING[model]
    
    input_cost = (input_tokens / 1_000_000) * pricing["input"]
    output_cost = (output_tokens / 1_000_000) * pricing["output"]
    total = input_cost + output_cost
    
    result = {
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost": round(input_cost, 6),
        "output_cost": round(output_cost, 6),
        "total_cost": round(total, 6),
    }
    
    if use_crazyrouter and model in CRAZYROUTER_DISCOUNT:
        discount = CRAZYROUTER_DISCOUNT[model]
        cr_total = total * (1 - discount)
        result["crazyrouter_cost"] = round(cr_total, 6)
        result["savings"] = round(total - cr_total, 6)
    
    return result

# Example: Calculate cost for a coding assistant session
session = calculate_cost(
    model="claude-opus-4-6",
    input_tokens=50_000,   # ~37K words of context
    output_tokens=10_000,  # ~7.5K words of output
    use_crazyrouter=True
)

print(f"Official cost: ${session['total_cost']:.4f}")
print(f"Crazyrouter cost: ${session['crazyrouter_cost']:.4f}")
print(f"Savings: ${session['savings']:.4f}")
# Official cost: $1.5000
# Crazyrouter cost: $1.0500
# Savings: $0.4500

Monthly Cost Estimator#

python
def estimate_monthly_cost(model: str, requests_per_day: int,
                          avg_input_tokens: int, avg_output_tokens: int,
                          use_crazyrouter: bool = False) -> dict:
    """Estimate monthly API costs."""
    monthly_requests = requests_per_day * 30
    
    total_input = monthly_requests * avg_input_tokens
    total_output = monthly_requests * avg_output_tokens
    
    result = calculate_cost(model, total_input, total_output, use_crazyrouter)
    result["monthly_requests"] = monthly_requests
    result["total_input_tokens"] = total_input
    result["total_output_tokens"] = total_output
    
    return result

# Estimate for a SaaS product with 1000 daily API calls
estimate = estimate_monthly_cost(
    model="claude-sonnet-4-5",
    requests_per_day=1000,
    avg_input_tokens=2000,
    avg_output_tokens=500,
    use_crazyrouter=True
)

print(f"Monthly requests: {estimate['monthly_requests']:,}")
print(f"Official monthly cost: ${estimate['total_cost']:.2f}")
print(f"Crazyrouter monthly cost: ${estimate['crazyrouter_cost']:.2f}")
print(f"Monthly savings: ${estimate['savings']:.2f}")
# Monthly requests: 30,000
# Official monthly cost: $405.00
# Crazyrouter monthly cost: $283.50
# Monthly savings: $121.50

7 Strategies to Optimize AI API Costs#

1. Model Routing — Use the Right Model for Each Task#

Not every request needs a frontier model. Route simple tasks to cheaper models:

python
def smart_route(task_complexity: str) -> str:
    """Route to the most cost-effective model based on task complexity."""
    routing_map = {
        "simple": "gemini-2.5-flash",      # $0.15/$0.60 per 1M
        "medium": "claude-sonnet-4-5",      # $3/$15 per 1M
        "complex": "claude-opus-4-6",       # $15/$75 per 1M
        "long_context": "gemini-3-pro",     # $7/$21 per 1M, 2M context
    }
    return routing_map.get(task_complexity, "claude-sonnet-4-5")

Potential savings: 60-80% on mixed workloads.
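The savings depend heavily on your traffic mix. As a rough illustration (assuming a 50/35/15 simple/medium/complex split and a typical 2,000-input / 500-output request shape, with all-Opus as the baseline; your mix will differ):

```python
def per_request_cost(input_price: float, output_price: float,
                     in_tok: int = 2_000, out_tok: int = 500) -> float:
    """Dollar cost of one request at the given per-1M-token prices."""
    return (in_tok * input_price + out_tok * output_price) / 1_000_000

flash  = per_request_cost(0.15, 0.60)    # Gemini 2.5 Flash
sonnet = per_request_cost(3.00, 15.00)   # Claude Sonnet 4.5
opus   = per_request_cost(15.00, 75.00)  # Claude Opus 4.6

# Assumed traffic mix: 50% simple, 35% medium, 15% complex
blended = 0.50 * flash + 0.35 * sonnet + 0.15 * opus
savings = 1 - blended / opus
print(f"Blended savings vs. all-Opus: {savings:.0%}")  # 78%
```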

2. Prompt Caching — Reuse Common Context#

Most providers offer cached input pricing at roughly a 75% discount (see the Cached Input column above):

python
# Instead of sending full system prompt every time,
# use prompt caching for repeated context
response = client.messages.create(       # Anthropic SDK client
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": long_system_prompt,  # this block gets cached
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": user_query}],
)
# Cached input: $0.75/1M instead of $3.00/1M = 75% savings on the system prompt
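To see what that discount is worth, here is a back-of-the-envelope calculation for a 10,000-token system prompt reused across 1,000 calls at Claude Sonnet 4.5's rates ($3.00 full / $0.75 cached per 1M input tokens); cache-write surcharges, which some providers add, are ignored for simplicity:

```python
system_tokens = 10_000
calls = 1_000

full_rate = 3.00 / 1_000_000    # Sonnet 4.5 input price, per token
cached_rate = 0.75 / 1_000_000  # cached input price, per token

without_cache = calls * system_tokens * full_rate
# First call pays full price (writes the cache); the rest read it
with_cache = system_tokens * full_rate + (calls - 1) * system_tokens * cached_rate

print(f"${without_cache:.2f} vs ${with_cache:.2f}")  # $30.00 vs $7.52
```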

3. Token Optimization — Reduce Waste#

python
# BAD: Verbose prompt (wastes tokens)
prompt_bad = """
I would like you to please help me write a Python function. 
The function should take a list of numbers as input and return 
the sum of all even numbers in the list. Please make sure to 
include proper error handling and type hints. Thank you!
"""

# GOOD: Concise prompt (saves ~40% tokens)
prompt_good = """
Write a Python function: sum of even numbers from a list. 
Include type hints and error handling.
"""

4. Batch Processing — Reduce Overhead#

python
# Instead of 100 individual API calls, batch related items
items_to_analyze = ["item1", "item2", "item3", ...]

# BAD: One call per item
for item in items_to_analyze:
    response = client.chat.completions.create(
        model="claude-sonnet-4-5",
        messages=[{"role": "user", "content": f"Analyze: {item}"}]
    )

# GOOD: Batch multiple items in one call
batch_prompt = "Analyze each item and return JSON array:\n" + "\n".join(items_to_analyze)
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": batch_prompt}],
    response_format={"type": "json_object"}
)

5. Response Length Control#

python
# Set max_tokens to prevent runaway responses
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Summarize this article."}],
    max_tokens=500  # Cap output to ~375 words
)

6. Caching Responses Locally#

python
import hashlib
import json
import os

def cached_completion(client, model, messages, **kwargs):
    """Cache API responses on disk to avoid paying for duplicate calls."""
    # sort_keys=True keeps the hash stable across dict key orderings
    cache_key = hashlib.md5(
        json.dumps({"model": model, "messages": messages}, sort_keys=True).encode()
    ).hexdigest()

    os.makedirs(".cache", exist_ok=True)
    cache_file = f".cache/{cache_key}.json"

    try:
        with open(cache_file) as f:
            return json.load(f)
    except FileNotFoundError:
        response = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
        result = response.choices[0].message.content
        with open(cache_file, "w") as f:
            json.dump(result, f)
        return result
7. Use Crazyrouter for Automatic Savings#

The simplest optimization: route all API calls through Crazyrouter for automatic 20-30% savings with a one-line change:

python
# Just change the base URL — everything else stays the same
client = OpenAI(
    api_key="your-crazyrouter-key",
    base_url="https://api.crazyrouter.com/v1"
)
# Instant 20-30% savings on every API call

Real-World Cost Scenarios#

Scenario 1: AI Chatbot (B2C SaaS)#

| Metric | Value |
| --- | --- |
| Daily active users | 5,000 |
| Messages per user/day | 10 |
| Avg input tokens | 1,500 |
| Avg output tokens | 400 |
| Model | Claude Sonnet 4.5 |

**Monthly cost (official):** $2,700
**Monthly cost (Crazyrouter):** $1,890
**Annual savings:** $9,720

Scenario 2: Code Review Tool (Developer Tool)#

| Metric | Value |
| --- | --- |
| Daily reviews | 500 |
| Avg input tokens | 8,000 (code context) |
| Avg output tokens | 2,000 (review comments) |
| Model | Claude Opus 4.6 |

**Monthly cost (official):** $4,050
**Monthly cost (Crazyrouter):** $2,835
**Annual savings:** $14,580
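These numbers fall straight out of the basic formula from earlier (Claude Opus 4.6 at $15 input / $75 output per 1M tokens, 30% Crazyrouter discount):

```python
reviews = 500 * 30                 # 15,000 reviews per month
input_tokens = reviews * 8_000     # 120M tokens of code context
output_tokens = reviews * 2_000    # 30M tokens of review comments

official = (input_tokens / 1_000_000) * 15.00 + (output_tokens / 1_000_000) * 75.00
crazyrouter = official * (1 - 0.30)

print(f"${official:,.2f} vs ${crazyrouter:,.2f}")  # $4,050.00 vs $2,835.00
```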

Scenario 3: Document Processing Pipeline#

| Metric | Value |
| --- | --- |
| Documents per day | 200 |
| Avg input tokens | 20,000 |
| Avg output tokens | 1,000 |
| Model | Gemini 2.5 Flash |

**Monthly cost (official):** $54
**Monthly cost (Crazyrouter):** $37.80
**Annual savings:** $194

Frequently Asked Questions#

How do I count tokens before making an API call?#

Use the tiktoken library for OpenAI models or Anthropic's token counting API. For a quick estimate, divide your character count by 4 (English) or 2 (Chinese).
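The divide-by-four estimate is easy to inline anywhere; this is a rough pure-Python sketch, and for exact counts you should use the model's own tokenizer (e.g. tiktoken for OpenAI models):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough English estimate: ~4 characters per token."""
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("AI API costs can spiral quickly."))  # 8
```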

Which AI model gives the best value for money?#

For most tasks, Gemini 2.5 Flash ($0.15 input / $0.60 output per 1M tokens) offers the best price-to-performance ratio. For complex tasks requiring frontier intelligence, Claude Sonnet 4.5 at $3/$15 is the sweet spot.

How can I reduce AI API costs without sacrificing quality?#

Use model routing (cheap models for simple tasks, expensive models for complex ones), prompt caching, and an API gateway like Crazyrouter for automatic discounts.

What's the cheapest way to access GPT-5 and Claude?#

Through Crazyrouter, which offers 20-30% discounts on all major models with a single API key and OpenAI-compatible format.

How much does it cost to run an AI chatbot?#

It depends on traffic and model choice. A chatbot with 5,000 daily users using Claude Sonnet 4.5 costs approximately $1,890/month through Crazyrouter. Using Gemini 2.5 Flash, the same traffic costs under $100/month.

Summary#

Understanding and optimizing AI API costs is crucial for building sustainable AI products. The key strategies are: use model routing for mixed workloads, leverage prompt caching, optimize prompts for conciseness, and use Crazyrouter for automatic 20-30% savings across 300+ models.

Start optimizing today: Sign up at Crazyrouter and cut your AI API costs immediately.
