Text-Embedding-3-Small API Tutorial - OpenAI Embedding Model Guide

Crazyrouter Team
January 26, 2026

Building a semantic search engine or RAG (Retrieval-Augmented Generation) system? text-embedding-3-small is OpenAI's compact embedding model. It converts text into numerical vectors that support similarity search and content retrieval.

In this guide, you'll learn:

  • What text embeddings are and why they matter
  • How to use the text-embedding-3-small API
  • Complete code examples in Python and Node.js
  • Custom dimensions for optimized storage
  • Pricing comparison and cost optimization

What is Text-Embedding-3-Small?

Text-embedding-3-small is OpenAI's compact embedding model, released in January 2024. By default it converts text into 1536-dimensional vectors that capture semantic meaning, enabling:

  • Semantic Search: Find relevant documents based on meaning, not just keywords
  • RAG Systems: Retrieve context for LLM responses
  • Similarity Matching: Compare text similarity for recommendations
  • Clustering: Group similar documents together
  • Classification: Categorize text based on content

Model Specifications

| Specification | Value |
| --- | --- |
| Model Name | text-embedding-3-small |
| Default Dimensions | 1536 |
| Custom Dimensions | Set with the dimensions parameter (e.g., 256, 512, 1024, 1536) |
| Max Input Tokens | 8,191 |
| Output | Normalized vector |

Quick Start

Prerequisites

  1. Sign up at Crazyrouter
  2. Get your API key from the dashboard
  3. Python 3.8+ or Node.js 16+

Python Example

python
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Generate embedding for a single text
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Machine learning is transforming industries worldwide."
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")  # Output: 1536
print(f"First 5 values: {embedding[:5]}")

Node.js Example

javascript
import OpenAI from 'openai';

const client = new OpenAI({
    apiKey: 'your-crazyrouter-api-key',
    baseURL: 'https://crazyrouter.com/v1'
});

async function getEmbedding(text) {
    const response = await client.embeddings.create({
        model: 'text-embedding-3-small',
        input: text
    });

    return response.data[0].embedding;
}

// Usage
const embedding = await getEmbedding('Machine learning is amazing');
console.log(`Dimensions: ${embedding.length}`);  // Output: 1536

cURL Example

bash
curl -X POST https://crazyrouter.com/v1/embeddings \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Response:

json
{
  "object": "list",
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  },
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.0020785425, -0.049085874, 0.02094679, ...]
    }
  ]
}
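For environments where the SDK is not installed, the same call can be sketched with only the Python standard library. The helper names `build_request` and `parse_embeddings` are illustrative and not part of any SDK, and the sample payload reuses the truncated three-value response shown above in place of a full vector.

```python
import json
import urllib.request

API_URL = "https://crazyrouter.com/v1/embeddings"  # endpoint from the cURL example


def build_request(api_key, text, model="text-embedding-3-small"):
    """Build (but do not send) the POST request from the cURL example."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings(payload):
    """Return (vectors, total_tokens) from a body shaped like the JSON above."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    vectors = [item["embedding"] for item in items]
    return vectors, payload["usage"]["total_tokens"]


# Sample payload: three values stand in for a full-length vector.
sample = json.loads("""
{
  "object": "list",
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 2, "total_tokens": 2},
  "data": [
    {"object": "embedding", "index": 0,
     "embedding": [-0.0020785425, -0.049085874, 0.02094679]}
  ]
}
""")

request = build_request("your-api-key", "Hello world")
vectors, total_tokens = parse_embeddings(sample)
print(request.get_method(), request.full_url)
print(len(vectors), total_tokens)
```

To actually send the request, pass it to `urllib.request.urlopen` and decode the body with `json.load` before handing it to `parse_embeddings`.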

Batch Embedding

Process multiple texts in a single API call for better efficiency:

python
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Batch embedding - multiple texts at once
texts = [
    "Python is a programming language",
    "JavaScript runs in browsers",
    "Machine learning uses neural networks"
]

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

# Access each embedding
for i, data in enumerate(response.data):
    print(f"Text {i}: {len(data.embedding)} dimensions")

# Output:
# Text 0: 1536 dimensions
# Text 1: 1536 dimensions
# Text 2: 1536 dimensions
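
Each item in a batch response carries an `index` field (visible in the JSON response earlier) that identifies which input it belongs to. A minimal sketch of pairing texts with their vectors by that field follows; `EmbeddingItem` is a hypothetical stand-in for one entry of `response.data`, and the two-value vectors are toy data rather than real model output.

```python
from dataclasses import dataclass


@dataclass
class EmbeddingItem:
    """Stand-in for one entry of response.data (only the fields used here)."""
    index: int
    embedding: list


def pair_with_inputs(texts, items):
    """Map each input text to its vector using the item's index field."""
    return {texts[item.index]: item.embedding for item in items}


texts = ["Python is a programming language", "JavaScript runs in browsers"]
# Toy vectors, deliberately listed out of order to show the lookup by index.
items = [EmbeddingItem(1, [0.0, 1.0]), EmbeddingItem(0, [1.0, 0.0])]

by_text = pair_with_inputs(texts, items)
print(by_text["JavaScript runs in browsers"])  # [0.0, 1.0]
```

Keying on `index` rather than list position keeps the pairing correct even if the items are reordered somewhere between the response and your storage code.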

Custom Dimensions

Reduce storage costs by requesting shorter vectors. The dimensions parameter shortens the output, trading a small amount of accuracy for lower storage and faster search:

python
# Use 512 dimensions instead of 1536
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Your text here",
    dimensions=512  # Options: 256, 512, 1024, 1536
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")  # Output: 512

Dimension Comparison

| Dimensions | Storage per vector (float32) | Use Case |
| --- | --- | --- |
| 256 | 1 KB | Mobile apps, limited storage |
| 512 | 2 KB | Balanced performance |
| 1024 | 4 KB | High accuracy needs |
| 1536 | 6 KB | Maximum accuracy |
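
The storage figures in this table follow from 4 bytes per value, which assumes vectors are stored as float32. A small sketch of the arithmetic, including a rough total for a collection, is below; the function names and the one-million-vector figure are illustrative, and the result ignores any index overhead a vector database adds.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def vector_bytes(dimensions):
    """Raw size of one vector in bytes."""
    return dimensions * BYTES_PER_FLOAT32


def index_megabytes(dimensions, num_vectors):
    """Raw size of num_vectors vectors in MB (1 MB = 1024 * 1024 bytes)."""
    return vector_bytes(dimensions) * num_vectors / (1024 * 1024)


for dims in (256, 512, 1024, 1536):
    per_vector_kb = vector_bytes(dims) / 1024
    total_mb = index_megabytes(dims, 1_000_000)
    print(f"{dims} dims: {per_vector_kb:.0f} KB per vector, "
          f"{total_mb:,.0f} MB per 1M vectors")
```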

Building a Semantic Search System

Here's a complete example of building a semantic search system:

python
import numpy as np
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

def get_embedding(text):
    """Get embedding for a single text"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    """Calculate cosine similarity between two vectors"""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Document database
documents = [
    "Python is great for data science and machine learning",
    "JavaScript is essential for web development",
    "Docker containers simplify deployment",
    "Kubernetes orchestrates container workloads",
    "PostgreSQL is a powerful relational database"
]

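# Pre-compute embeddings for all documents in one batched request
doc_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=documents
)
doc_embeddings = [item.embedding for item in doc_response.data]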

# Search function
def search(query, top_k=3):
    query_embedding = get_embedding(query)

    # Calculate similarities
    similarities = [
        cosine_similarity(query_embedding, doc_emb)
        for doc_emb in doc_embeddings
    ]

    # Get top results
    results = sorted(
        zip(documents, similarities),
        key=lambda x: x[1],
        reverse=True
    )[:top_k]

    return results

# Example search
results = search("How to deploy applications?")
for doc, score in results:
    print(f"Score: {score:.4f} - {doc}")

# Example output (scores are illustrative):
# Score: 0.8234 - Docker containers simplify deployment
# Score: 0.7891 - Kubernetes orchestrates container workloads
# Score: 0.6543 - PostgreSQL is a powerful relational database
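
Sorting every score works well for five documents. For a larger in-memory collection, one option is `heapq.nlargest`, which picks the top results without a full sort. The sketch below uses toy unit-length vectors rather than real model output, and it ranks by dot product, which matches cosine similarity only because the vectors are normalized.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, doc_vecs, k=3):
    """Return the k (score, doc_index) pairs with the highest dot product."""
    scored = ((dot(query_vec, vec), i) for i, vec in enumerate(doc_vecs))
    return heapq.nlargest(k, scored)


# Toy unit-length vectors, not real embeddings.
doc_vecs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [-1.0, 0.0]]
query_vec = [0.8, 0.6]

best = top_k(query_vec, doc_vecs, k=2)
for score, i in best:
    print(f"doc {i}: {score:.2f}")
```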

Integration with Vector Databases

Pinecone Integration

python
from pinecone import Pinecone
from openai import OpenAI

# Initialize clients
client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Pinecone client v3+: the module-level pinecone.init() was replaced by the Pinecone class
pc = Pinecone(api_key="your-pinecone-key")
index = pc.Index("your-index")

def embed_and_upsert(texts, ids):
    """Embed texts and store in Pinecone"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )

    # doc_id avoids shadowing the built-in id()
    vectors = [
        (doc_id, data.embedding)
        for doc_id, data in zip(ids, response.data)
    ]

    index.upsert(vectors=vectors)

def search_pinecone(query, top_k=5):
    """Search Pinecone with query embedding"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )

    results = index.query(
        vector=response.data[0].embedding,
        top_k=top_k
    )

    return results

ChromaDB Integration

python
import chromadb
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Initialize ChromaDB
chroma_client = chromadb.Client()
collection = chroma_client.create_collection("documents")

def get_embeddings(texts):
    """Get embeddings for multiple texts"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [data.embedding for data in response.data]

# Add documents
documents = ["doc1 content", "doc2 content", "doc3 content"]
embeddings = get_embeddings(documents)

collection.add(
    embeddings=embeddings,
    documents=documents,
    ids=["doc1", "doc2", "doc3"]
)

# Query
query_embedding = get_embeddings(["search query"])[0]
results = collection.query(
    query_embeddings=[query_embedding],
    n_results=3
)

Available Embedding Models

Crazyrouter provides access to multiple OpenAI embedding models:

| Model | Dimensions | Price Ratio | Best For |
| --- | --- | --- | --- |
| text-embedding-3-small | 1536 | 0.01 | General use, best value |
| text-embedding-3-large | 3072 | 0.065 | High precision needs |
| text-embedding-ada-002 | 1536 | 0.05 | Legacy compatibility |

Pricing Comparison

| Provider | Model | Price per 1M tokens |
| --- | --- | --- |
| OpenAI Official | text-embedding-3-small | $0.020 |
| Crazyrouter | text-embedding-3-small | $0.002 |
| OpenAI Official | text-embedding-3-large | $0.130 |
| Crazyrouter | text-embedding-3-large | $0.013 |

Pricing Disclaimer: Prices shown are for demonstration and may change. Actual billing is based on real-time prices at request time.

Cost Savings Example:

For a RAG system processing 10M tokens/month:

  • OpenAI Official: $0.20/month (10 × $0.020)
  • Crazyrouter: $0.02/month (10 × $0.002)
  • Savings: 90%
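
The monthly figure is just the token volume divided by one million, multiplied by the per-1M-token price from the pricing table. A short sketch of that calculation for text-embedding-3-small is below; `monthly_cost` is an illustrative helper, and the prices are the demonstration values from the table, which may change.

```python
def monthly_cost(tokens_per_month, price_per_million):
    """Cost in dollars for a monthly token volume at a per-1M-token price."""
    return tokens_per_month / 1_000_000 * price_per_million


tokens = 10_000_000  # 10M tokens/month, as in the example above
official = monthly_cost(tokens, 0.020)     # OpenAI Official price from the table
crazyrouter = monthly_cost(tokens, 0.002)  # Crazyrouter price from the table
savings = 1 - crazyrouter / official

print(f"${official:.2f} vs ${crazyrouter:.2f} ({savings:.0%} savings)")
# $0.20 vs $0.02 (90% savings)
```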

Best Practices

1. Batch Your Requests

python
# Good - single API call for multiple texts
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["text1", "text2", "text3"]  # Up to 2048 texts
)

# Bad - multiple API calls
for text in texts:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )

2. Cache Embeddings

python
import hashlib

embedding_cache = {}

def get_embedding_cached(text):
    # Create cache key
    cache_key = hashlib.md5(text.encode()).hexdigest()

    if cache_key in embedding_cache:
        return embedding_cache[cache_key]

    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )

    embedding = response.data[0].embedding
    embedding_cache[cache_key] = embedding

    return embedding
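
The cache above keys on the text alone. If the same process embeds with more than one model or dimensions setting, those requests would collide on the same key even though they return different vectors. One possible fix is to fold both settings into the key, as in this sketch; the `cache_key` helper and its separator format are assumptions made for illustration.

```python
import hashlib


def cache_key(text, model="text-embedding-3-small", dimensions=None):
    """Hash the model, dimensions setting, and text into one cache key."""
    raw = f"{model}|{dimensions}|{text}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


full_key = cache_key("Your text here")
short_key = cache_key("Your text here", dimensions=512)
print(full_key != short_key)  # True: different settings get separate entries
```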

3. Use Appropriate Dimensions

  • 256 dimensions: Mobile apps, IoT devices
  • 512 dimensions: Web applications with storage constraints
  • 1024 dimensions: Standard applications
  • 1536 dimensions: Maximum accuracy requirements

Frequently Asked Questions

What's the difference between text-embedding-3-small and text-embedding-3-large?

Text-embedding-3-small produces 1536-dimensional vectors and is optimized for cost-efficiency. Text-embedding-3-large produces 3072-dimensional vectors with higher accuracy but at 6.5x the cost. For most applications, text-embedding-3-small provides excellent results.

Can I reduce dimensions after generating embeddings?

Yes. You can truncate a full vector to its first N values and re-normalize it to unit length. For new embeddings, it is simpler to pass the dimensions parameter and receive the shorter vector directly, which also reduces the size of each response.
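
If full-length vectors are already stored, shortening them after the fact can be sketched as: keep the first N values, then rescale to unit length so dot-product comparisons still behave like cosine similarity. The `shorten` helper is illustrative and the four-value vector is toy data. Results should be close to requesting the smaller size from the API, but treat that as something to verify on your own data.

```python
import math


def shorten(vector, dimensions):
    """Keep the first `dimensions` values and rescale to unit length."""
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


full = [0.5, 0.5, 0.5, 0.5]  # toy unit-length vector, not real model output
short = shorten(full, 2)

print(len(short))
print(math.isclose(sum(x * x for x in short), 1.0))  # True: unit length again
```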

How many texts can I embed in one request?

You can embed up to 2048 texts in a single API request. For large datasets, batch your requests in groups of 2048.
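
Splitting a large dataset into groups of at most 2048 can be sketched with a small generator. The `batched` helper name is illustrative; each yielded list would be passed as `input` to one embeddings request.

```python
MAX_INPUTS_PER_REQUEST = 2048  # per-request limit stated above


def batched(texts, size=MAX_INPUTS_PER_REQUEST):
    """Yield consecutive slices of texts, each at most `size` items long."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


texts = [f"document {i}" for i in range(5000)]
sizes = [len(batch) for batch in batched(texts)]
print(sizes)  # [2048, 2048, 904]
```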

Are the embeddings normalized?

Yes, text-embedding-3-small returns normalized vectors (unit length). For unit-length vectors the dot product equals the cosine similarity, so you can skip the norm calculation and use the dot product directly for faster computation.
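
The equivalence is easy to check: cosine similarity divides the dot product by both vector lengths, and for unit-length vectors those lengths are 1. A small sketch with toy two-dimensional unit vectors (not real embeddings):

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two lengths."""
    return dot(a, b) / (norm(a) * norm(b))


a = [0.6, 0.8]  # length 1
b = [0.8, 0.6]  # length 1

print(math.isclose(dot(a, b), cosine_similarity(a, b)))  # True
```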

Getting Started

  1. Sign up at Crazyrouter
  2. Get your API key from the dashboard
  3. Install the SDK: pip install openai or npm install openai
  4. Start embedding with the code examples above

For questions, contact support@crazyrouter.com
