
"Google Veo3 API Guide: Generate AI Videos with Audio in 2026"
Google's Veo3 is one of the most capable AI video generation models available today. What sets it apart from competitors like Sora and Kling is native audio: dialogue, sound effects, and ambient sound are all synthesized in the same generation pass as the video.
This guide covers everything you need to know about using the Veo3 API: setup, code examples, pricing, and how it compares to alternatives.
What Is Google Veo3?
Veo3 is Google DeepMind's third-generation video generation model. Released in mid-2025, it represents a significant leap in AI video quality:
- Native audio generation: Veo3 generates synchronized audio (speech, SFX, ambient sound) alongside video — no separate TTS or audio model needed
- Up to 8 seconds of high-quality video per generation
- 1080p resolution output
- Text-to-video and image-to-video modes
- Consistent character rendering across frames
- Physics-aware motion: Realistic object interactions, fluid dynamics, and lighting
Veo3 is available through Google AI Studio, Vertex AI, and third-party API providers like Crazyrouter.
Veo3 API Setup
Option 1: Google Vertex AI (Direct)
To use Veo3 directly through Google:
- Create a Google Cloud project
- Enable the Vertex AI API
- Set up authentication (service account or OAuth)
- Use the `generativelanguage` or Vertex AI endpoint
This requires a Google Cloud account with billing enabled and can be complex to set up.
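To give a sense of what the direct route involves, here is a minimal sketch that only builds the request URL and JSON body for Vertex AI's long-running prediction method; it sends nothing. The project ID, region, and model ID are placeholders or assumptions (check the model ID shown in your own Google Cloud console), and `build_vertex_veo_request` is a hypothetical helper name, not part of any Google SDK.

```python
import json


def build_vertex_veo_request(project_id, location, model_id, prompt):
    """Return (url, body) for a Vertex AI long-running video request.

    Nothing is sent here; the caller still has to attach an OAuth
    access token in an Authorization header before posting.
    """
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:predictLongRunning"
    )
    body = {
        "instances": [{"prompt": prompt}],
        "parameters": {"sampleCount": 1},
    }
    return url, json.dumps(body)


url, body = build_vertex_veo_request(
    project_id="my-project",          # placeholder project ID
    location="us-central1",           # placeholder region
    model_id="veo-3.0-generate-001",  # assumed model ID; verify before use
    prompt="A waterfall in a tropical forest.",
)
print(url)
```

The response to such a request is an operation name that must be polled separately, which is part of why the direct route takes more setup than a single API key.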
Option 2: Crazyrouter (Recommended for Simplicity)
Crazyrouter provides Veo3 access through a simple, OpenAI-compatible API. No Google Cloud setup required:
- Sign up at crazyrouter.com
- Get your API key from the dashboard
- Start generating videos immediately
Veo3 API Code Examples
Python: Text to Video
import time

import requests

API_KEY = "your-crazyrouter-key"
BASE_URL = "https://api.crazyrouter.com"

# Step 1: Submit the video generation task
response = requests.post(
    f"{BASE_URL}/v1/videos/generations",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "veo3",
        "prompt": "A developer sitting at a desk, typing on a mechanical keyboard. The camera slowly zooms in. The developer says 'This API is incredible' with genuine excitement. Ambient office sounds in the background.",
        "size": "1920x1080",
        "duration": 6,
    },
)
response.raise_for_status()  # Stop early on HTTP errors (bad key, bad request)
task = response.json()
task_id = task["id"]
print(f"Task submitted: {task_id}")

# Step 2: Poll every 10 seconds until the task completes or fails
while True:
    poll = requests.get(
        f"{BASE_URL}/v1/videos/generations/{task_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    poll.raise_for_status()
    status = poll.json()

    if status["status"] == "completed":
        video_url = status["data"][0]["url"]
        print(f"Video ready: {video_url}")
        break
    elif status["status"] == "failed":
        print(f"Generation failed: {status.get('error')}")
        break

    print(f"Status: {status['status']}... waiting")
    time.sleep(10)
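The polling loop above runs until the task reports `completed` or `failed`, so a stuck task would keep it waiting indefinitely. One way to bound the wait is sketched below. `wait_for_video` is a hypothetical helper, not part of any SDK; it takes the status call as a zero-argument function so the same logic can be exercised without network access, and it assumes the response shape shown in the example above.

```python
import time


def wait_for_video(fetch_status, timeout_s=600, interval_s=10, sleep=time.sleep):
    """Poll fetch_status() until the task completes, fails, or times out.

    fetch_status is any zero-argument callable returning the status dict,
    for example a lambda wrapping the requests.get(...).json() call.
    The timeout counts only the time spent sleeping between polls.
    """
    waited = 0
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "completed":
            return status["data"][0]["url"]
        if state == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")
        if waited >= timeout_s:
            raise TimeoutError(f"No result after {waited} seconds")
        sleep(interval_s)
        waited += interval_s
```

With the earlier example, the call would look like `wait_for_video(lambda: requests.get(url, headers=headers).json())`, where `url` and `headers` are the values already used in the polling request.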
Node.js: Text to Video
const API_KEY = 'your-crazyrouter-key';
const BASE_URL = 'https://api.crazyrouter.com';

async function generateVideo() {
  // Submit generation task
  const response = await fetch(`${BASE_URL}/v1/videos/generations`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'veo3',
      prompt: 'A cat playing piano in a jazz club, dim lighting, audience clapping. The cat meows rhythmically along with the music.',
      size: '1920x1080',
      duration: 6
    })
  });
  const task = await response.json();
  console.log(`Task submitted: ${task.id}`);

  // Poll for completion
  while (true) {
    const status = await fetch(
      `${BASE_URL}/v1/videos/generations/${task.id}`,
      { headers: { 'Authorization': `Bearer ${API_KEY}` } }
    ).then(r => r.json());

    if (status.status === 'completed') {
      console.log(`Video ready: ${status.data[0].url}`);
      return status.data[0].url;
    }
    if (status.status === 'failed') {
      throw new Error(status.error);
    }

    console.log(`Status: ${status.status}...`);
    await new Promise(r => setTimeout(r, 10000));
  }
}

generateVideo().catch(console.error);
cURL
# Submit video generation
curl -X POST https://api.crazyrouter.com/v1/videos/generations \
  -H "Authorization: Bearer your-crazyrouter-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo3",
    "prompt": "Aerial drone shot of a futuristic city at sunset, flying cars, neon lights reflecting off glass buildings. Ambient electronic music plays softly.",
    "size": "1920x1080",
    "duration": 6
  }'

# Check status (replace TASK_ID)
curl https://api.crazyrouter.com/v1/videos/generations/TASK_ID \
  -H "Authorization: Bearer your-crazyrouter-key"
Veo3 Prompt Engineering Tips
Getting the best results from Veo3 requires thoughtful prompting. The four techniques below cover audio, camera, dialogue, and mood:
1. Describe Audio Explicitly
Since Veo3 generates audio natively, include audio descriptions in your prompt:
"A waterfall in a tropical forest. The sound of rushing water grows louder
as the camera approaches. Birds chirping in the background."
2. Specify Camera Movement
"Slow dolly shot moving through a library. Camera tracks left to right
past bookshelves. Soft ambient music."
3. Include Dialogue
Veo3 can generate speech. Include dialogue naturally:
"A chef in a kitchen presents a dish to the camera and says
'Today we're making the perfect risotto' in a warm, confident voice."
4. Set the Mood
"Cinematic, warm color grading. Golden hour lighting.
Shallow depth of field. Film grain texture."
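The four tips can be combined in one prompt. As a rough sketch, the hypothetical helper below (`build_veo_prompt` is an illustrative name, not part of any SDK) joins a scene description with optional camera, audio, dialogue, and style components into a single string suitable for the `prompt` field used in the earlier examples.

```python
def build_veo_prompt(scene, camera=None, audio=None, dialogue=None, style=None):
    """Join the optional prompt components into a single prompt string.

    dialogue, if given, is a (speaker, line) tuple.
    """
    parts = [scene, camera, audio]
    if dialogue:
        speaker, line = dialogue
        parts.append(f"{speaker} says '{line}'")
    parts.append(style)

    sentences = []
    for part in parts:
        text = (part or "").strip()
        if not text:
            continue  # Skip components that were not provided
        # Make sure every component ends as a sentence.
        sentences.append(text if text.endswith((".", "!", "?")) else text + ".")
    return " ".join(sentences)


prompt = build_veo_prompt(
    scene="A chef in a kitchen presents a dish to the camera",
    camera="Slow dolly shot",
    audio="Soft ambient music",
    dialogue=("The chef", "Today we're making the perfect risotto"),
    style="Cinematic, warm color grading",
)
print(prompt)
```

Keeping the components separate makes it easier to vary one element, such as the camera move, while holding the rest of the prompt fixed between generations.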
Veo3 Pricing Comparison
| Provider | Model | Approx. Price (per second) | Resolution | Max Duration | Audio |
|---|---|---|---|---|---|
| Google (Direct) | Veo3 | ~$0.35/sec | 1080p | 8s | ✅ |
| Crazyrouter | Veo3 | ~$0.25/sec | 1080p | 8s | ✅ |
| OpenAI | Sora | ~$0.40/sec | 1080p | 20s | ❌ |
| Kling | Kling 2.0 | ~$0.20/sec | 1080p | 10s | ❌ |
| Luma | Ray 2 | ~$0.15/sec | 720p | 5s | ❌ |
Prices are approximate and may vary based on resolution and duration settings.
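Because the table quotes per-second rates, the cost of a clip is the rate multiplied by its duration. The sketch below works through that arithmetic using the approximate figures from the table; the rates are the table's estimates rather than quoted prices, and `estimate_cost` is an illustrative helper name.

```python
from decimal import Decimal

# Approximate USD-per-second rates copied from the table above.
RATE_PER_SECOND = {
    "Google (Direct)": Decimal("0.35"),
    "Crazyrouter": Decimal("0.25"),
    "OpenAI": Decimal("0.40"),
    "Kling": Decimal("0.20"),
    "Luma": Decimal("0.15"),
}


def estimate_cost(provider, seconds, clips=1):
    """Estimated spend in USD for `clips` videos of `seconds` each."""
    return RATE_PER_SECOND[provider] * seconds * clips


# Example: one hundred 5-second clips from each provider.
for provider in RATE_PER_SECOND:
    print(f"{provider}: ${estimate_cost(provider, 5, clips=100):.2f}")
```

`Decimal` is used instead of floats so that currency amounts do not pick up binary rounding noise.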
Veo3 vs Sora vs Kling: Feature Comparison
| Feature | Veo3 | Sora | Kling 2.0 |
|---|---|---|---|
| Native Audio | ✅ | ❌ | ❌ |
| Max Resolution | 1080p | 1080p | 1080p |
| Max Duration | 8s | 20s | 10s |
| Image-to-Video | ✅ | ✅ | ✅ |
| Character Consistency | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Physics Realism | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Text Rendering | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ |
| API Availability | ✅ | ✅ | ✅ |
| Speed | ~2 min | ~3 min | ~1 min |
Common Use Cases
Marketing & Ads
Generate product demo videos with voiceover without hiring a production team:
prompt = """
Product showcase: A sleek wireless headphone on a marble surface.
Camera orbits slowly around the product.
A narrator says 'Experience sound like never before.
Introducing the AirPods Max 3.'
Soft electronic background music.
"""
Social Media Content
Create short-form video content for TikTok, Instagram Reels, or YouTube Shorts:
prompt = """
Vertical video (9:16). A person unboxing a mystery package.
They open it with excitement and say 'No way!'
Quick cuts, energetic pacing. Upbeat pop music.
"""
Education & Training
Generate explainer videos with narration:
prompt = """
Educational animation showing how neural networks work.
Nodes light up as data flows through layers.
A calm narrator explains 'Each neuron processes information
and passes it to the next layer.'
"""
Frequently Asked Questions
Is Veo3 API free to use?
No, Veo3 is a paid API. Google offers limited free credits for new Vertex AI users. Through Crazyrouter, you can start with a small balance and pay only for what you generate.
Can Veo3 generate videos longer than 8 seconds?
Currently, Veo3's maximum generation length is 8 seconds per request. For longer videos, you can chain multiple generations together or use video-to-video extension techniques.
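As a rough illustration of the chaining approach, the sketch below splits a target duration into clips that respect the 8-second cap and produces the list file that ffmpeg's concat demuxer reads when joining the finished clips. Both function names are hypothetical, the clip file names are placeholders, and whether the API accepts every resulting clip length (for example a short final clip) is an assumption to check with your provider.

```python
def plan_segments(total_seconds, max_clip_seconds=8):
    """Split a target duration into clip lengths no longer than the cap."""
    if total_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    full_clips, remainder = divmod(total_seconds, max_clip_seconds)
    segments = [max_clip_seconds] * full_clips
    if remainder:
        segments.append(remainder)  # Shorter final clip covers the rest
    return segments


def concat_list(paths):
    """Text for an ffmpeg concat-demuxer list file, one clip per line.

    Assumes simple file names without quote characters.
    """
    return "".join(f"file '{path}'\n" for path in paths)


segments = plan_segments(20)
clip_names = [f"clip_{i}.mp4" for i in range(len(segments))]
print(segments)
print(concat_list(clip_names))
# After downloading the clips and saving the list as clips.txt:
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy output.mp4
```

Each segment is still a separate generation, so visual continuity between clips depends on the prompts; using the last frame of one clip as the starting image for the next is one way to keep them consistent.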
Does Veo3 support image-to-video?
Yes, Veo3 supports both text-to-video and image-to-video generation. You can provide a reference image as the starting frame and describe the desired motion and audio.
How does Veo3's audio quality compare to dedicated TTS?
Veo3's audio is impressive for an integrated solution — dialogue is intelligible and sound effects are contextually appropriate. However, for production-quality voiceover, you may still want to use a dedicated TTS service and composite the audio separately.
Can I use Veo3-generated videos commercially?
Usage rights depend on your agreement with the API provider. Google's terms generally allow commercial use of generated content. Check the specific terms of your provider.
Summary
Veo3 delivers end-to-end video generation: video and synchronized audio come out of a single pass. For developers building video-powered applications, that removes the separate audio step from the pipeline.
The easiest way to get started is through Crazyrouter, which provides Veo3 alongside other video models (Sora, Kling, Luma) through a unified API. One key, all models, competitive pricing.


