Gemini API Cost Calculator
Compare Gemini 2.5 Pro, Flash, and Flash-Lite costs with long-context pricing.
Quick Answer
Gemini 2.5 Pro costs $1.25 input / $5 output per million tokens (or $2.50 / $10 above 200K input tokens). Flash is $0.30 / $2.50. Flash-Lite is $0.10 / $0.40. Pro's 1M-token context window lets you put a full document set in the prompt without RAG.
| Model | Best for | Input rate | Output rate | Per call | Monthly |
|---|---|---|---|---|---|
| Gemini 2.5 Pro | 1M context, top reasoning | $1.25/M | $5.00/M | $0.006500 | $65.00 |
| Gemini 2.5 Flash | Fast, multimodal | $0.30/M | $2.50/M | $0.002600 | $26.00 |
| Gemini 2.5 Flash-Lite | Cheapest, classification | $0.10/M | $0.40/M | $0.000520 | $5.20 |
About This Tool
The Gemini API Cost Calculator handles all three Gemini 2.5 tiers — Pro, Flash, and Flash-Lite — including the long-context pricing kink that hits Pro at 200K input tokens. Enter token counts and request volume, and the tool computes per-call and monthly cost across the lineup.
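The calculation the tool performs can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names and dictionary keys are hypothetical, and the workload of 2,000 input tokens, 800 output tokens, and 10,000 calls per month is inferred from the per-call and monthly figures in the table rather than stated on the page. It uses the base rates only and does not handle Pro's long-context tier.

```python
# Base rates in USD per million tokens (input, output), as listed on this page.
# These apply to prompts of up to 200K input tokens.
RATES = {
    "gemini-2.5-pro": (1.25, 5.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}


def per_call_cost(model, input_tokens, output_tokens):
    """Cost of one request in USD at the base rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def monthly_cost(model, input_tokens, output_tokens, calls_per_month):
    """Per-call cost multiplied by the monthly request volume."""
    return per_call_cost(model, input_tokens, output_tokens) * calls_per_month


if __name__ == "__main__":
    # Assumed workload, inferred from the table: 2,000 in, 800 out, 10,000 calls.
    for model in RATES:
        call = per_call_cost(model, 2_000, 800)
        month = monthly_cost(model, 2_000, 800, 10_000)
        print(f"{model}: ${call:.6f} per call, ${month:.2f} per month")
```

With that assumed workload, the Pro row works out to (2,000 × 1.25 + 800 × 5) / 1,000,000 = $0.0065 per call, which matches the table.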
Gemini pricing (April 2026)
All rates are per million tokens. Gemini 2.5 Pro: $1.25 input / $5 output for prompts of up to 200K input tokens. Beyond 200K, both rates double to $2.50 / $10 to reflect the cost of attending over very long contexts. Gemini 2.5 Flash: $0.30 / $2.50 with a 1M-token context window. Gemini 2.5 Flash-Lite: $0.10 / $0.40, among the cheapest production-grade models from any major lab.
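The 200K threshold for Pro can be modeled as a simple rate switch. The sketch below assumes that once a prompt exceeds 200K input tokens the whole request (input and output) is billed at the higher rates, and that a prompt of exactly 200K tokens still gets the base rates. The page does not spell out either detail, so check both against Google's pricing documentation. The function and constant names are hypothetical.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens; above this, Pro rates double


def pro_rates(input_tokens):
    """Return (input_rate, output_rate) in USD per million tokens for Pro.

    Assumption: the tier is chosen by prompt size and applies to the
    entire request, not only to the tokens past the threshold.
    """
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        return 2.50, 10.00
    return 1.25, 5.00


def pro_call_cost(input_tokens, output_tokens):
    """Cost of one Gemini 2.5 Pro request in USD, including the tier switch."""
    in_rate, out_rate = pro_rates(input_tokens)
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for prompt_size in (150_000, 200_000, 250_000):
        print(prompt_size, round(pro_call_cost(prompt_size, 1_000), 4))
```

Under these assumptions a prompt that crosses the threshold by one token roughly doubles the bill for that call, so trimming a prompt from just over 200K to just under it is worth checking.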
The 1M-token context window
Pro's defining feature is its 1 million token context window. That fits roughly 1,500 pages of text or 50 hours of audio transcript. For document Q&A and codebase analysis, this lets you skip RAG entirely and put the full corpus in the prompt, at the cost of higher per-call spend and higher latency. The economics break even with RAG around 50K-100K total tokens; below that, stuffing wins.
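One way to reason about that break-even is to compare monthly input spend for the two approaches. The sketch below is a simplified model under stated assumptions, not the method behind the figure above: stuffing sends the whole corpus on every call, while RAG sends only a retrieved slice but carries a fixed monthly infrastructure cost (embeddings, vector store, maintenance). Output tokens are ignored because they are the same either way, and the long-context tier is not modeled. All function names and the sample values are hypothetical placeholders.

```python
def stuffing_monthly_cost(corpus_tokens, calls, in_rate_per_m):
    """Monthly input cost in USD when the full corpus is sent on every call."""
    return corpus_tokens * calls * in_rate_per_m / 1_000_000


def rag_monthly_cost(retrieved_tokens, calls, in_rate_per_m, infra_usd):
    """Monthly input cost in USD for RAG: retrieved slice plus fixed infra cost."""
    return retrieved_tokens * calls * in_rate_per_m / 1_000_000 + infra_usd


def break_even_corpus_tokens(retrieved_tokens, calls, in_rate_per_m, infra_usd):
    """Corpus size at which stuffing and RAG cost the same per month."""
    return retrieved_tokens + infra_usd * 1_000_000 / (calls * in_rate_per_m)


if __name__ == "__main__":
    # Placeholder inputs: replace with your own workload and infra estimate.
    size = break_even_corpus_tokens(
        retrieved_tokens=4_000, calls=1_000, in_rate_per_m=1.25, infra_usd=100.0
    )
    print(f"Stuffing is cheaper below about {size:,.0f} corpus tokens")
```

The break-even point moves with call volume: the more calls per month, the smaller the corpus at which RAG's fixed cost pays for itself.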
When to pick which Gemini
Pro for hard reasoning, long documents, and multimodal work. Flash for everyday chat and structured extraction at scale — it's often the best price-per-performance option in the market. Flash-Lite for ultra-high-volume classification, routing, and lightweight agents where the cheapest token is the right token.
Compare against alternatives: GPT API cost calculator, Claude API cost calculator, and the LLM cost comparison. For long-context budgeting, see the context window calculator. Estimate token counts with the token counter.
Frequently Asked Questions
How does Gemini 2.5 Pro long-context pricing work?
Which Gemini model is cheapest?
How does Gemini compare to GPT-4o on cost?
What's special about Gemini's 1M context window?
Does Google offer Gemini context caching?