Claude API Cost Calculator
Compare Opus 4.7, Sonnet 4.6, and Haiku 4.5 with prompt caching savings.
Quick Answer
Claude Opus 4.7 is $15 input / $75 output per 1M tokens. Sonnet 4.6 is $3 / $15. Haiku 4.5 is $1 / $5. Prompt caching charges a 1.25x premium on cache writes but drops cache reads to 10% of the base input rate, saving 60-90% on stable prefixes that repeat across requests.
| Model | Pricing | Per call | Monthly |
|---|---|---|---|
| Claude Opus 4.7 | $15/M in · $75/M out | $0.0900 | $900.00 |
| Claude Sonnet 4.6 | $3/M in · $15/M out | $0.0180 | $180.00 |
| Claude Haiku 4.5 | $1/M in · $5/M out | $0.0060 | $60.00 |
About This Tool
The Claude API Cost Calculator handles the three current Anthropic tiers — Opus 4.7, Sonnet 4.6, and Haiku 4.5 — with full support for prompt caching math. Enter input and output tokens per call, optional cached prefix size, the number of times that cache gets reused before refresh, and a monthly request volume. The calculator outputs per-call and monthly cost for each model.
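The arithmetic behind the table is straightforward. Here is a minimal sketch of it, assuming the illustrative defaults the table appears to use (1,000 input tokens, 1,000 output tokens, 10,000 requests per month; the `PRICES` dict and function names are this sketch's own, not part of any API):

```python
# Per-call and monthly cost for each tier. Prices are USD per million tokens.
PRICES = {
    "Claude Opus 4.7":   {"in": 15.0, "out": 75.0},
    "Claude Sonnet 4.6": {"in": 3.0,  "out": 15.0},
    "Claude Haiku 4.5":  {"in": 1.0,  "out": 5.0},
}

def per_call_cost(model, input_tokens, output_tokens):
    p = PRICES[model]
    return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

def monthly_cost(model, input_tokens, output_tokens, requests_per_month):
    return per_call_cost(model, input_tokens, output_tokens) * requests_per_month

for model in PRICES:
    call = per_call_cost(model, 1_000, 1_000)
    month = monthly_cost(model, 1_000, 1_000, 10_000)
    print(f"{model}: ${call:.4f}/call  ${month:.2f}/month")
```

With those defaults, Opus comes out at $0.0900 per call and $900.00 per month, matching the table.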
Anthropic pricing (April 2026)
Opus 4.7: $15 input / $75 output per million tokens. Sonnet 4.6: $3 / $15. Haiku 4.5: $1 / $5. The 5x output multiplier is consistent across the lineup. Sonnet sits in a sweet spot: five times cheaper than Opus on input while giving up very little on most evaluation benchmarks for non-frontier tasks.
Prompt caching mechanics
Cache writes cost 1.25x the base input rate. Cache reads cost 0.1x, a flat 90% discount. If a 10,000-token system prompt is reused across 100 turns, you pay the 1.25x write once and the 0.1x read ninety-nine times. On Opus, the input cost of that prefix collapses from $0.15 per call uncached to $0.015 per call after the first (the initial write costs $0.1875).
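That arithmetic can be checked directly. A minimal sketch, assuming the example's own numbers (10,000-token prefix, Opus base input rate, 100 turns):

```python
# Effective cost of a 10,000-token cached prefix on Opus ($15/M base input)
# over 100 turns: one 1.25x cache write, then ninety-nine 0.1x cache reads.
BASE = 15.0 / 1_000_000   # Opus input price per token
PREFIX = 10_000           # cached system-prompt tokens

uncached = PREFIX * BASE          # full-price cost of the prefix, every call
write    = PREFIX * BASE * 1.25   # one-time cache-write cost
read     = PREFIX * BASE * 0.10   # cost of each subsequent cache hit

avg = (write + 99 * read) / 100   # blended per-call cost over 100 turns
print(f"uncached ${uncached:.4f}, read ${read:.4f}, 100-turn avg ${avg:.6f}")
```

The blended cost over 100 turns lands around $0.0167 per call, roughly a 89% saving on that prefix versus paying $0.15 every time.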
Caching breaks even after a single cache read: the write adds 0.25x of the base rate, each read saves 0.9x, so one reuse already comes out ahead (1.25 + 0.1 = 1.35 versus 2.0 for two uncached passes). Every additional reuse compounds the savings. The default cache TTL is 5 minutes; an extended 1-hour cache is available for batch workloads. A cache hit requires byte-identical prefixes; even a single-character change invalidates the entry.
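The break-even can be verified in units of the prefix's base input cost (a sketch; `n_reads` counts cache hits after the initial write, and the multipliers are the 1.25x write / 0.1x read rates described above):

```python
# Total cost of serving a prefix (1 + n_reads) times, in units of its
# base input cost. Cached: one 1.25x write plus n reads at 0.1x.
# Uncached: every pass pays full price.
def cached(n_reads):
    return 1.25 + 0.10 * n_reads

def uncached(n_reads):
    return 1.0 + 1.0 * n_reads

for n in range(4):
    winner = "cached" if cached(n) < uncached(n) else "uncached"
    print(f"{n} reads: cached {cached(n):.2f} vs uncached {uncached(n):.2f} -> {winner}")
```

With zero reads the write premium loses (1.25 vs 1.0); with one read the cached path already wins (1.35 vs 2.0).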
When to skip caching
Skip it if your prompt changes on every call, since nothing will ever hit the cache. Skip it for low-volume workloads where the 1.25x write premium is never recouped before the TTL expires. And note that very short prompts can't be cached at all: Anthropic enforces a minimum cacheable prefix length (on the order of 1,024 tokens for most models), so small contexts see no benefit.
Compare pricing against alternatives: GPT API cost calculator, Gemini API cost calculator, and the side-by-side LLM cost comparison. For deeper cache modeling, use the prompt caching savings calculator. To estimate raw token counts from text, use the token counter.
Frequently Asked Questions
How does Claude prompt caching change cost?
Which Claude model should I default to?
When does cached input pay off?
How does Claude pricing compare to GPT-4o?
Are output tokens really 5x more than input?
You might also like
- Whisper API Cost Calculator: OpenAI Whisper at $0.006/min plus Deepgram and AssemblyAI comparison.
- JSON to YAML Converter: Convert between JSON and YAML formats with real-time preview.
- Gemini API Cost Calculator: Gemini 2.5 Pro, Flash, Flash-Lite cost with 1M-token long-context tier.