Claude API Cost Calculator

Compare Opus 4.7, Sonnet 4.6, and Haiku 4.5 with prompt caching savings.

Quick Answer

Claude Opus 4.7 is $15 / $75 per 1M tokens (input / output). Sonnet 4.6 is $3 / $15. Haiku 4.5 is $1 / $5. Prompt caching adds a 1.25x premium on cache writes but drops cache reads to 10% of the base input rate, saving 60-90% on stable prefixes that repeat across requests.


Model               Pricing                 Per call   Monthly
Claude Opus 4.7     $15/M in · $75/M out    $0.0900    $900.00
Claude Sonnet 4.6   $3/M in · $15/M out     $0.0180    $180.00
Claude Haiku 4.5    $1/M in · $5/M out      $0.0060    $60.00

About This Tool

The Claude API Cost Calculator handles the three current Anthropic tiers — Opus 4.7, Sonnet 4.6, and Haiku 4.5 — with full support for prompt caching math. Enter input and output tokens per call, optional cached prefix size, the number of times that cache gets reused before refresh, and a monthly request volume. The calculator outputs per-call and monthly cost for each model.
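A minimal sketch of that per-call and monthly math in Python (the dict keys and function names are mine; the rates are the April 2026 figures quoted below):

```python
# Published per-million-token rates (April 2026); dict keys are my own labels.
PRICES = {
    "opus-4.7":   {"in": 15.0, "out": 75.0},
    "sonnet-4.6": {"in": 3.0,  "out": 15.0},
    "haiku-4.5":  {"in": 1.0,  "out": 5.0},
}

def per_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Uncached dollar cost of one API call."""
    p = PRICES[model]
    return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 calls: int) -> float:
    """Dollar cost of `calls` identical requests per month."""
    return per_call_cost(model, input_tokens, output_tokens) * calls
```

For example, a call with 3,000 input and 600 output tokens comes to $0.09 on Opus and $0.018 on Sonnet, reflecting the flat 5x price gap between the two tiers.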

Anthropic pricing (April 2026)

Opus 4.7: $15 input / $75 output per million tokens. Sonnet 4.6: $3 / $15. Haiku 4.5: $1 / $5. The 5x output multiplier is consistent across the lineup. Sonnet sits in a sweet spot: five times cheaper than Opus on input, while giving up little on most evaluation benchmarks for non-frontier tasks.

Prompt caching mechanics

Cache writes cost 1.25x the base input rate. Cache reads cost 0.1x, a flat 90% discount. If a 10,000-token system prompt is reused across 100 follow-up turns, you pay 1.25x once and 0.1x ninety-nine times. On Opus, the cost of that prefix collapses from $0.15 per call at full price to $0.015 per cache read, about $0.017 per call averaged across all 100 turns.
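The prefix arithmetic above can be written out directly (a sketch; the multipliers are the ones quoted in this section, the function name is mine):

```python
CACHE_WRITE_MULT = 1.25  # premium on the first, cache-creating call
CACHE_READ_MULT = 0.10   # rate paid on every subsequent cache hit

def cached_prefix_cost(prefix_tokens: int, base_in_per_m: float,
                       reads: int) -> float:
    """Total dollar cost of a cached prefix: one write plus `reads` cache hits."""
    per_pass = prefix_tokens * base_in_per_m / 1_000_000
    return per_pass * CACHE_WRITE_MULT + reads * per_pass * CACHE_READ_MULT
```

A 10,000-token prompt on Opus ($15/M input) with 99 cache reads costs about $1.67 in total, versus $15.00 for 100 uncached passes over the same prefix.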

Caching pays for itself after a single cache read: each read saves 0.9x of the base input rate, more than covering the 0.25x write premium, and every additional reuse compounds the savings. The default cache TTL is 5 minutes; an extended 1-hour cache is available for batch workloads. A cache hit requires a byte-identical prefix; even a single-character change invalidates the entry.
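One way to see the break-even point, under the same 1.25x / 0.1x multipliers: express the cached cost of a prefix as a fraction of its uncached cost.

```python
def cache_net_multiplier(reads: int, write_mult: float = 1.25,
                         read_mult: float = 0.10) -> float:
    """Cached cost divided by uncached cost, for one cache write plus `reads` hits."""
    return (write_mult + read_mult * reads) / (1 + reads)
```

`cache_net_multiplier(1)` is 0.675, so even a single reuse cuts the prefix bill by about a third; with no reuse at all the ratio is 1.25, a pure loss. At 99 reads the ratio falls to roughly 0.11, approaching the 90% discount floor.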

When to skip caching

Skip caching if your prompt changes on every call: you pay the 1.25x write premium and never collect a discounted read. Skip for low-volume workloads where reuse within the cache TTL is unlikely. And skip when prompts fall under the ~1,024-token caching minimum; shorter prefixes can't be cached, and near the minimum the absolute savings are small anyway.
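Those rules reduce to a simple net-savings check (a sketch; the 1,024-token minimum is the approximate figure from above, and the function name is mine):

```python
MIN_CACHEABLE_TOKENS = 1024  # approximate minimum cacheable prefix size

def cache_savings_usd(prefix_tokens: int, base_in_per_m: float,
                      reads: int) -> float:
    """Dollars saved vs. resending the prefix uncached; negative means caching loses."""
    if prefix_tokens < MIN_CACHEABLE_TOKENS:
        return 0.0  # prefix too small to cache at all
    per_pass = prefix_tokens * base_in_per_m / 1_000_000
    uncached = (1 + reads) * per_pass
    cached = (1.25 + 0.10 * reads) * per_pass
    return uncached - cached
```

With `reads=0` (the prompt changed before any reuse) the result is negative: the write premium was paid for nothing.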

Compare pricing against alternatives: GPT API cost calculator, Gemini API cost calculator, and the side-by-side LLM cost comparison. For deeper cache modeling, use the prompt caching savings calculator. To estimate raw token counts from text, use the token counter.

Frequently Asked Questions

How does Claude prompt caching change cost?
Cache writes cost 1.25x the base input price (a one-time premium). Cache reads cost 0.1x the base, a 90% discount. If a 10K-token system prompt is reused 100 times, you write once at 1.25x and read 99 times at 0.1x. That cuts the bill on that prompt by roughly 89% versus paying full price each call.
Which Claude model should I default to?
Sonnet 4.6 at $3 input / $15 output. It handles 90% of production workloads at a fraction of Opus pricing. Reach for Opus 4.7 ($15/$75) when you need the strongest reasoning or coding. Drop to Haiku 4.5 ($1/$5) for high-volume classification and extraction.
When does cached input pay off?
When the same prefix repeats across many requests. Common cases: stable system prompts, retrieved documents reused across follow-up turns, few-shot examples. A single cache read already saves more (0.9x) than the write premium costs (0.25x), and every reuse after that compounds the savings. Cache TTL defaults to 5 minutes; an extended 1-hour cache is also available.
How does Claude pricing compare to GPT-4o?
Claude Sonnet 4.6 ($3/$15) is more expensive than GPT-4o ($2.50/$10) on a per-token basis. Sonnet typically wins on coding, structured output, and long-context tasks. Haiku 4.5 ($1/$5) is more expensive than GPT-4o-mini ($0.15/$0.60), but tends to outperform on instruction following.
Are output tokens really 5x more than input?
Yes. Across all Claude tiers, output costs 5x input. Caching never touches output tokens, so the two levers are complementary: cache stable prefixes to cut input cost, and keep responses concise (for instance by setting a sensible max_tokens) to control the 5x-priced output side.