GPT API Cost Calculator

Enter token counts and request volume to compare GPT-4o, mini, and turbo costs.

Quick Answer

GPT-4o is $2.50 input / $10 output per 1M tokens. GPT-4o-mini is $0.15 / $0.60 — roughly 17x cheaper. GPT-4-turbo is $10 / $30 (legacy). Switch to mini for high-volume work unless you specifically need flagship reasoning.

Example figures assume 1,000 input tokens, 500 output tokens, and 10,000 requests/month.

| Model | Notes | Input | Output | Per call | Monthly |
|---|---|---|---|---|---|
| GPT-4o | Flagship, multimodal | $2.50/M | $10.00/M | $0.0075 | $75.00 |
| GPT-4o-mini | Cheap, fast, capable | $0.15/M | $0.60/M | $0.00045 | $4.50 |
| GPT-4-turbo | Legacy, larger context | $10.00/M | $30.00/M | $0.0250 | $250.00 |
| GPT-3.5-turbo | Cheapest legacy option | $0.50/M | $1.50/M | $0.00125 | $12.50 |

About This Tool

The GPT API Cost Calculator gives you side-by-side cost comparisons across OpenAI's production models. Enter the number of input tokens (your prompt plus system message), output tokens (the model's response), and your expected monthly request volume. The tool computes per-call cost and projected monthly spend for GPT-4o, GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo.

How OpenAI bills

OpenAI prices in dollars per million tokens, with separate rates for input and output. As of April 2026, GPT-4o is $2.50 input / $10 output. GPT-4o-mini sits at $0.15 / $0.60 — the workhorse for high-volume tasks. GPT-4-turbo (now legacy) is $10 / $30, and GPT-3.5-turbo, still kept for backward compatibility, runs $0.50 / $1.50.
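The billing math above can be sketched as a small function. The rates are the April 2026 figures quoted in this article; verify against OpenAI's current pricing page before relying on them:

```python
# Per-million-token rates in USD, as quoted above (April 2026).
PRICES = {
    "gpt-4o":        {"input": 2.50,  "output": 10.00},
    "gpt-4o-mini":   {"input": 0.15,  "output": 0.60},
    "gpt-4-turbo":   {"input": 10.00, "output": 30.00},
    "gpt-3.5-turbo": {"input": 0.50,  "output": 1.50},
}

def per_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single API call at the table rates above."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + \
           (output_tokens / 1_000_000) * p["output"]

# 1,000 input + 500 output tokens on GPT-4o:
print(per_call_cost("gpt-4o", 1000, 500))  # ≈ 0.0075
```

Swapping the model string is all it takes to compare: the same call on `gpt-4o-mini` comes out to roughly $0.00045.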

When to pick which model

Default to GPT-4o-mini. It handles classification, extraction, summarization, and most chat workloads at quality close to GPT-4o for a fraction of the price. Reach for GPT-4o when you need the strongest reasoning, multimodal vision, or reliable function calling. Skip GPT-4-turbo unless you have legacy code paths — GPT-4o is faster, cheaper, and higher quality.

Cost-reduction levers

Three levers cut bills the most. Prompt caching: OpenAI's automatic cache drops cached input to 50% of base. Batch API: 50% off both input and output for jobs you can wait 24 hours on. Output truncation: cap max_tokens aggressively — output tokens cost 4x input, and unbounded responses sneak up fast.

See the Claude API cost calculator, Gemini API cost calculator, and side-by-side LLM cost comparison to evaluate alternatives. Use the token counter to estimate input size from raw text. For volume-based budgeting, try the AI monthly budget calculator.

Frequently Asked Questions

How is GPT API cost calculated?
OpenAI bills per million tokens, with separate input and output rates. A request with 1000 input tokens and 500 output tokens on GPT-4o costs (1000/1M × $2.50) + (500/1M × $10) = $0.0075 per call.
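That arithmetic, written out:

```python
# GPT-4o example from above: 1,000 input + 500 output tokens.
cost = (1000 / 1_000_000) * 2.50 + (500 / 1_000_000) * 10.00
print(f"${cost:.4f}")  # → $0.0075
```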
What's the cheapest GPT model?
GPT-4o-mini at $0.15 input and $0.60 output per million tokens. It's roughly 17x cheaper than GPT-4o on input and handles most tasks acceptably. Use it for high-volume classification, extraction, and chat.
Why are output tokens more expensive than input?
Generation is more compute-intensive than ingestion. Each output token requires a full forward pass; input tokens are processed in parallel. OpenAI's 4x output multiplier reflects that GPU cost asymmetry.
Does prompt caching reduce GPT cost?
Yes. OpenAI offers automatic prompt caching that drops cached input to 50% of base price. Anthropic Claude offers deeper discounts (90% off cached reads). Cache requires identical prefixes of at least 1024 tokens.
How do I estimate monthly cost?
Multiply per-call cost by request volume. Example: 50K requests/month at $0.0075 each = $375/month. Add 20% buffer for retries and longer-than-expected outputs.
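The same estimate, with the 20% buffer applied:

```python
per_call = 0.0075           # GPT-4o, 1,000 in / 500 out (example above)
requests_per_month = 50_000
buffer = 1.20               # 20% headroom for retries and long outputs

monthly = per_call * requests_per_month
print(f"base ${monthly:.2f}, budgeted ${monthly * buffer:.2f}")
# → base $375.00, budgeted $450.00
```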