GPT API Cost Calculator
Enter token counts and request volume to compare costs for GPT-4o, GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo.
Quick Answer
GPT-4o costs $2.50 input / $10 output per 1M tokens. GPT-4o-mini costs $0.15 / $0.60, roughly 17x cheaper on both input and output. GPT-4-turbo costs $10 / $30 (legacy). Switch to GPT-4o-mini for high-volume work unless you specifically need flagship reasoning.
| Model | Notes | Input | Output | Per call | Monthly |
|---|---|---|---|---|---|
| GPT-4o | Flagship, multimodal | $2.50/M | $10.00/M | $0.007500 | $75.00 |
| GPT-4o-mini | Cheap, fast, capable | $0.15/M | $0.60/M | $0.000450 | $4.50 |
| GPT-4-turbo | Legacy, larger context | $10.00/M | $30.00/M | $0.025000 | $250.00 |
| GPT-3.5-turbo | Cheapest legacy option | $0.50/M | $1.50/M | $0.001250 | $12.50 |
About This Tool
The GPT API Cost Calculator gives you side-by-side cost comparisons across OpenAI's production models. Enter the number of input tokens (your prompt plus system message), output tokens (the model's response), and your expected monthly request volume. The tool computes per-call cost and projected monthly spend for GPT-4o, GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo.
How OpenAI bills
OpenAI prices in dollars per million tokens, with separate rates for input and output. As of April 2026, GPT-4o is $2.50 input / $10 output per 1M tokens. GPT-4o-mini sits at $0.15 / $0.60 and is the workhorse for high-volume tasks. GPT-4-turbo (now legacy) is $10 / $30, and GPT-3.5-turbo, still kept for backward compatibility, runs $0.50 / $1.50.
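The billing rule above comes down to one line of arithmetic per request. The following is a minimal Python sketch, not the calculator's actual source: the function names and the dictionary are hypothetical, the rates are copied from the text, and the sample workload (1,000 input tokens, 500 output tokens, 10,000 requests per month) is an assumption chosen because it is consistent with the per-call and monthly figures in the table.

```python
# Rates in USD per 1M tokens as (input, output), copied from the text above.
PRICES_PER_MILLION = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4-turbo": (10.00, 30.00),
    "gpt-3.5-turbo": (0.50, 1.50),
}


def cost_per_call(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request; input and output are billed at separate rates."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 requests_per_month: int) -> float:
    """Projected monthly spend, assuming every request has the same token counts."""
    return cost_per_call(model, input_tokens, output_tokens) * requests_per_month


# Assumed workload: 1,000 input tokens, 500 output tokens, 10,000 requests/month.
for model in PRICES_PER_MILLION:
    per_call = cost_per_call(model, 1_000, 500)
    per_month = monthly_cost(model, 1_000, 500, 10_000)
    print(f"{model:<14} ${per_call:.6f} per call   ${per_month:,.2f} per month")
```

Real traffic rarely has uniform token counts, so treat the monthly figure as an estimate built from an average request.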
When to pick which model
Default to GPT-4o-mini. It handles classification, extraction, summarization, and most chat workloads at quality close to GPT-4o for a fraction of the price. Reach for GPT-4o when you need the strongest reasoning, multimodal vision, or more reliable function calling. Skip GPT-4-turbo unless you have legacy code paths: GPT-4o is faster, cheaper, and higher quality.
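One way to put a number on "default to mini, reach for GPT-4o when needed" is to estimate the blended bill when only a share of requests goes to the flagship model. The sketch below is illustrative only: `blended_monthly_cost` and `flagship_share` are hypothetical names, the rates are the ones quoted on this page, and it assumes both models see the same token counts per request.

```python
# Rates in USD per 1M tokens as (input, output), taken from this page.
GPT_4O = (2.50, 10.00)
GPT_4O_MINI = (0.15, 0.60)


def call_cost(rates: tuple, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the given (input, output) rates."""
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def blended_monthly_cost(flagship_share: float, input_tokens: int,
                         output_tokens: int, requests_per_month: int) -> float:
    """Monthly spend when `flagship_share` of requests use GPT-4o, the rest mini."""
    if not 0.0 <= flagship_share <= 1.0:
        raise ValueError("flagship_share must be between 0 and 1")
    flagship_requests = requests_per_month * flagship_share
    mini_requests = requests_per_month - flagship_requests
    return (flagship_requests * call_cost(GPT_4O, input_tokens, output_tokens)
            + mini_requests * call_cost(GPT_4O_MINI, input_tokens, output_tokens))


# Assumed workload: 1,000 input / 500 output tokens, 10,000 requests per month.
for share in (0.0, 0.1, 0.5, 1.0):
    total = blended_monthly_cost(share, 1_000, 500, 10_000)
    print(f"{share:>4.0%} on GPT-4o: ${total:,.2f} per month")
```

Under these assumptions, sending 10% of requests to GPT-4o costs about $11.55 per month, against $4.50 for mini only and $75.00 for GPT-4o only.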
Cost-reduction levers
Three levers cut bills the most. Prompt caching: OpenAI's automatic cache bills cached input tokens at 50% of the base input rate. Batch API: 50% off both input and output for jobs that can wait up to 24 hours. Output truncation: cap max_tokens aggressively. Output tokens cost 4x input on GPT-4o and GPT-4o-mini (3x on the legacy models), and unbounded responses add up fast.
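The effect of each lever can be estimated with the same per-token arithmetic. This is a rough Python sketch using the GPT-4o rates and the 50% figures quoted above. The function names are hypothetical, and each lever is modeled on its own, since this page does not say whether the caching and batch discounts stack.

```python
# GPT-4o rates in USD per 1M tokens, as quoted on this page.
INPUT_RATE = 2.50
OUTPUT_RATE = 10.00
CACHED_INPUT_MULTIPLIER = 0.5  # cached input billed at 50% of the base rate
BATCH_MULTIPLIER = 0.5         # Batch API: 50% off input and output


def cost_with_caching(input_tokens: int, cached_tokens: int,
                      output_tokens: int) -> float:
    """Per-call cost when `cached_tokens` of the input are served from cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    billable_input = uncached_tokens + cached_tokens * CACHED_INPUT_MULTIPLIER
    return (billable_input * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000


def cost_with_batch(input_tokens: int, output_tokens: int) -> float:
    """Per-call cost when the request is submitted through the Batch API."""
    full_price = (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000
    return full_price * BATCH_MULTIPLIER


def worst_case_output_cost(max_tokens: int) -> float:
    """Upper bound on output cost per call once max_tokens caps the response."""
    return max_tokens * OUTPUT_RATE / 1_000_000


# Assumed request: 1,000 input tokens (800 of them cached), 500 output tokens.
print(f"with caching:     ${cost_with_caching(1_000, 800, 500):.6f}")
print(f"with Batch API:   ${cost_with_batch(1_000, 500):.6f}")
print(f"output cap (300): ${worst_case_output_cost(300):.6f}")
```

In this example the uncapped, undiscounted call costs $0.0075. Caching trims only the input side, so its payoff grows with the share of the prompt that repeats across requests.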
See the Claude API cost calculator, Gemini API cost calculator, and side-by-side LLM cost comparison to evaluate alternatives. Use the token counter to estimate input size from raw text. For volume-based budgeting, try the AI monthly budget calculator.