GPT API Cost Calculator

Enter token counts and request volume to compare GPT-4o, mini, and turbo costs.

Quick Answer

GPT-4o is $2.50 input / $10 output per 1M tokens. GPT-4o-mini is $0.15 / $0.60 — roughly 17x cheaper. GPT-4-turbo is $10 / $30 (legacy). Switch to mini for high-volume work unless you specifically need flagship reasoning.

Example figures assume 1,000 input tokens, 500 output tokens, and 10,000 requests/month.

| Model | Notes | Input | Output | Per call | Monthly |
|---|---|---|---|---|---|
| GPT-4o | Flagship, multimodal | $2.50/M | $10.00/M | $0.0075 | $75.00 |
| GPT-4o-mini | Cheap, fast, capable | $0.15/M | $0.60/M | $0.00045 | $4.50 |
| GPT-4-turbo | Legacy, larger context | $10.00/M | $30.00/M | $0.0250 | $250.00 |
| GPT-3.5-turbo | Cheapest legacy option | $0.50/M | $1.50/M | $0.00125 | $12.50 |

About This Tool

The GPT API Cost Calculator gives you side-by-side cost comparisons across OpenAI's production models. Enter the number of input tokens (your prompt plus system message), output tokens (the model's response), and your expected monthly request volume. The tool computes per-call cost and projected monthly spend for GPT-4o, GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo.

How OpenAI bills

OpenAI prices in dollars per million tokens, with separate rates for input and output. As of April 2026, GPT-4o is $2.50 input / $10 output. GPT-4o-mini sits at $0.15 / $0.60 — the workhorse for high-volume tasks. GPT-4-turbo (now legacy) is $10 / $30, and GPT-3.5-turbo, still kept for backward compatibility, runs $0.50 / $1.50.
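The billing math above can be sketched as a small function. The rates are the April 2026 figures quoted in this article; verify against OpenAI's current pricing page before relying on them:

```python
# Per-million-token rates in USD, as quoted above (April 2026).
PRICES = {
    "gpt-4o":        {"input": 2.50,  "output": 10.00},
    "gpt-4o-mini":   {"input": 0.15,  "output": 0.60},
    "gpt-4-turbo":   {"input": 10.00, "output": 30.00},
    "gpt-3.5-turbo": {"input": 0.50,  "output": 1.50},
}

def per_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single API call at the table rates above."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + \
           (output_tokens / 1_000_000) * p["output"]

# 1,000 input + 500 output tokens on GPT-4o:
print(per_call_cost("gpt-4o", 1000, 500))  # ≈ 0.0075
```

Swapping the model string is all it takes to compare: the same call on `gpt-4o-mini` comes out to roughly $0.00045.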

When to pick which model

Default to GPT-4o-mini. It handles classification, extraction, summarization, and most chat workloads at quality close to GPT-4o for a fraction of the price. Reach for GPT-4o when you need the strongest reasoning, multimodal vision, or reliable function calling. Skip GPT-4-turbo unless you have legacy code paths — GPT-4o is faster, cheaper, and higher quality.

Cost-reduction levers

Three levers cut bills the most. Prompt caching: OpenAI's automatic cache drops cached input to 50% of base. Batch API: 50% off both input and output for jobs you can wait 24 hours on. Output truncation: cap max_tokens aggressively — output tokens cost 4x input, and unbounded responses sneak up fast.

See the Claude API cost calculator, Gemini API cost calculator, and side-by-side LLM cost comparison to evaluate alternatives. Use the token counter to estimate input size from raw text. For volume-based budgeting, try the AI monthly budget calculator.

Frequently Asked Questions

How is GPT API cost calculated?
OpenAI bills per million tokens, with separate input and output rates. A request with 1000 input tokens and 500 output tokens on GPT-4o costs (1000/1M × $2.50) + (500/1M × $10) = $0.0075 per call.
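That arithmetic, written out:

```python
# GPT-4o example from above: 1,000 input + 500 output tokens.
cost = (1000 / 1_000_000) * 2.50 + (500 / 1_000_000) * 10.00
print(f"${cost:.4f}")  # → $0.0075
```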
What's the cheapest GPT model?
GPT-4o-mini at $0.15 input and $0.60 output per million tokens. It's roughly 17x cheaper than GPT-4o on input and handles most tasks acceptably. Use it for high-volume classification, extraction, and chat.
Why are output tokens more expensive than input?
Generation is more compute-intensive than ingestion. Each output token requires a full forward pass; input tokens are processed in parallel. OpenAI's 4x output multiplier reflects that GPU cost asymmetry.
Does prompt caching reduce GPT cost?
Yes. OpenAI offers automatic prompt caching that drops cached input to 50% of base price. Anthropic Claude offers deeper discounts (90% off cached reads). Cache requires identical prefixes of at least 1024 tokens.
How do I estimate monthly cost?
Multiply per-call cost by request volume. Example: 50K requests/month at $0.0075 each = $375/month. Add 20% buffer for retries and longer-than-expected outputs.
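The same estimate, with the 20% buffer applied:

```python
per_call = 0.0075           # GPT-4o, 1,000 in / 500 out (example above)
requests_per_month = 50_000
buffer = 1.20               # 20% headroom for retries and long outputs

monthly = per_call * requests_per_month
print(f"base ${monthly:.2f}, budgeted ${monthly * buffer:.2f}")
# → base $375.00, budgeted $450.00
```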