AI Fine-Tuning Cost Calculator

Training cost plus monthly inference markup for OpenAI fine-tunes.

Quick Answer

GPT-4o-mini training: $3/M tokens. GPT-4o: $25/M. Inference markup is 2x base for mini, 1.5x for GPT-4o. A 500K-token fine-tune on mini costs $1.50 to train, then $0.30/$1.20 per 1M tokens served (vs $0.15/$0.60 base).

Model | Training | FT inference / mo | Base / mo | Year 1 total
GPT-4o-mini fine-tune ($3/M training) | $1.50 | $5.40 | $2.70 | $66.30
GPT-4o fine-tune ($25/M training) | $12.50 | $67.50 | $45.00 | $822.50
GPT-3.5-turbo fine-tune ($8/M training) | $4.00 | $42.00 | $8.00 | $508.00

Example workload: 500K training tokens, then 10M input + 2M output tokens per month. Year 1 total = training fee + 12 months of fine-tune inference.

About This Tool

The AI Fine-Tuning Cost Calculator separates training cost from the long-tail inference markup that fine-tunes carry forever. Enter your training corpus size and monthly inference volume, and the tool computes both the one-time training fee and the recurring 1.5-2x inference premium versus running the base model.
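The calculator's core math can be sketched in a few lines of Python. Rates default to the GPT-4o-mini figures quoted below (all $/1M tokens); the token volumes are whatever you pass in.

```python
# Sketch of the calculator's core math. Default rates are GPT-4o-mini's,
# in $/1M tokens; all token arguments are in millions.
def fine_tune_tco(train_tokens_m, in_tokens_m, out_tokens_m, months=12,
                  train_rate=3.00,                # one-time training, $/M
                  ft_in=0.30, ft_out=1.20,        # fine-tune inference, $/M
                  base_in=0.15, base_out=0.60):   # base-model inference, $/M
    """One-time training fee plus the recurring inference premium vs base."""
    training = train_tokens_m * train_rate
    ft_monthly = in_tokens_m * ft_in + out_tokens_m * ft_out
    base_monthly = in_tokens_m * base_in + out_tokens_m * base_out
    return {
        "training": training,
        "ft_monthly": ft_monthly,
        "base_monthly": base_monthly,
        "premium_monthly": ft_monthly - base_monthly,
        "year_total": training + months * ft_monthly,
    }

# 500K-token fine-tune on GPT-4o-mini, serving 10M input + 2M output tokens/month
print(fine_tune_tco(0.5, 10, 2))
```

With those inputs the function reproduces the table above: $1.50 to train, $5.40/month to serve (vs $2.70 base), $66.30 for year one.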

OpenAI fine-tuning pricing (April 2026)

Training: GPT-4o-mini at $3/M tokens, GPT-4o at $25/M, GPT-3.5-turbo at $8/M. Inference rates jump too: GPT-4o-mini fine-tune costs $0.30 input / $1.20 output (2x base), GPT-4o fine-tune is $3.75 / $15 (1.5x base), GPT-3.5 fine-tune is $3 / $6 (6x and 4x base — the worst markup ratio).

The total-cost-of-ownership trap

Teams often celebrate cheap training cost ($1-$50 for most fine-tunes) and forget the inference markup compounds month over month. A fine-tuned GPT-4o-mini that processes 100M tokens of input and 20M of output per month costs $54 vs $27 base — $27/month extra, $324/year. Make sure the quality lift earns at least that.
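The arithmetic behind that example, spelled out (rates in $/1M tokens, volumes in millions):

```python
# Fine-tuned GPT-4o-mini vs base, at 100M input + 20M output tokens/month
ft_cost = 100 * 0.30 + 20 * 1.20     # $30 input + $24 output  -> ~$54/month
base_cost = 100 * 0.15 + 20 * 0.60   # $15 input + $12 output  -> ~$27/month
extra_per_year = (ft_cost - base_cost) * 12   # ~$324/year premium
print(ft_cost, base_cost, extra_per_year)
```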

When fine-tuning is worth it

High-volume, narrow-task workloads where prompt size dominates cost. If you can compress a 5K-token system prompt and few-shot examples into the model weights, fine-tuning saves on every call. Style consistency (brand voice, tone), structured outputs (custom JSON schemas), and domain vocabulary are the strongest fits.
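A break-even sketch of that trade, using the GPT-4o-mini rates above; the 500-token user message and the $1.50 training fee (500K tokens at $3/M) are illustrative assumptions:

```python
# Calls needed before baking a 5K-token prompt into the weights beats
# resending it on every request. GPT-4o-mini input rates; the 500-token
# user message and $1.50 training fee are illustrative assumptions.
PROMPT_TOKENS, USER_TOKENS = 5_000, 500
BASE_IN, FT_IN = 0.15 / 1e6, 0.30 / 1e6    # $/token, input side
TRAINING_FEE = 1.50                         # 500K-token fine-tune at $3/M

base_per_call = (PROMPT_TOKENS + USER_TOKENS) * BASE_IN  # prompt sent every call
ft_per_call = USER_TOKENS * FT_IN                        # prompt lives in the weights
per_call_saving = base_per_call - ft_per_call
break_even_calls = TRAINING_FEE / per_call_saving
print(round(break_even_calls))   # ~2,200 calls to recoup the training fee
```

Under these assumptions the fine-tune pays for itself after a few thousand calls; after that, every call is cheaper than sending the full prompt, even at the 2x input rate.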

When to skip it

For most teams under 50M tokens/month, prompt engineering plus prompt caching beats fine-tuning on TCO. Caching gives you the 90% discount on repeated prefixes without the training step or the permanent inference premium. Try caching first — see prompt caching savings calculator. Compare against retrieval at RAG vs fine-tune calculator.
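To see why caching often wins, compare the two on the same hypothetical call shape, applying the 90% cached-prefix discount to GPT-4o-mini's $0.15/M input rate (the prefix and suffix sizes are illustrative):

```python
# Prompt caching vs fine-tuning on the same hypothetical call shape:
# a 5K-token shared prefix plus 500 fresh input tokens, GPT-4o-mini rates.
BASE_IN = 0.15 / 1e6            # $/token, base input
CACHED_IN = BASE_IN * 0.10      # 90% discount on cached prefix reads
FT_IN = 0.30 / 1e6              # $/token, fine-tune input (2x base)

cached_call = 5_000 * CACHED_IN + 500 * BASE_IN  # prefix cached, suffix full price
ft_call = 500 * FT_IN                            # prefix baked into the weights
# Both land near $0.00015/call here -- caching matches the fine-tune's
# per-call cost without the training fee or the permanent markup.
print(cached_call, ft_call)
```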

Pair with the GPT cost calculator, LLM cost comparison, and AI monthly budget calculator for full-stack budgeting. Estimate token volume from text with token counter.

Frequently Asked Questions

How much does OpenAI fine-tuning cost?
Training: $3 per million tokens for GPT-4o-mini, $25/M for GPT-4o, $8/M for GPT-3.5. Inference uses bumped rates: GPT-4o-mini fine-tune is $0.30/$1.20 vs $0.15/$0.60 base — a 2x markup. GPT-4o fine-tune is 1.5x base inference rate.
Does Anthropic offer fine-tuning?
Yes, but only on Bedrock (AWS partnership) and via direct enterprise contracts. Pricing is custom but typically 2-3x base inference rate plus a per-job training fee. Most teams use prompt engineering and caching instead — Sonnet 4.6 with a strong system prompt often matches a fine-tune.
When does fine-tuning beat prompt engineering?
When you have 1000+ high-quality examples and a stable use case. Fine-tunes shine on style consistency, structured output formats, and domain language (medical, legal, code). For most applications, retrieval (RAG) or few-shot prompting reaches 90% of fine-tune quality at lower cost.
How many training tokens do I need?
Minimum 10K tokens for measurable effect. Sweet spot is 100K-1M tokens (1000-10000 examples). Above that, returns diminish. A 500K-token training run on GPT-4o-mini costs $1.50 — cheap to experiment with.
Are fine-tune inference rates permanent?
Yes — once trained, every call to your fine-tune costs the bumped rate forever. This is the long-tail cost. A model that processes 100M input tokens per month at $0.30 vs $0.15 base pays $15/month extra — $180/year, every year it runs. Make sure the quality lift justifies it.