LLM Token Counter

Paste any text to see token count, character count, and live cost across GPT, Claude, and Gemini.

Quick Answer

Token counts are estimated with the chars-per-4 rule plus a word-frequency adjustment. Most English prose lands within 5-15% of a real tokenizer; code and non-English text run higher. For billing precision, use the official tokenizer SDK.

About This Tool

The LLM Token Counter estimates how many tokens any block of text uses across the major language models. Tokens are the atomic billing unit for every paid API: OpenAI, Anthropic, and Google all charge per million tokens, with separate rates for input (your prompt) and output (the model response).

For English prose, the rule of thumb is one token per four characters. This tool blends the chars-per-4 method with a word-frequency adjustment, landing within 5-15% of the real tokenizer in most cases. Code, JSON, emojis, and non-English text all tokenize less efficiently — Chinese characters often consume one to two tokens each, and dense JSON with quoted keys can run 30% higher than the estimate.
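The blended estimate described above can be sketched roughly as follows. This is an illustrative re-implementation of the chars-per-4-plus-adjustment idea, not the tool's actual code; the long-word penalty of 0.3 tokens is an assumed tuning value.

```python
import re

def estimate_tokens(text: str) -> int:
    """Rough token estimate: chars-per-4 baseline nudged by word
    frequency. A sketch of the approach, not the tool's real code."""
    if not text:
        return 0
    base = len(text) / 4  # chars-per-4 rule of thumb for English prose
    words = re.findall(r"\w+", text)
    # Common short words usually map to one token; long or rare words
    # often split into several, so add a small penalty per long word.
    long_words = sum(1 for w in words if len(w) > 8)
    return round(base + 0.3 * long_words)

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))
# → 11
```

Against a real tokenizer, expect this to undercount on dense JSON and CJK text for the reasons noted above.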

Why count tokens before sending?

Three reasons. First, cost: a 10K-token system prompt sent on every API call to Claude Opus 4.7 costs $0.15 per call, or $150 per 1000 calls — caching reduces this dramatically, but only if you know your prompt size. Second, context windows: GPT-4o caps at 128K, Claude at 200K, Gemini 2.5 Pro at 1M. Hit the limit and the request fails. Third, latency: bigger prompts mean slower time-to-first-token.
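The cost arithmetic above is straightforward to check. The $15-per-million input rate below is inferred from the figures quoted in the text ($0.15 for a 10K-token prompt); treat it as an assumption, since published prices change.

```python
# Input-side cost of a fixed system prompt, using the figures above.
PRICE_PER_M_INPUT = 15.00  # USD per 1M input tokens (assumed rate)
prompt_tokens = 10_000

cost_per_call = prompt_tokens / 1_000_000 * PRICE_PER_M_INPUT
print(f"${cost_per_call:.2f} per call")          # → $0.15 per call
print(f"${cost_per_call * 1000:.0f} per 1000")   # → $150 per 1000
```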

Output token assumptions

The slider lets you set output tokens as a fraction of input. The default 50% works for chat-style interactions where the model echoes context back. For summarization, drop it to 10-20%. For agents that emit large JSON blobs, push it past 100%. Output tokens cost 4-5x more than input on most models, so they dominate the bill in long-form generation.
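Combining the output ratio with separate input and output rates gives a total per-call cost. A minimal sketch, with illustrative per-million rates (the 5x input/output spread below is an assumption in line with the 4-5x range noted above):

```python
def estimate_cost(input_tokens: int, output_ratio: float,
                  in_price: float, out_price: float) -> float:
    """Total per-call cost in USD, given an output/input token ratio
    and per-million-token rates. Illustrative, not the tool's code."""
    output_tokens = input_tokens * output_ratio
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# 2000-token prompt, 50% output ratio, output billed at 5x input.
cost = estimate_cost(2000, 0.5, in_price=3.0, out_price=15.0)
print(f"${cost:.4f}")  # input $0.006 + output $0.015 → $0.0210
```

Note how output dominates even at a 50% ratio: half the tokens, five times the rate.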

For exact counts, use the official tokenizer: tiktoken for OpenAI (runs locally), the Anthropic SDK's count_tokens endpoint for Claude, or Google's Gemini SDK for Gemini. None of them bill anything for the count itself.

Related tools on hakaru: try the GPT API cost calculator, Claude API cost calculator, LLM cost comparison, character counter, and word counter for related text analysis.

Frequently Asked Questions

How accurate is this token counter?
It's an approximation using the rule that 1 token averages roughly 4 characters in English text. Real tokenizers (tiktoken for GPT, Anthropic's tokenizer for Claude) vary by model. Expect this estimate to be within 5-15% of actual counts. For exact billing, use the model's official tokenizer SDK.
Why do tokens matter for LLM cost?
API providers bill per million tokens, with separate rates for input (your prompt) and output (model response). A 4000-character prompt sent to Claude Opus 4.7 costs about $0.015 per request as input. The same prompt sent 1000 times runs $15 just on input.
Are tokens the same across GPT, Claude, and Gemini?
No. Each model uses a different tokenizer. Claude token counts typically run 10-15% higher than GPT's for the same text, while Gemini sits closer to GPT. Code, emojis, and non-English text tokenize differently across all three.
How do I reduce token usage?
Strip filler words ('please', 'really', 'just'), remove redundant context, use bullet points instead of prose, compress system prompts, and offload static context to prompt caching where supported. Try our prompt token optimizer for automated suggestions.
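The first trim above, dropping filler words, can be automated naively. A hedged sketch (the filler list is an illustrative assumption, and real optimizers are context-aware rather than doing a blind word filter):

```python
import re

FILLER = {"please", "really", "just", "very", "basically", "actually"}

def strip_filler(prompt: str) -> str:
    """Drop common filler words from a prompt. Naive word filter for
    illustration only; it ignores context and may alter meaning."""
    kept = [w for w in prompt.split()
            if re.sub(r"\W", "", w).lower() not in FILLER]
    return " ".join(kept)

print(strip_filler("Please just summarize this really long report."))
# → summarize this long report.
```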
Is my text sent to any server?
No. Token counting happens entirely in your browser. Nothing is uploaded, logged, or stored. Paste sensitive prompts freely.