Dev Tools

Context Window Calculator

Convert words, pages, characters, or minutes of audio into tokens and see how the result fits across major LLM context windows.

Quick Answer

GPT-4o and most OpenAI models cap at 128K tokens; Claude models cap at 200K. Gemini 2.5 leads with 1M tokens (~1,500 pages). Rule of thumb: 1 word ≈ 1.3 tokens; 1 page ≈ 650 tokens; 1 hour of speech ≈ 12K tokens.

Estimated tokens

1,300

Fit across context windows

GPT-4o: 1,300 / 128,000 (1%)
GPT-4o-mini: 1,300 / 128,000 (1%)
GPT-4-turbo: 1,300 / 128,000 (1%)
Claude Opus 4.7: 1,300 / 200,000 (1%)
Claude Sonnet 4.6: 1,300 / 200,000 (1%)
Claude Haiku 4.5: 1,300 / 200,000 (1%)
Gemini 2.5 Pro: 1,300 / 1,000,000 (0%)
Gemini 2.5 Flash: 1,300 / 1,000,000 (0%)
Mistral Large 2: 1,300 / 128,000 (1%)
Llama 3.3 70B: 1,300 / 128,000 (1%)

About This Tool

The Context Window Calculator converts words, pages, characters, or minutes of audio into estimated tokens, then shows how that count fits inside the context windows of every major LLM. It's built for the moment when a customer asks “can the AI read my whole 200-page contract?” and you need a fast yes/no with the right model.

What is a context window?

The context window is the maximum number of tokens a model can process in a single request. It's a hard ceiling: input plus output combined cannot exceed it. GPT-4o and most OpenAI models cap at 128K tokens. Claude Opus, Sonnet, and Haiku all have 200K windows. Gemini 2.5 Pro and Flash both ship with 1M tokens, roughly 1,500 pages of English text or about 85 hours of meeting transcript.
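
A minimal sketch of the fit check this calculator performs, with window sizes copied from the table above (model names are display labels only, not any vendor's API):

```python
# Window sizes copied from the fit table above.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Claude Opus 4.7": 200_000,
    "Gemini 2.5 Pro": 1_000_000,
}

def fit_report(estimated_tokens: int) -> None:
    """Show how an estimated token count fits each published window."""
    for model, window in CONTEXT_WINDOWS.items():
        pct = estimated_tokens / window * 100
        verdict = "fits" if estimated_tokens <= window else "exceeds window"
        print(f"{model}: {estimated_tokens:,} / {window:,} ({pct:.1f}%) {verdict}")

fit_report(1_300)  # the example count shown on this page
```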

Conversion ratios

For English prose: 1 word ≈ 1.3 tokens, 1 page (500 words) ≈ 650 tokens, 1 character ≈ 0.25 tokens, 1 minute of speech (150 wpm) ≈ 195 tokens. Code is denser — JSON with quoted keys can run 30% higher. Non-English text varies: Chinese characters often consume 1-2 tokens each, while Spanish and French sit close to English ratios.
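
The same ratios expressed as a small estimator. This is a sketch of the page's rules of thumb for English prose only; precise counts need a real tokenizer (see the FAQ below):

```python
# The page's English-prose rules of thumb as code; estimates only.
TOKENS_PER_WORD = 1.3   # 1 word ~ 1.3 tokens
WORDS_PER_PAGE = 500    # 1 page ~ 650 tokens
CHARS_PER_TOKEN = 4     # 1 character ~ 0.25 tokens
SPEECH_WPM = 150        # 1 minute of speech ~ 195 tokens

def estimate_tokens(words=0, pages=0, chars=0, minutes=0) -> int:
    total_words = words + pages * WORDS_PER_PAGE + minutes * SPEECH_WPM
    return round(total_words * TOKENS_PER_WORD + chars / CHARS_PER_TOKEN)

print(estimate_tokens(pages=200))  # ~130,000 tokens for a 200-page contract
```

At these ratios, the 200-page contract from earlier lands near 130K tokens: just over GPT-4o's 128K window but comfortably inside Claude's 200K.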

What counts toward the window?

Everything: system prompt, conversation history, retrieved documents, tool definitions, function call outputs, and the model's response. If you're building a chat product with a long system prompt and document retrieval, your effective window is much smaller than the published max. A 200K Claude window with a 5K system prompt and 50K retrieved context leaves 145K for conversation and response.
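
The budget math from that example, spelled out (all figures illustrative):

```python
# Illustrative effective-window budget from the paragraph above.
window         = 200_000  # published Claude window
system_prompt  = 5_000
retrieved_docs = 50_000

effective = window - system_prompt - retrieved_docs
print(f"Left for conversation and response: {effective:,} tokens")  # 145,000
```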

Long-context performance

Bigger windows don't always mean better answers. Most models show degraded recall and reasoning in the deep end of their windows, often past 100K-150K tokens; even Gemini 2.5 Pro starts dropping precision on needle-in-a-haystack tasks well before its 1M ceiling. RAG with focused retrieval often beats document stuffing once you cross 50K tokens.

Pair this with the token counter for raw text, the Gemini cost calculator for long-context pricing, and the RAG vs fine-tune calculator for architecture decisions. For everyday text analysis, see word counter and character counter.

Frequently Asked Questions

How many tokens is a typical book?
A 300-page novel runs roughly 100K-150K tokens. War and Peace clocks in around 750K. A short blog post is 1K-3K. A one-hour meeting transcript is about 8K-12K tokens. The 1M-token Gemini context window fits roughly 1500 pages.
What does the context window include?
Everything you send: system prompt, conversation history, retrieved documents, function definitions, tool outputs — plus the model's response. If you have a 128K window and your prompt is 120K tokens, you only have 8K left for the response. Cap max_tokens accordingly.
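
A minimal sketch of that budgeting, assuming the numbers above; the commented-out call shows where the cap would go with an OpenAI-style SDK (illustrative, not prescriptive):

```python
# Reserve response room before sending; numbers mirror the answer above.
WINDOW = 128_000
prompt_tokens = 120_000             # measure with a tokenizer beforehand
headroom = WINDOW - prompt_tokens   # 8,000 tokens left for the response

# With an OpenAI-style SDK, the cap goes on the request:
# response = client.chat.completions.create(
#     model="gpt-4o",
#     messages=messages,
#     max_tokens=headroom,
# )
```
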
Does Gemini's 1M context actually work?
Yes, but performance degrades on needle-in-a-haystack tasks past ~500K tokens. For document QA, 1M works well. For multi-hop reasoning over the full window, expect quality drops. RAG with smaller chunks often outperforms full-document stuffing.
How do I count tokens for a PDF or document?
Convert to plain text first, then estimate: roughly 1 token per 4 characters of English. PDFs often contain layout tokens (line breaks, headers) that inflate counts by 10-20%. Use the official tokenizer SDK for precise counts.
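
For example, with OpenAI's tiktoken package (o200k_base is the encoding GPT-4o uses; contract.txt is a hypothetical file of extracted text):

```python
# Exact count vs. the 4-characters-per-token heuristic.
import tiktoken  # assumes the tiktoken package is installed

text = open("contract.txt", encoding="utf-8").read()  # hypothetical extracted text
enc = tiktoken.get_encoding("o200k_base")  # the encoding GPT-4o uses

exact = len(enc.encode(text))
heuristic = len(text) // 4
print(f"exact: {exact:,}  heuristic: {heuristic:,}")
```
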
What happens if I exceed the context window?
The API returns an error and the request fails. Most SDKs surface a 'context_length_exceeded' error. Truncate the oldest messages, summarize prior turns, or move to a model with a larger window. Some frameworks (LangChain, LlamaIndex) handle this automatically.
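
A minimal truncation sketch, assuming chat messages as role/content dicts and a crude character-based count; a real implementation would use the model's own tokenizer:

```python
# Drop the oldest non-system turns until the estimate fits the budget.
def count_tokens(msg: dict) -> int:
    # Crude 4-chars-per-token estimate; swap in a real tokenizer in practice.
    return round(len(msg["content"]) / 4)

def truncate_history(messages: list[dict], budget: int) -> list[dict]:
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(map(count_tokens, system + rest)) > budget:
        rest.pop(0)  # oldest turn goes first
    return system + rest
```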