Why tokenizer choice matters for budgets
Different model families use different tokenizers (mostly BPE variants). GPT-4o, Claude, Gemini, and Llama all count tokens slightly differently for the same input, usually within ±15% but enough to matter at scale. For accurate budgeting, run your real prompts through the target model's tokenizer (tiktoken for OpenAI, Anthropic's token-counting endpoint, Gemini's count_tokens API) before extrapolating to monthly volume.
- 1 token ≈ 4 English characters, ≈ 0.75 words.
- 1,000 words ≈ 1,330 tokens. A typical book chapter ≈ 8–12k tokens.
- Whitespace, punctuation, and uppercase letters all count toward your token bill.
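The rules of thumb above can be turned into a quick estimator. This is a minimal sketch: the function names are hypothetical, and the 4 characters/token and 0.75 words/token ratios are the approximations quoted above, not the output of any real tokenizer.

```python
CHARS_PER_TOKEN = 4.0    # rule of thumb for English text
WORDS_PER_TOKEN = 0.75   # rule of thumb for English text


def tokens_from_chars(text: str) -> int:
    """Estimate tokens from character count (whitespace and punctuation included)."""
    return round(len(text) / CHARS_PER_TOKEN)


def tokens_from_words(word_count: int) -> int:
    """Estimate tokens from a word count."""
    return round(word_count / WORDS_PER_TOKEN)


sample = "Whitespace, punctuation, and uppercase letters all count."
print(tokens_from_chars(sample))   # rough estimate only
print(tokens_from_words(1_000))    # 1333
```

Treat both numbers as a first pass for English prose. Code, JSON, and non-Latin scripts need a different ratio or a real tokenizer count.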
Hidden token costs people forget
- System prompts: billed on every single request. A 4,000-token system prompt sent 100,000 times = 400M input tokens.
- Tool/function schemas: counted as input tokens.
- Retrieved context (RAG): tokens per retrieved chunk × top_k, added to every query.
- Reasoning models (o1/o3, Gemini Thinking, DeepSeek R1): emit thousands of hidden 'thinking' tokens billed as output.
Always log token counts per request type and segment your budget by use case.
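One way to keep these hidden costs visible is to model each request type as a sum of its parts. The sketch below is illustrative only: `RequestProfile`, its field names, and `monthly_tokens` are hypothetical, and real counts should come from the usage data your provider returns.

```python
from dataclasses import dataclass


@dataclass
class RequestProfile:
    """Per-request token components; every field name here is illustrative."""
    system_prompt: int = 0     # resent with every request
    tool_schemas: int = 0      # tool/function definitions, counted as input
    chunk_tokens: int = 0      # size of one retrieved chunk
    top_k: int = 0             # retrieved chunks per query
    user_message: int = 0
    visible_output: int = 0
    reasoning_tokens: int = 0  # hidden thinking tokens, billed as output

    def input_tokens(self) -> int:
        return (self.system_prompt + self.tool_schemas
                + self.chunk_tokens * self.top_k + self.user_message)

    def output_tokens(self) -> int:
        return self.visible_output + self.reasoning_tokens


def monthly_tokens(profile: RequestProfile, requests: int) -> tuple[int, int]:
    """Return (input, output) tokens for a month of identical requests."""
    return profile.input_tokens() * requests, profile.output_tokens() * requests


# The system-prompt example from the text: 4,000 tokens x 100,000 requests.
system_only = RequestProfile(system_prompt=4_000)
print(monthly_tokens(system_only, 100_000))  # (400000000, 0)
```

Keeping one profile per request type (chat, RAG query, agent step) makes it straightforward to segment the budget by use case.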

Vendors price LLMs per million tokens, but humans think in pages, words, and characters. That mental mismatch is why teams either over-estimate cost (and don't ship) or under-estimate (and get a surprise bill). This calculator converts words → tokens → dollars in both directions so you can sanity-check any LLM workload in seconds.
What each input means
Get these inputs right and the output is reliable. Get them wrong and the calculator just multiplies bad assumptions.
Word count
English words in your input or output. Use a real sample, not an idealized one.
Typical range: Email: 100–300. Blog post: 800–2,500. PDF page: 350–500. Book chapter: 3k–8k.
Tokens per word
Tokenizer efficiency. English averages ~1.3 tokens/word; code and non-Latin scripts are higher.
Typical range: English prose: 1.25–1.35. Code: 1.5–2.0. Chinese/Japanese: 1.8–3.0. JSON: 2.0–2.5.
Price per 1M input tokens
Vendor list price for the model you'll use.
Typical range: Mini-tier ~$0.15. Frontier ~$2.50–$3. Reasoning ~$10–$15.
Number of documents
How many docs you'll process at this size.
Typical range: 1 for one-off; 10k–1M for bulk processing pipelines.
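The four inputs combine into a single multiplication, and the same relation can be run backwards to see how much text a budget buys. The following is a sketch of that arithmetic, not the calculator's actual source; the function names are hypothetical.

```python
def input_cost(words: float, tokens_per_word: float,
               price_per_million: float, documents: int = 1) -> float:
    """Words -> tokens -> dollars for the input side of a job."""
    total_tokens = words * tokens_per_word * documents
    return total_tokens / 1_000_000 * price_per_million


def words_for_budget(budget: float, tokens_per_word: float,
                     price_per_million: float) -> float:
    """Dollars -> tokens -> words: how much text a budget buys."""
    tokens = budget / price_per_million * 1_000_000
    return tokens / tokens_per_word


# 20,000 PDF pages of 400 words at 1.25 tokens/word on a $0.15/1M model.
print(f"${input_cost(400, 1.25, 0.15, 20_000):.2f}")        # $1.50

# How many words of English prose does $100 buy at $2.50/1M input?
print(f"{words_for_budget(100, 1.25, 2.50):,.0f} words")    # 32,000,000 words
```

The example values are picked from the typical ranges listed above; substitute measurements from a real sample before quoting a number.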
Worked examples
Real scenarios with the math walked through line by line.
Summarize 1,000 blog posts with GPT-4o
Scenario: Average 1,800 words/post, 1.3 tokens/word, GPT-4o input at $2.50/1M.
Math: Tokens/doc = 1,800 × 1.3 = 2,340. Total input tokens = 2.34M. Input cost = $5.85. Add ~400 output tokens × 1,000 = 400k × $10/1M = $4. Total ≈ $9.85.
Outcome: Cheaper than people expect. Batch summarization quotes are often off by 10x because they are priced on output tokens rather than input, even though output is a small share of the token volume.
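The arithmetic in this example can be reproduced in a few lines, which also makes the split between input and output explicit. A minimal sketch, assuming the prices and token counts from the scenario; `job_cost` is a hypothetical helper.

```python
def job_cost(docs: int, input_tokens_per_doc: float, output_tokens_per_doc: float,
             input_price: float, output_price: float) -> tuple[float, float]:
    """Return (input_cost, output_cost) in dollars; prices are per 1M tokens."""
    input_cost = docs * input_tokens_per_doc / 1_000_000 * input_price
    output_cost = docs * output_tokens_per_doc / 1_000_000 * output_price
    return input_cost, output_cost


tokens_per_doc = 1_800 * 1.3  # 2,340 input tokens per post
in_cost, out_cost = job_cost(1_000, tokens_per_doc, 400, 2.50, 10.00)
total = in_cost + out_cost

print(f"input  ${in_cost:.2f}")    # input  $5.85
print(f"output ${out_cost:.2f}")   # output $4.00
print(f"total  ${total:.2f}")      # total  $9.85
print(f"output share of cost: {out_cost / total:.0%}")
```

Printing the output share next to the total shows how a small number of output tokens still takes a visible slice of the bill once the higher output price is applied.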
RAG ingestion for 50k PDF pages on Haiku
Scenario: 400 words/page × 1.3 tokens, 50,000 pages, Haiku input at $0.25/1M.
Math: Tokens/page = 520. Total = 26M tokens. Cost = 26 × $0.25 = $6.50 for the full corpus.
Outcome: Embedding + extraction at this volume is cheaper than the engineer-hour it takes to scope it. Always use mini-tier for high-volume batch text work.
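To see why the model tier matters at this volume, the same 26M-token corpus can be priced at several price points. This sketch reuses the Haiku price from the scenario and the approximate tier prices from the "Price per 1M input tokens" range above; the tier labels and `cost_by_tier` are illustrative, not a vendor price list.

```python
def cost_by_tier(total_tokens: int,
                 prices_per_million: dict[str, float]) -> dict[str, float]:
    """Input cost of the same corpus at each price point."""
    millions = total_tokens / 1_000_000
    return {tier: millions * price for tier, price in prices_per_million.items()}


corpus_tokens = 50_000 * 520  # pages x tokens/page = 26,000,000

prices = {
    "mini-tier": 0.15,
    "Haiku (as quoted)": 0.25,
    "frontier": 2.50,
    "reasoning": 10.00,
}

for tier, cost in cost_by_tier(corpus_tokens, prices).items():
    print(f"{tier:<18} ${cost:,.2f}")
```

Under these assumed prices the corpus costs $3.90, $6.50, $65.00, and $260.00 respectively, before any output tokens are counted.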
Common mistakes
Where this calculation usually goes wrong in the real world.
- Estimating from character count without accounting for tokenizer variance. Code can use around 50% more tokens per character than prose.
- Pricing summarization on output tokens alone. Output is usually only 5–15% of total tokens; even at 3–5x the per-token price, the input side still makes up most of the bill.
- Forgetting that retrieval adds tokens. RAG inflates input by 3–10x because you're sending retrieved chunks plus the user query.
- Using OpenAI's tokenizer math for Anthropic. Different models tokenize differently, and counts for the same text can differ by 15% or more.
- Treating Chinese, Japanese, or Korean text like English. Token counts are 2–3x higher per character.
When to use this calculator
- Scoping a one-shot document processing job before writing any code.
- Translating an enterprise customer's 'we have 200,000 contracts' into a real cost quote.
- Comparing tokenizer efficiency across models for cost-sensitive workloads.
- Validating a vendor invoice when your usage doesn't match expectations.
- Budgeting a knowledge-base ingestion or eval run that involves millions of documents.
Glossary
Tokenizer
Algorithm that splits text into tokens. OpenAI models use BPE encodings available through the tiktoken library; Anthropic uses a similar BPE variant; each gives slightly different counts.
BPE (Byte Pair Encoding)
The dominant tokenization approach. Frequent letter sequences become single tokens, so common English words = 1 token, rare words = 2–4 tokens.
Token budget
Maximum tokens (input + output) allowed per request by a model's context window. Pricing is per token used, not per budget.
Embedding tokens
Tokens sent to an embedding model (separate from chat). Priced per 1M, usually $0.02–$0.13 — much cheaper than chat completion.
More questions answered
Why do my actual token counts disagree with this estimator?
Three reasons: tokenizer variance across providers (±15%), formatting overhead (markdown tables, code blocks add tokens), and hidden system prompts. For exact counts, use the provider's tokenizer SDK (tiktoken for OpenAI, Anthropic's count_tokens endpoint) on real samples.
Do tokens cost the same for embeddings as for chat?
No. Embeddings are dramatically cheaper. OpenAI text-embedding-3-small is $0.02/1M tokens vs. $2.50/1M for GPT-4o input — about 125x cheaper. Use embeddings aggressively for retrieval and dedup; only spend chat tokens when you need generation.
Should I worry about token costs for one-off scripts?
Below ~10k requests/month for most workloads, no. The decision threshold is whether the task is recurring, customer-facing, or in a billing-sensitive product. Internal one-off scripts almost always cost far less in tokens than the engineer time spent writing them.
Methodology last reviewed: 2026-05 by the RevenueLab editorial team.
FAQ
How many tokens are in 1,000 words?
Roughly 1,300–1,400 tokens for English prose. Technical writing and code average 1,500–2,000 tokens per 1,000 words. Non-Latin scripts (CJK, Arabic, Cyrillic) can use 2–4× more tokens per equivalent word.
How many characters is one token?
About 4 characters of English text, though it varies — common words like 'the' and 'and' are a single token, while uncommon words split into multiple sub-word pieces.
Are input and output tokens priced the same?
No. Output tokens cost 3–5× more than input on most frontier models. GPT-4o: $2.50 in / $10 out per million. Claude Sonnet: $3 in / $15 out. Plan budgets and optimize prompts assuming output is the expensive side.
Does whitespace count as tokens?
Yes. Spaces, newlines, and tabs all consume tokens. Aggressive whitespace stripping or compact JSON can cut 5–15% off input tokens on heavily-formatted prompts.
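Both techniques can be tried with the standard library. The sketch below is an illustration under assumptions: `compact_json` and `collapse_whitespace` are hypothetical helpers, the sample record is made up, and character counts are only a proxy for tokens, so confirm any saving with the target model's tokenizer.

```python
import json
import re


def compact_json(obj) -> str:
    """Serialize without indentation or spaces after separators."""
    return json.dumps(obj, separators=(",", ":"))


def collapse_whitespace(text: str) -> str:
    """Shrink whitespace in prose; not safe where layout carries meaning (e.g. code)."""
    text = re.sub(r"[ \t]+", " ", text)              # runs of spaces/tabs -> one space
    text = re.sub(r" ?\n ?", "\n", text)             # trim spaces around newlines
    return re.sub(r"\n{3,}", "\n\n", text).strip()   # keep at most one blank line


record = {"id": 42, "title": "Quarterly report", "tags": ["finance", "q3"]}
pretty = json.dumps(record, indent=2)
compact = compact_json(record)

print(len(pretty), len(compact))                   # characters before and after
print(json.loads(compact) == json.loads(pretty))   # True: same data either way
```

Apply `collapse_whitespace` only to prose. Indentation in code or YAML is meaningful and should be left alone.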