OpenAI API Pricing Explained (2026): Cost per Token & Real Examples

Executive Summary
OpenAI's ecosystem has two distinct billing worlds: ChatGPT (subscriptions) and the OpenAI API (usage-based). ChatGPT plans are priced per user/month (Plus, Pro, Business), while the API is metered primarily by tokens (input, cached input, output) and by non-token units when you use tools (web search, file search), storage (vector stores), or sandboxed execution (containers / code interpreter).
For typical production workloads, the largest cost levers are:
- Output tokens (including reasoning tokens for applicable models)
- Model choice and processing tier (Standard vs Batch vs Flex; Batch is 50% cheaper with a 24h turnaround)
- Tool calls and storage (web search, file search, containers)
- Very long context (pricing multipliers for 1M-context models)
For the sample chatbot workload (100k messages/month, 1k tokens/message, 75% input / 25% output), estimated monthly spend is ~$13.75 (GPT-5 Nano), ~$68.75 (GPT-5 Mini), ~$343.75 (GPT-5), or ~$562.50 (GPT-5.4) at Standard pricing.
How OpenAI API Pricing Works
OpenAI bills for input and output tokens. Cost modifiers include:
- Prompt caching: Cached input is ~10x cheaper than fresh input for GPT-5-family models (the discount is smaller for GPT-4.1 and GPT-4o; see the pricing table)
- Batch API: 50% discount on input and output; 24h completion window
- Flex processing: Batch rates plus prompt caching discounts; slower, may return "resource unavailable"
- Priority processing: Paid tier for lower, more consistent latency
API accounts use prepaid billing: minimum $5 purchase, credits expire after 1 year (non-refundable).
Token Pricing Table (Standard)
| Model | Input (USD/1M) | Cached Input (USD/1M) | Output (USD/1M) |
|---|---|---|---|
| GPT-5.4 | $2.50 | $0.25 | $15.00 |
| GPT-5 | $1.25 | $0.125 | $10.00 |
| GPT-5 Mini | $0.25 | $0.025 | $2.00 |
| GPT-5 Nano | $0.05 | $0.005 | $0.40 |
| GPT-4.1 | $2.00 | $0.50 | $8.00 |
| GPT-4o | $2.50 | $1.25 | $10.00 |
| GPT-4o Mini | $0.15 | $0.075 | $0.60 |
Output pricing explicitly includes reasoning tokens where applicable — "thinking longer" can materially increase spend even when visible text output is short.
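Putting the table into numbers: a minimal per-request cost sketch, using the Standard rates above. The cached_fraction split is a hypothetical input you would measure from your own traffic, not something the API reports per se:

```python
# Per-request cost (USD) from the Standard per-1M-token rates in the table.
PRICES = {  # model: (input, cached_input, output) USD per 1M tokens
    "gpt-5": (1.25, 0.125, 10.00),
    "gpt-5-mini": (0.25, 0.025, 2.00),
    "gpt-5-nano": (0.05, 0.005, 0.40),
}

def request_cost(model, input_tokens, output_tokens, cached_fraction=0.0):
    """cached_fraction: share of input tokens served from the prompt cache."""
    inp, cached, out = PRICES[model]
    return (
        input_tokens * (1 - cached_fraction) * inp
        + input_tokens * cached_fraction * cached
        + output_tokens * out
    ) / 1e6

# 2,000-token prompt, 500-token reply, 75% of the prompt cache-hit:
cost = request_cost("gpt-5", 2_000, 500, cached_fraction=0.75)  # ~$0.0058
```

Note that output tokens dominate here ($0.005 of the ~$0.0058): caching helps, but capping output length helps more.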
Batch API: 50% Cost Reduction
Batch API is designed for non-real-time jobs. You save 50% on both input and output tokens compared to real-time API calls. Ideal for summarisation, classification, extraction, and embeddings. Jobs complete within 24 hours.
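A minimal sketch of preparing a Batch input file; the JSONL request shape follows the Batch API reference, while the model name and prompts are placeholder assumptions:

```python
import json

# Each line of a Batch input file is one request with a unique custom_id.
def batch_line(custom_id: str, model: str, prompt: str) -> str:
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

docs = ["First document text.", "Second document text."]
jsonl = "\n".join(
    batch_line(f"doc-{i}", "gpt-5-mini", f"Summarise: {d}")
    for i, d in enumerate(docs)
)
# Submit via the SDK: upload with files.create(purpose="batch"), then
# batches.create(input_file_id=..., endpoint="/v1/chat/completions",
#                completion_window="24h"); results arrive within 24 hours.
```

Because each request carries its own custom_id, results can be matched back to source documents regardless of completion order.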
Embeddings
Embeddings are priced per input token (no output).
| Model | Standard (USD/1M) | Batch (USD/1M) |
|---|---|---|
| text-embedding-3-small | $0.02 | $0.01 |
| text-embedding-3-large | $0.13 | $0.065 |
| text-embedding-ada-002 | $0.10 | $0.05 |
Cost per 1,000 vectors (512 tokens/vector): text-embedding-3-small $0.01024 (Standard) / $0.00512 (Batch); text-embedding-3-large $0.06656 / $0.03328.
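The per-1,000-vector figures above fall out of a one-line calculation; a minimal sketch with rates from the embeddings table (512 tokens/vector is the worked assumption):

```python
# Embeddings are billed on input tokens only; Batch halves the rate.
EMBED_PRICES = {  # model: USD per 1M input tokens, Standard tier
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def embedding_cost(model, n_vectors, tokens_per_vector=512, batch=False):
    usd = n_vectors * tokens_per_vector * EMBED_PRICES[model] / 1e6
    return usd / 2 if batch else usd

small = embedding_cost("text-embedding-3-small", 1_000)                     # ≈ $0.01024
large_batch = embedding_cost("text-embedding-3-large", 1_000, batch=True)   # ≈ $0.03328
```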
Tools, Storage, and Containers
Hidden cost centers:
- Web search: $10/1K calls + search content tokens at model rates
- File search: Vector storage $0.10/GB/day; tool calls $2.50/1K
- Containers: From Mar 31, 2026: $0.03/container and $0.03/session (20-min session units)
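These per-unit fees compound quietly at scale. A rough monthly sketch from the rates above (container pricing depends on session length, so it is omitted; the call volumes are illustrative assumptions):

```python
# Monthly tool/storage cost from per-unit rates (token costs excluded).
def tool_costs(web_search_calls, file_search_calls, vector_store_gb, days=30):
    web = web_search_calls * 10.00 / 1_000   # $10 per 1K web search calls
    fs = file_search_calls * 2.50 / 1_000    # $2.50 per 1K file search calls
    storage = vector_store_gb * 0.10 * days  # $0.10 per GB per day
    return web + fs + storage

# 5k web searches + 20k file searches + 2 GB stored for a month:
estimate = tool_costs(5_000, 20_000, 2)  # ≈ $106, before search-content tokens
```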
Real Cost Examples
Chatbot: 100k messages/month, 1k tokens/msg, 75% input / 25% output
75M input + 25M output tokens monthly.
| Model | Monthly Cost |
|---|---|
| GPT-5 Nano | $13.75 |
| GPT-4o Mini | $26.25 |
| GPT-5 Mini | $68.75 |
| GPT-5 | $343.75 |
| GPT-5.4 | $562.50 |
Embedding pipeline: 1M documents × 512 tokens
512M tokens. text-embedding-3-small: $10.24 (Standard) / $5.12 (Batch). text-embedding-3-large: $66.56 / $33.28.
Multimodal: 10k images/month (1024×1024, detail high)
Assumes ~50 text input tokens and ~200 output tokens per image. Image tokenization varies by model (tile-based vs patch-based). GPT-5: ~$28.50/month. GPT-4o Mini can be expensive due to its high image token footprint; use detail: "low" when possible. Use POST /v1/responses/input_tokens to get exact counts.
Sensitivity: Token Volume and Output Share
At 100k messages/month, cost varies with tokens per message and output share:
| Model | 500 tok/msg | 1000 tok/msg | 2000 tok/msg |
|---|---|---|---|
| GPT-5 Nano (25% out) | $6.88 | $13.75 | $27.50 |
| GPT-5 Mini (25% out) | $34.38 | $68.75 | $137.50 |
| GPT-5 (25% out) | $171.88 | $343.75 | $687.50 |
Shifting output share matters as much as volume: for GPT-5 at 100M tokens/month, moving from 15% output to 40% output raises cost from $256.25 to $475.00 (~1.85x), because output tokens cost 8x input tokens.
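The sensitivity rows above come straight from a workload-level cost function; a minimal sketch using the Standard rates from the pricing table:

```python
# Monthly cost as a function of message volume, length, and output share.
RATES = {  # model: (input, output) USD per 1M tokens, Standard tier
    "gpt-5-nano": (0.05, 0.40),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5": (1.25, 10.00),
}

def monthly_cost(model, messages, tokens_per_msg, output_share):
    inp, out = RATES[model]
    total = messages * tokens_per_msg
    return (total * (1 - output_share) * inp + total * output_share * out) / 1e6

# GPT-5 at 100k messages x 1k tokens: output share 15% vs 40%
lean = monthly_cost("gpt-5", 100_000, 1_000, 0.15)     # ≈ $256.25
verbose = monthly_cost("gpt-5", 100_000, 1_000, 0.40)  # ≈ $475.00
```

With output_share=0.25 the same function reproduces the chatbot table above ($13.75 Nano, $68.75 Mini, $343.75 GPT-5).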
Hidden Costs and Constraints
Data residency: 10% uplift for gpt-5.4 and gpt-5.4-pro when using data residency endpoints (e.g. eu.api.openai.com).
Long-context multipliers: For 1M-context models, prompts above published token thresholds can incur higher input/output multipliers.
ChatGPT subscriptions (separate from API): Plus $20/month; Pro $200/month; Business $25–30/seat/month.
How to Reduce OpenAI API Costs
- Model routing: Default to GPT-5 Nano or GPT-5 Mini; escalate to GPT-5/GPT-5.4 only for complex reasoning or high-value requests.
- Constrain output: Cap max_tokens; control reasoning effort where available (reasoning tokens billed as output).
- Prompt caching: Cache system prompts, RAG chunks, repeated context. Cached input is ~10x cheaper.
- Batch or Flex for offline work: Batch gives 50% discount; Flex uses Batch rates plus caching.
- Optimise vision: Use detail: "low" when fine-grained vision isn't needed; resize images; use /v1/responses/input_tokens to validate.
- Minimise tool calls: Web search and file search add per-call fees. Gate tools behind intent classification.
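The routing and tool-gating advice can be as simple as a few heuristics. An illustrative sketch; the signals, thresholds, and intent labels here are assumptions, not an OpenAI feature:

```python
# Naive cost-aware router: default cheap, escalate only on clear signals.
def pick_model(prompt: str, needs_reasoning: bool = False,
               high_value: bool = False) -> str:
    if needs_reasoning or high_value:
        return "gpt-5"               # reserve the expensive tier
    if len(prompt) > 4_000:
        return "gpt-5-mini"          # longer context: one tier up
    return "gpt-5-nano"              # cheap default

def allow_web_search(intent: str) -> bool:
    # Gate per-call-priced tools ($10/1K web search calls) behind intent.
    return intent in {"news", "current_events", "fact_lookup"}
```

In practice the intent classification itself can run on the cheapest model, so the gate costs far less than the tool calls it prevents.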
Related
For a side-by-side comparison with Claude and Gemini, see AI API Pricing Comparison (2026).
Planning Your LLM API Budget?
Choosing the right model and optimising token usage can significantly reduce costs. If you need help sizing your API spend or designing cost-efficient AI workflows, let's discuss your requirements.
Book a Free Strategy Call
Cite this article
- Title: OpenAI API Pricing Explained (2026): Cost per Token & Real Examples
- Author: Nicola Lazzari
- Published: March 1, 2026
- Updated: March 2026
- URL: https://nicolalazzari.ai/articles/openai-api-pricing-explained-2026
- Website: nicolalazzari.ai
- Suggested citation: Nicola Lazzari. OpenAI API Pricing Explained (2026): Cost per Token & Real Examples. nicolalazzari.ai, updated March 2026.
Sources used
Primary sources
- OpenAI API pricing: https://platform.openai.com/docs/pricing
- OpenAI API reference: https://platform.openai.com/docs/api-reference
AI-Readable Summary
- OpenAI API pricing 2026: GPT-5.4, GPT-5, Mini, Nano token costs. Batch API 50% off, embeddings, tools, multimodal, and cost optimisation for developers.
- Key implementation notes, trade-offs, and optimization opportunities.
- Includes context and updates relevant to March 2026.
Audience
Developers, founders, product teams
Updated for March 2026 pricing and implementation context.
This article may be referenced in research, documentation, or AI datasets. Please cite the original source when possible.
Related reading
- Claude API Pricing Breakdown (2026): Tokens, Limits & Costs: Opus, Sonnet, Haiku token costs, prompt caching, Batch API 50% off, tooling, multimodal, no embeddings, and cost optimisation for developers.
- AI API Pricing Comparison (2026): compare OpenAI, Claude, and Gemini token costs and real usage examples.
- How to identify automation opportunities, choose the right tools, and measure ROI.
- How AI automation can resolve 70% of customer support tickets automatically.
- A real-world example of AI automation delivering measurable results.