AI API Pricing Calculator (2026): Compare Model Costs by Workload

Published: Mar 27, 2026
Updated: Jun 22, 2026
Author: Nicola Lazzari


Monthly Cost Calculator

Estimate spend across major AI model APIs from one workload baseline. Prices were last synced and validated on Jun 22, 2026. Verify against provider pricing pages before shipping billing logic.


Inputs: Model, Messages per month, Avg tokens per message, Input ratio (75%), Cached input ratio (45%)

How to use these sliders

Input ratio controls how many tokens are prompt/context tokens versus output tokens. Cached input ratio controls how many of those input tokens can use the lower cached-input price.

Example: with 1,000 tokens per message, 75% input means 750 input tokens and 250 output tokens. If cached input is 45%, then about 338 input tokens use cached pricing.
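The split in the example above can be sketched in a few lines of Python. This is an illustrative sketch, not the calculator's actual source; the function name `split_tokens` and its parameter names are assumptions made for the example.

```python
def split_tokens(tokens_per_message: float, input_ratio: float, cached_ratio: float) -> dict:
    """Split one message's tokens into input, output, and cached-input counts.

    Ratios are fractions between 0 and 1 (0.75 means 75%).
    """
    input_tokens = tokens_per_message * input_ratio
    output_tokens = tokens_per_message - input_tokens
    # Cached tokens are a share of the input tokens, not of the whole message.
    cached_input_tokens = input_tokens * cached_ratio
    return {
        "input": input_tokens,
        "output": output_tokens,
        "cached_input": cached_input_tokens,
    }


# Slider values from the example: 1,000 tokens, 75% input, 45% cached.
split = split_tokens(1000, 0.75, 0.45)
print(split["input"], split["output"], split["cached_input"])
```

With these values the cached share works out to 337.5 tokens, which the page rounds to about 338.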

Estimated monthly spend

$67.50

Based on GPT-5.5 Pro indicative token rates.

Best for: High-quality reasoning agents and premium customer support flows.

Input cost: $22.50 (0.75M input tokens)

Output cost: $45.00 (0.25M output tokens)

Cached input: 0.34M tokens (applied at cached token pricing when available)
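As a sanity check, the figures in the result panel can be turned back into per-million-token rates. The sketch below uses only the numbers displayed above; the variable names are made up for illustration. The resulting rates are effective rates for this example, since the panel does not show whether a cached-input discount is already reflected in the input cost.

```python
# Figures as displayed in the result panel for GPT-5.5 Pro.
input_cost_usd = 22.50
input_tokens_millions = 0.75
output_cost_usd = 45.00
output_tokens_millions = 0.25

# Effective USD price per 1M tokens implied by each line of the panel.
implied_input_rate = input_cost_usd / input_tokens_millions
implied_output_rate = output_cost_usd / output_tokens_millions

# The two cost lines should add up to the headline estimate.
total_usd = input_cost_usd + output_cost_usd

print(implied_input_rate, implied_output_rate, total_usd)
```

The two cost lines sum to the $67.50 headline, and the token counts (0.75M + 0.25M) imply a workload of 1M tokens per month at the 75% input ratio.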

Model cost ranking

Provider	Model	Best for	Monthly cost	Confidence
AWS Bedrock	Amazon Nova Micro	Very low-cost AWS-native high-volume assistant traffic.	$0.06	Verified
Google Gemini	Gemini 2.0 Flash-Lite (batch)	Large offline workloads where throughput cost matters most.	$0.07	Estimated
Cohere	Command R	Lightweight enterprise assistants with strong retrieval grounding.	$0.07	Verified
Google Gemini	Gemini 2.5 Flash-Lite (batch)	Large offline workloads where throughput cost matters most.	$0.07	Verified
Google Gemini	Gemini 2.0 Flash (batch)	Large offline workloads where throughput cost matters most.	$0.08	Estimated
Mistral	Ministral 3 3B	Low-cost production chat and extraction at scale.	$0.10	Verified
AWS Bedrock	Amazon Nova Lite	AWS-native stacks with centralized Bedrock governance.	$0.11	Verified
Mistral	Mistral Small 3.2	Budget-first automation pipelines and compact production assistants.	$0.11	Verified
Google Gemini	Gemini 2.0 Flash (standard)	Fast multimodal app features and responsive production assistants.	$0.11	Estimated
DeepSeek	DeepSeek V4 Flash	Budget-first automation pipelines and coding helpers.	$0.11	Verified
Google Gemini	Gemini 2.0 Flash-Lite (standard)	Very low-cost, high-throughput chat and extraction pipelines.	$0.13	Verified
Google Gemini	Gemini 2.5 Flash-Lite (standard)	Very low-cost, high-throughput chat and extraction pipelines.	$0.14	Verified
Vertex AI	Gemini 2.5 Flash Lite (standard)	Very low-cost, high-throughput chat and extraction pipelines.	$0.14	Verified
Mistral	Mistral Small Creative	Budget-first automation pipelines and compact production assistants.	$0.15	Estimated
Mistral	Ministral 3 14B	Low-cost production chat and extraction at scale.	$0.20	Verified
Vertex AI	Gemini 2.5 Flash Lite (priority)	Latency-critical premium Google Cloud deployments.	$0.26	Verified
xAI	Grok 4.1 Fast Reasoning	Ultra-low-cost, high-throughput conversational endpoints.	$0.28	Verified
xAI	Grok 4.1 Fast Non-Reasoning	Ultra-low-cost, high-throughput conversational endpoints.	$0.28	Verified
DeepSeek	deepseek-chat	Budget-first automation pipelines and coding helpers.	$0.30	Verified
OpenAI	GPT-5.4 nano	Ultra-low-cost classification, tagging, and lightweight automation.	$0.40	Verified
Vertex AI	Gemini 2.5 Flash (flex batch)	Large offline workloads where throughput cost matters most.	$0.43	Verified
DeepSeek	DeepSeek V4 Pro	Budget-first automation pipelines and coding helpers.	$0.54	Verified
Azure OpenAI	GPT-5-mini Global	Balanced assistants and production chat at moderate cost.	$0.61	Needs review
Perplexity	Sonar	Search-augmented assistants and answer engines.	$0.68	Verified
Vertex AI	Gemini 2.5 Flash (standard)	Fast multimodal app features and responsive production assistants.	$0.76	Verified
Google Gemini	Gemini 2.5 Flash (standard)	Fast multimodal app features and responsive production assistants.	$0.77	Verified
Mistral	Mistral Medium 3.1	Cost-efficient extraction and multilingual production assistants.	$0.80	Verified
DeepSeek	deepseek-reasoner	Budget reasoning workflows and cost-aware analysis agents.	$0.82	Verified
Anthropic	Claude Haiku 3.5	Fast, efficient support bots and high-volume text transformations.	$1.36	Verified
Vertex AI	Gemini 2.5 Flash (priority)	Latency-critical premium Google Cloud deployments.	$1.36	Verified
AWS Bedrock	Amazon Nova Pro	Higher-quality AWS-native copilots with managed platform controls.	$1.40	Verified
OpenAI	GPT-5.4 mini	Balanced assistants and production chat at moderate cost.	$1.46	Verified
Google Gemini	Gemini 2.5 Pro (batch)	Large offline workloads where throughput cost matters most.	$1.55	Verified
xAI	Grok 4.3	Reasoning-heavy assistants and premium chat experiences.	$1.56	Verified
xAI	Grok 4.20 Reasoning	Reasoning-heavy assistants and premium chat experiences.	$1.56	Verified
xAI	Grok 4.20 Non-Reasoning	Reasoning-heavy assistants and premium chat experiences.	$1.56	Verified
Anthropic	Claude Haiku 4.5	Fast, efficient support bots and high-volume text transformations.	$1.70	Verified
Vertex AI	Gemini 2.5 Pro (flex batch)	Large offline workloads where throughput cost matters most.	$1.72	Verified
AWS Bedrock	Amazon Nova Pro (Latency Optimized)	Higher-quality AWS-native copilots with managed platform controls.	$1.75	Verified
Mistral	Magistral Medium 1.2	Cost-efficient extraction and multilingual production assistants.	$2.75	Verified
Mistral	Mistral Large 3	Low-cost production chat and extraction at scale.	$3.00	Verified
Mistral	Mistral Large 2.1	Low-cost production chat and extraction at scale.	$3.00	Verified
Google Gemini	Gemini 2.5 Pro (standard)	Balanced quality/price workloads and multimodal app features.	$3.06	Verified
Azure OpenAI	GPT-5 Codex Global	Coding copilots, refactors, and engineering automation in Azure stacks.	$3.06	Verified
Vertex AI	Gemini 2.5 Pro (standard)	Balanced quality/price workloads and multimodal app features.	$3.06	Verified
Perplexity	Sonar Reasoning Pro	Search-augmented assistants for high-quality answer generation.	$3.50	Verified
Perplexity	Sonar Deep Research	Research workflows that require citations and deeper answer planning.	$3.50	Verified
Cohere	Command A	Enterprise RAG and command-style business copilots.	$4.38	Verified
Cohere	Command R+	High-recall enterprise retrieval and grounded QA workloads.	$4.38	Verified
OpenAI	GPT-5.4	High-quality reasoning agents and premium customer support flows.	$4.87	Verified
Anthropic	Claude Sonnet 4.6	Long-context assistants with strong instruction following.	$5.09	Verified
Anthropic	Claude Sonnet 4.5	Long-context assistants with strong instruction following.	$5.09	Verified
Anthropic	Claude Sonnet 4	Long-context assistants with strong instruction following.	$5.09	Verified
Anthropic	Claude Sonnet 3.7	Long-context assistants with strong instruction following.	$5.09	Needs review
Vertex AI	Gemini 2.5 Pro (priority)	Latency-critical premium Google Cloud deployments.	$5.51	Verified
Perplexity	Sonar Pro	Search-augmented assistants for high-quality answer generation.	$6.00	Verified
Anthropic	Claude Opus 4.6	Complex reasoning, long-form analysis, and high-stakes outputs.	$8.48	Verified
Anthropic	Claude Opus 4.5	Complex reasoning, long-form analysis, and high-stakes outputs.	$8.48	Verified
Anthropic	Claude Opus 4.7	Complex reasoning, long-form analysis, and high-stakes outputs.	$10.00	Verified
OpenAI	GPT-5.5	High-quality reasoning agents and premium customer support flows.	$11.25	Verified
Anthropic	Claude Opus 4.1	Complex reasoning, long-form analysis, and high-stakes outputs.	$25.44	Verified
Anthropic	Claude Opus 4	Complex reasoning, long-form analysis, and high-stakes outputs.	$25.44	Verified
OpenAI	GPT-5.5 Pro	High-quality reasoning agents and premium customer support flows.	$67.50	Verified

How these costs are calculated

Token prices come from our official baseline dataset and are validated against external sources when available. The calculator then applies your workload inputs (messages, tokens/message, input/output split, and cached-input ratio) to estimate monthly spend.
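One plausible way to combine those workload inputs is sketched below. It is an assumption about the arithmetic, not the calculator's published code: the function name, parameter names, and the example rates are all hypothetical, and rates are taken to be USD per 1M tokens.

```python
def monthly_cost(
    messages: int,
    tokens_per_message: float,
    input_ratio: float,
    cached_ratio: float,
    input_rate: float,
    cached_input_rate: float,
    output_rate: float,
) -> float:
    """Estimate monthly spend in USD. All rates are USD per 1M tokens."""
    total_tokens = messages * tokens_per_message
    input_tokens = total_tokens * input_ratio
    output_tokens = total_tokens - input_tokens
    cached_tokens = input_tokens * cached_ratio
    uncached_tokens = input_tokens - cached_tokens

    cost = (
        uncached_tokens * input_rate
        + cached_tokens * cached_input_rate
        + output_tokens * output_rate
    )
    return cost / 1_000_000


# Hypothetical rates, NOT taken from the pricing dataset.
estimate = monthly_cost(
    messages=1_000,
    tokens_per_message=1_000,
    input_ratio=0.75,
    cached_ratio=0.45,
    input_rate=2.00,
    cached_input_rate=0.50,
    output_rate=8.00,
)
print(round(estimate, 2))
```

Keeping the three rates as separate parameters makes it easy to swap in figures from any provider's pricing page without touching the workload inputs.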

Source of truth: ai-provider-pricing-normalized-all-models.json → validated output: ai-provider-pricing-validated.json.

Disclaimer: pricing can change at any time across providers, tiers, and regions. We update this dataset regularly, but you should always confirm final numbers on official provider pricing pages before billing decisions.

External validation sources: OpenRouter API, PricePerToken, and Helicone LLM Cost. Availability and formats may vary over time.

Supported AI providers

OpenAI

Anthropic

Google Gemini

Mistral

Cohere

xAI

Perplexity

DeepSeek

AWS Bedrock

Azure OpenAI

Vertex AI

Meta / Llama API (roadmap)

Embed

Embed this calculator

Use this snippet on custom sites and WordPress (Custom HTML block). You are free to embed it; please keep attribution.

<iframe src="https://nicolalazzari.ai/embed/ai-api-pricing-calculator" width="100%" height="980" style="border:0;border-radius:12px;overflow:hidden" loading="lazy" referrerpolicy="strict-origin-when-cross-origin" title="AI API Pricing Calculator by Nicola Lazzari"></iframe>

Tip: append ?ref=your-domain.com in the iframe URL to track referrals.
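If you generate the embed markup programmatically, the referral parameter can be appended with Python's standard `urllib.parse` module. This is a minimal sketch; the helper name `with_ref` is hypothetical, and only the `ref` parameter mentioned in the tip is assumed to be supported.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

EMBED_URL = "https://nicolalazzari.ai/embed/ai-api-pricing-calculator"


def with_ref(url: str, ref_domain: str) -> str:
    """Return the URL with a ref=<domain> query parameter set."""
    parts = urlsplit(url)
    # Preserve any query parameters already on the URL.
    query = dict(parse_qsl(parts.query))
    query["ref"] = ref_domain
    return urlunsplit(parts._replace(query=urlencode(query)))


iframe_src = with_ref(EMBED_URL, "your-domain.com")
print(iframe_src)
```

Parsing and rebuilding the URL, rather than concatenating strings, avoids a doubled `?` if the URL ever carries other parameters.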

What this calculator does

This calculator helps you estimate monthly AI API spend from a single workload baseline: messages per month, average tokens per message, input/output split, and cache ratio.

It is designed to compare providers quickly before you lock in model routing logic.

Data source and pricing model

The calculator reads its live model list from ai-provider-pricing-validated.json, a dataset cross-checked against OpenRouter and third-party aggregators to catch pricing discrepancies.

Each model also includes a short Best for note to speed up model selection beyond pure token price.

How to use it

1. Pick a model from the selector.
2. Set your monthly messages and average tokens per message.
3. Adjust input ratio and cached input ratio.
4. Review estimated monthly cost and compare all models in the table.

The result is directional: validate it against each provider's pricing page before production rollout.
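The comparison step can also be reproduced offline once you have per-model estimates. The sketch below ranks a handful of monthly costs copied from the ranking table on this page and filters them by a budget; the function names and the $5.00 budget are assumptions for illustration.

```python
def rank_by_cost(monthly_costs: dict[str, float]) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    return sorted(monthly_costs.items(), key=lambda pair: pair[1])


def within_budget(monthly_costs: dict[str, float], budget_usd: float) -> list[str]:
    """Return model names whose estimated monthly cost fits the budget."""
    return [model for model, cost in rank_by_cost(monthly_costs) if cost <= budget_usd]


# Monthly estimates (USD) as listed in the ranking table above.
costs = {
    "GPT-5.5 Pro": 67.50,
    "Claude Haiku 4.5": 1.70,
    "Amazon Nova Micro": 0.06,
    "Mistral Large 3": 3.00,
}

ranked = rank_by_cost(costs)
affordable = within_budget(costs, budget_usd=5.00)  # hypothetical budget
print(ranked[0], affordable)
```

Price alone rarely settles the choice, so treat the ranked list as a shortlist to weigh against each model's Best for note.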

Need help choosing the right AI model mix?

I can help you design model routing, caching strategy, and cost controls so your AI product scales without billing surprises.


Cite this article

Title: AI API Pricing Calculator (2026): Compare Model Costs by Workload
Author: Nicola Lazzari
Published: March 27, 2026
Updated: June 2026
URL: https://nicolalazzari.ai/articles/ai-api-pricing-calculator-2026
Website: nicolalazzari.ai
Suggested citation: Nicola Lazzari. AI API Pricing Calculator (2026): Compare Model Costs by Workload. nicolalazzari.ai, updated June 2026.

Sources used

Primary sources

Validated pricing dataset (JSON): https://nicolalazzari.ai/ai-provider-pricing-validated.json

AI-Readable Summary

Interactive calculator for monthly AI API spend using one workload baseline.
Compares multiple providers and models, including cache-aware pricing behavior.
Model entries include a short best-fit explanation to support routing decisions.


Updated: June 2026
Topic: AI API Pricing Calculator (2026): Compare Model Costs by Workload
Audience: Developers, founders, product teams

This article may be referenced in research, documentation, or AI datasets. Please cite the original source when possible.

Frequently Asked Questions

The live model list is regenerated from the validated JSON pipeline on a weekly schedule and can be updated manually; the article shows the last sync date next to the calculator. Always confirm against each vendor’s pricing page before contractual commitments.

The calculator models only token and cached-input pricing for each listed model. Separate per-call tool fees, search grounding, vector storage, and container runtime are not modeled; add those from the provider console if your app relies on them.

Yes. It includes active priced models from OpenAI, Anthropic, Google, Mistral, Cohere, xAI, Perplexity, DeepSeek, AWS Bedrock, Azure OpenAI, and Vertex AI based on the JSON source file.

Yes. If a model exposes cached-input pricing, the calculator applies your selected cache ratio. If not, it falls back to normal input pricing for those tokens.
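The fallback described here can be expressed as a blended input rate. The sketch below is an assumption about how such a fallback might look, not the calculator's actual code; `effective_input_rate` and the example rates are hypothetical, with rates in USD per 1M tokens.

```python
from typing import Optional


def effective_input_rate(
    input_rate: float,
    cached_ratio: float,
    cached_input_rate: Optional[float] = None,
) -> float:
    """Blend normal and cached input prices according to the cache ratio.

    When the model has no cached-input price (None), cached tokens are
    billed at the normal input rate, so the blend equals input_rate.
    """
    if cached_input_rate is None:
        cached_input_rate = input_rate
    return (1 - cached_ratio) * input_rate + cached_ratio * cached_input_rate


# Hypothetical rates: one model with a cached price, one without.
with_cache = effective_input_rate(10.0, 0.45, cached_input_rate=1.0)
without_cache = effective_input_rate(10.0, 0.45)
print(with_cache, without_cache)
```

For a model without cached pricing, moving the cache slider leaves the estimate unchanged, which is a quick way to tell which models expose a cached rate.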

Use it as a planning baseline. Final invoices vary with provider-specific details such as tool calls, storage, tiering, region, taxes, and model updates.

Related Resources

Consulting

AI Consultant Services

Get expert help with AI integration, conversion optimization, and experimentation strategy.

