Comparison · March 22, 2026 · 8 min read

AI Model Pricing Compared: OpenAI vs Anthropic vs Google — Which Saves You More?

Side-by-side pricing for every major AI model in March 2026. Updated monthly.



Choosing the right AI model isn't just about quality — it's about cost. A 10x pricing difference between models that perform similarly on your task means you're either saving thousands or burning them.

This guide compares every major AI model's pricing as of March 2026, organized by use case so you can find the cheapest option that meets your quality bar.

We update this page monthly. Last update: March 22, 2026.


The Full Pricing Table

Flagship Models (Best Quality)

| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
| --- | --- | --- | --- | --- | --- |
| OpenAI | GPT-4o | $2.50 | $10.00 | 128K | General-purpose, creative, analysis |
| OpenAI | GPT-4.1 | $2.00 | $8.00 | 1M | Long-context coding, document analysis |
| OpenAI | o3 | $2.00 | $8.00 | 200K | Complex reasoning, math, planning |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | 200K | Nuanced writing, analysis, coding |
| Anthropic | Claude Opus 4.6 | $5.00 | $25.00 | 200K | Hardest tasks, research, long reasoning |
| Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Multimodal, long context, cost-efficient |
| Google | Gemini 3.1 Pro | $2.00 | $12.00 | 200K | Latest quality tier |
| Mistral | Mistral Large | $2.00 | $6.00 | 128K | Multilingual, EU-hosted option |
| Cohere | Command-A | $2.50 | $10.00 | 128K | RAG, enterprise search |

Mid-Tier Models (Good Balance)

| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
| --- | --- | --- | --- | --- | --- |
| OpenAI | GPT-4.1-mini | $0.40 | $1.60 | 1M | Long docs at low cost |
| OpenAI | o3-mini | $1.10 | $4.40 | 200K | Reasoning at lower cost |
| OpenAI | o4-mini | $1.10 | $4.40 | 200K | Latest reasoning, mid-tier |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Fast, capable, affordable |
| Google | Gemini 2.5 Flash | $0.30 | $2.50 | 1M | Fast responses, 1M context |
| Mistral | Mistral Medium | $0.40 | $2.00 | 128K | Balanced European option |

Budget Models (Cheapest)

| Provider | Model | Input/1M tokens | Output/1M tokens | Context | Best For |
| --- | --- | --- | --- | --- | --- |
| OpenAI | GPT-4o-mini | $0.15 | $0.60 | 128K | Classification, extraction, simple gen |
| OpenAI | GPT-4.1-nano | $0.10 | $0.40 | 1M | Cheapest OpenAI with huge context |
| Anthropic | Claude 3.5 Haiku | $0.80 | $4.00 | 200K | Budget Anthropic |
| Anthropic | Claude 3 Haiku | $0.25 | $1.25 | 200K | Cheapest Anthropic |
| Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Ultra-cheap with 1M context |
| Google | Gemini 2.5 Flash Lite | $0.10 | $0.40 | 1M | Latest budget tier |
| Mistral | Mistral Small | $0.10 | $0.30 | 128K | Cheapest Mistral |
| Mistral | Mistral Nemo | $0.02 | $0.04 | 128K | Ultra-budget tasks |
| Groq | Llama 3.3 70B | $0.59 | $0.79 | 128K | Fast inference, open weights |
| Groq | Llama 3.1 8B | $0.05 | $0.08 | 128K | Fastest, cheapest hosted |

Cost Per Task: Real Comparisons

Let's compare what common tasks actually cost across providers.
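Every per-call figure below comes from the same arithmetic: tokens multiplied by the per-million price. Here is a minimal Python sketch of that formula (the `PRICES` values are copied from the tables above; the dictionary and helper function are illustrative, not any provider's SDK):

```python
# Prices in USD per 1M tokens, (input, output), from the tables above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-sonnet-4.6": (3.00, 15.00),
    "gemini-2.5-pro": (1.25, 10.00),
    "mistral-nemo": (0.02, 0.04),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one API call: tokens x price / 1M, input plus output."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1e6

# Task 1 below: classify a support ticket (300 input tokens, 5 output tokens)
flagship = call_cost("gpt-4o", 300, 5)        # $0.0008 per call
budget = call_cost("mistral-nemo", 300, 5)    # about $0.000006 per call
```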

Task 1: Classify a support ticket (300 input tokens, 5 output tokens)

| Model | Cost Per Call | Cost at 10K/month |
| --- | --- | --- |
| GPT-4o | $0.0008 | $8.00 |
| Claude Sonnet 4.6 | $0.0010 | $9.75 |
| Gemini 2.5 Pro | $0.0004 | $4.25 |
| GPT-4o-mini | $0.00005 | $0.48 |
| Gemini 2.0 Flash | $0.00003 | $0.32 |
| Mistral Nemo | $0.000006 | $0.06 |

Winner: Mistral Nemo at $0.06/month, if its quality is sufficient for your tickets. GPT-4o-mini at $0.48/month is a safe bet. Spending $8.00/month on GPT-4o for classification means paying roughly 130x more than Mistral Nemo.

Task 2: Summarize a 5,000-word document (7,500 input tokens, 300 output tokens)

| Model | Cost Per Call | Cost at 500/month |
| --- | --- | --- |
| GPT-4o | $0.022 | $10.88 |
| Claude Sonnet 4.6 | $0.027 | $13.50 |
| Gemini 2.5 Pro | $0.012 | $6.19 |
| GPT-4o-mini | $0.001 | $0.65 |
| Gemini 2.5 Flash | $0.003 | $1.50 |
| GPT-4.1-nano | $0.001 | $0.45 |

Winner: GPT-4.1-nano at $0.45/month for simple summaries. Gemini 2.5 Flash at $1.50/month for better quality. Using a flagship model costs 10-30x more.

Task 3: Chatbot conversation (20 turns, ~50K input tokens total, ~4K output tokens)

| Model | Cost Per Conversation | Cost at 1K conversations/month |
| --- | --- | --- |
| GPT-4o | $0.165 | $165.00 |
| Claude Sonnet 4.6 | $0.210 | $210.00 |
| Gemini 2.5 Pro | $0.103 | $102.50 |
| GPT-4o-mini | $0.010 | $9.90 |
| Gemini 2.5 Flash | $0.025 | $25.00 |

Winner: GPT-4o-mini at $9.90/month. Claude Sonnet costs 21x more for a general chatbot.

Task 4: Code generation (2K input tokens, 1K output tokens)

| Model | Cost Per Call | Cost at 5K/month |
| --- | --- | --- |
| GPT-4o | $0.015 | $75.00 |
| GPT-4.1 | $0.012 | $60.00 |
| Claude Sonnet 4.6 | $0.021 | $105.00 |
| Claude Opus 4.6 | $0.035 | $175.00 |
| Gemini 2.5 Pro | $0.013 | $62.50 |
| GPT-4.1-mini | $0.0024 | $12.00 |

Winner: GPT-4.1-mini at $12.00/month for most code tasks. GPT-4.1 at $60/month when quality matters. Claude Opus at $175/month only for the hardest problems.


Hidden Cost Multipliers

The base prices above don't tell the full story. These multipliers significantly affect your actual bill:

Cache Discounts (save 50-90%)

| Provider | Cache Read Discount | How It Works |
| --- | --- | --- |
| Anthropic | 90% off | Explicit cache control headers; 5-min TTL |
| OpenAI | 50% off | Automatic for prompts >1,024 tokens with matching prefix |
| Google | 90% off | Automatic caching |
| Groq | 50% off | Automatic caching |
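As a rough sketch of how a cache-read discount changes the bill on a reused prompt prefix (the helper function is illustrative, and this ignores the cache-write premium some providers charge on the first request):

```python
def cached_input_cost(total_tokens: int, cached_tokens: int,
                      price_per_m: float, cache_discount: float) -> float:
    """USD input cost when `cached_tokens` of the prompt are read from cache.

    cache_discount comes from the table above: 0.90 for Anthropic/Google,
    0.50 for OpenAI/Groq.
    """
    fresh = total_tokens - cached_tokens
    cached_price = price_per_m * (1 - cache_discount)
    return (fresh * price_per_m + cached_tokens * cached_price) / 1e6

# A 10K-token system prompt reused on Claude Sonnet 4.6 ($3.00/1M input):
cold = cached_input_cost(12_000, 0, 3.00, 0.90)       # $0.036, no cache hit
warm = cached_input_cost(12_000, 10_000, 3.00, 0.90)  # $0.009, prefix cached
# The warm call's input is 75% cheaper than the cold call's.
```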

Batch API Discount (save 50%)

| Provider | Batch Discount | Turnaround |
| --- | --- | --- |
| OpenAI | 50% off | 24 hours |
| Anthropic | 50% off | 24 hours |

Long-Context Surcharge (pay more)

| Provider | When | Surcharge |
| --- | --- | --- |
| Google (Pro models) | Input >200K tokens | 2x on input AND output |
| Google (Flash models) | Never | No surcharge |
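The Pro-model surcharge can be sketched as a simple multiplier. This is a simplification of the rule in the table above (actual Google billing may tier the rates differently past the threshold; the function name and defaults are illustrative):

```python
def gemini_pro_cost(input_tokens: int, output_tokens: int,
                    in_price: float = 1.25, out_price: float = 10.00,
                    threshold: int = 200_000) -> float:
    """USD cost for a Gemini 2.5 Pro call, doubling BOTH input and output
    rates when the input exceeds the 200K-token threshold (per the table)."""
    multiplier = 2 if input_tokens > threshold else 1
    return (input_tokens * in_price + output_tokens * out_price) * multiplier / 1e6

below = gemini_pro_cost(150_000, 2_000)  # $0.2075, no surcharge
above = gemini_pro_cost(250_000, 2_000)  # $0.665, 2x applies to the whole call
```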

Reasoning Token Overhead

Models like o1, o3, and o4-mini use "thinking tokens" that count as output tokens but aren't shown in the response. Your actual output token bill can be 2-10x higher than the visible response length.
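A back-of-envelope way to budget for this overhead (the 5x multiplier below is an assumed value inside the 2-10x range above; for real numbers, check the token counts your provider reports in the API response's usage data):

```python
def reasoning_output_bill(visible_tokens: int, thinking_multiplier: float,
                          out_price_per_m: float) -> float:
    """Estimated USD output bill once hidden thinking tokens are included.

    thinking_multiplier is a guess; actual overhead varies per request.
    """
    billed_tokens = visible_tokens * thinking_multiplier
    return billed_tokens * out_price_per_m / 1e6

# o3 at $8.00/1M output, a 500-token visible answer, assumed 5x overhead:
estimate = reasoning_output_bill(500, 5, 8.00)  # $0.02, vs the $0.004
                                                # the visible text suggests
```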


Provider Comparison: Beyond Price

OpenAI

  • Strength: Widest model range (nano to o3), largest ecosystem, best batch API
  • Weakness: No EU hosting, output tokens are expensive (4x input)
  • Best value: GPT-4o-mini ($0.15/$0.60) and GPT-4.1-nano ($0.10/$0.40)

Anthropic

  • Strength: Best prompt caching (90% off), strong code/analysis quality
  • Weakness: Fewer budget models, expensive flagship (Opus $5/$25)
  • Best value: Claude 3 Haiku ($0.25/$1.25) for simple tasks with caching

Google

  • Strength: Cheapest flagship (Gemini 2.5 Pro $1.25 input), huge context windows (1M tokens), 90% cache discount
  • Weakness: Long-context surcharge on Pro models doubles the price past 200K tokens
  • Best value: Gemini 2.0 Flash ($0.10/$0.40) — incredible value with 1M context

Mistral

  • Strength: EU-hosted, competitive pricing, excellent multilingual support
  • Weakness: Smaller ecosystem, fewer integrations
  • Best value: Mistral Nemo ($0.02/$0.04) — cheapest hosted inference available

Groq

  • Strength: Fastest inference (sub-second), runs open-weight models
  • Weakness: Limited model selection (Llama, Mixtral only)
  • Best value: Llama 3.1 8B ($0.05/$0.08) — fastest and cheapest option

Decision Framework: Which Model Should You Use?

Is the user waiting for a real-time response?
├── No → Use Batch API (50% off any model)
└── Yes
    ├── Does it need complex reasoning/creativity?
    │   ├── Yes → GPT-4o, Claude Sonnet, or Gemini Pro ($1.25-3.00/1M in)
    │   └── No
    │       ├── Does it need >128K context?
    │       │   ├── Yes → GPT-4.1-nano ($0.10/1M) or Gemini 2.0 Flash ($0.10/1M)
    │       │   └── No → GPT-4o-mini ($0.15/1M) or Mistral Small ($0.10/1M)
    │       └── Is it classification/extraction only?
    │           └── Yes → Mistral Nemo ($0.02/1M) or Groq Llama 8B ($0.05/1M)
    └── Does it need EU hosting?
        └── Yes → Mistral models (EU-native)
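The tree above, expressed as a small Python helper. This is purely illustrative: the function name and flags are made up for this sketch, the returned strings are this page's recommendations, and the EU-hosting check is hoisted to the top since it overrides the other branches:

```python
def pick_model(realtime: bool, complex_reasoning: bool, long_context: bool,
               classification_only: bool, eu_hosting: bool = False) -> str:
    """Map the decision tree above onto a model recommendation."""
    if eu_hosting:
        return "Mistral models (EU-native)"
    if not realtime:
        return "any model via Batch API (50% off)"
    if complex_reasoning:
        return "GPT-4o, Claude Sonnet, or Gemini Pro"
    if classification_only:
        return "Mistral Nemo or Groq Llama 3.1 8B"
    if long_context:
        return "GPT-4.1-nano or Gemini 2.0 Flash"
    return "GPT-4o-mini or Mistral Small"

# A latency-insensitive nightly summarization job:
pick_model(realtime=False, complex_reasoning=False,
           long_context=True, classification_only=False)
# -> "any model via Batch API (50% off)"
```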

Track It All in One Place

If you're using multiple models across multiple providers, your cost data is scattered across 3-4 different dashboards. Each gives you one number with no breakdown.

AISpendGuard unifies all your AI spend — OpenAI, Anthropic, Google, Mistral, Cohere, Groq — into a single dashboard with cost attribution by feature, customer, model, and environment. It automatically detects when you're using an expensive model where a cheaper one would work.

Free tier: 50,000 events/month. No credit card. Tags only — we never store your prompts.

Compare your AI costs for free →


This pricing data is updated monthly. All prices are from official provider pricing pages, verified March 2026. Prices shown in USD per 1 million tokens.


Want to track your AI spend automatically?

AISpendGuard detects waste patterns, breaks down costs by feature, and recommends specific changes with $/mo savings estimates.