Transparent by Design

How We Calculate Your AI Costs

Every cent traced. No estimates, no surprises. Here is the exact math behind every cost we report — for every provider we support.

The Formula

Four steps, applied to every API call.

1. Base Token Cost

Fresh input tokens and output tokens, priced per model.

```
regularInput = inputTokens − cachedTokens − cacheWriteTokens
inputCost    = regularInput × inputPrice / 1M
outputCost   = outputTokens × outputPrice / 1M
```
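The step above can be sketched in Python; prices are dollars per million tokens (MTok), and the function name is illustrative rather than part of any real API.

```python
# Step 1 sketch: base token cost for the uncached portion of a call.
# Prices are in dollars per million tokens (MTok).
def base_token_cost(input_tokens: int, cached_tokens: int,
                    cache_write_tokens: int, output_tokens: int,
                    input_price: float, output_price: float) -> tuple[float, float]:
    regular_input = input_tokens - cached_tokens - cache_write_tokens
    input_cost = regular_input * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost, output_cost
```

With 12,000 input tokens (8,000 cached, 2,000 cache-write) and 500 output tokens at $3.00 / $15.00 per MTok, this yields $0.006 input and $0.0075 output.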

2. Cache Adjustments

Cached reads are cheaper. Cache writes may cost a premium.

```
cacheReadCost  = cachedTokens × inputPrice × readMultiplier / 1M
cacheWriteCost = writeTokens × inputPrice × writeMultiplier / 1M
```
| Provider | Cache Read | Cache Write (5m) | Cache Write (1h) |
|---|---|---|---|
| Anthropic | 0.1× | 1.25× | 2.0× |
| OpenAI (GPT-4.1 / o3 / o4-mini) | 0.25× | Free (automatic) | Free (automatic) |
| OpenAI (GPT-4o / o1) | 0.5× | Free (automatic) | Free (automatic) |
| Google | 0.1× | Storage-based (per hour) | Storage-based (per hour) |
| Groq | 0.5× | Free (automatic) | Free (automatic) |
| Mistral / Cohere | No caching available | n/a | n/a |

OpenAI and Groq cache writes are automatic at no extra cost. Google charges for cache storage per hour separately. Mistral and Cohere do not offer prompt caching.
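A sketch of the cache adjustment, with the multipliers from the table above hard-coded; the provider keys and lookup shape are assumptions for illustration, not the real schema.

```python
# Step 2 sketch: cache read/write costs. Multipliers mirror the table
# above; unknown providers fall back to 0 (no caching).
READ_MULT = {"anthropic": 0.1, "google": 0.1,
             "openai-gpt4.1": 0.25, "openai-gpt4o": 0.5, "groq": 0.5}
WRITE_MULT = {("anthropic", "5m"): 1.25, ("anthropic", "1h"): 2.0}

def cache_costs(cached_tokens, write_tokens, input_price,
                provider, cache_ttl="5m"):
    read = cached_tokens * input_price * READ_MULT.get(provider, 0.0) / 1_000_000
    write = write_tokens * input_price * WRITE_MULT.get((provider, cache_ttl), 0.0) / 1_000_000
    return read, write
```

For 8,000 cached reads and 2,000 cache writes at $3.00 / MTok on Anthropic with a 1h TTL, this returns $0.0024 and $0.0120.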

3. Mode Multipliers

Applied to the combined token cost from steps 1 and 2.

```
tokenCost = inputCost + cacheReadCost + cacheWriteCost + outputCost
```

| Condition | Multiplier | When |
|---|---|---|
| Batch API | 0.5× | is_batch_api: true |
| Long Context (Anthropic) | 2.0× input, 1.5× output | input_tokens > 200,000 |
| Long Context (Google Pro) | 2.0× input, 2.0× output | input_tokens > 200,000 (Flash exempt) |
| Fast Mode (Anthropic) | 6.0× | is_fast_mode: true |

Long context applies separately to input and output costs before they are summed. Other providers do not have documented long-context surcharges.
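The multiplier logic can be sketched as below. Long context scales the input and output sides separately before summing, as noted above; how the batch and fast-mode multipliers stack afterwards is my assumption for illustration.

```python
# Step 3 sketch: long context scales input/output costs separately
# before summing; batch and fast-mode multipliers then apply to the
# sum. The batch/fast-mode stacking order is an assumption.
def apply_modes(input_side_cost, output_cost, provider, model="",
                input_tokens=0, is_batch_api=False, is_fast_mode=False):
    if input_tokens > 200_000:
        if provider == "anthropic":
            input_side_cost *= 2.0
            output_cost *= 1.5
        elif provider == "google" and "flash" not in model:
            input_side_cost *= 2.0
            output_cost *= 2.0
    token_cost = input_side_cost + output_cost
    if is_batch_api:
        token_cost *= 0.5
    if is_fast_mode:
        token_cost *= 6.0
    return token_cost
```

With the Gemini 2.5 Pro numbers from the worked examples ($0.3125 input, $0.02 output, 250K input tokens), this returns $0.665.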

4. Tool Fees

Flat per-call fees added on top of token costs. Varies by provider.

```
toolCost  = webSearchCount × searchFee(provider)
totalCost = tokenCost + toolCost
```

| Provider | Web Search (per call) | Web Fetch |
|---|---|---|
| OpenAI | $0.010 | Free |
| Anthropic | $0.010 | Free |
| Google | $0.014 | Free |
| Groq | $0.005 | Free |
| Mistral / Cohere | n/a | n/a |

Web fetch is free for all providers — you only pay the token cost for fetched content.
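The final step can be sketched with the fee table above hard-coded; providers without web search fall back to a zero fee.

```python
# Step 4 sketch: flat per-call web search fees; web fetch adds nothing
# beyond the token cost of the fetched content.
SEARCH_FEE = {"openai": 0.010, "anthropic": 0.010,
              "google": 0.014, "groq": 0.005}

def total_cost(token_cost, provider, web_search_count=0):
    tool_cost = web_search_count * SEARCH_FEE.get(provider, 0.0)
    return token_cost + tool_cost
```

Two Anthropic searches on top of a $0.0279 token cost give $0.0479, matching the worked example below.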

What If You Don't Send a Field?

Every optional field has a safe default. You only need to send what you know.

| Field | Default | Effect |
|---|---|---|
| input_tokens_cached | 0 | All input tokens billed at full price |
| input_tokens_cache_write | 0 | No cache write premium applied |
| cache_ttl | "5m" | Uses 1.25× write multiplier (not 2.0×) |
| web_search_count | 0 | No search fees added |
| web_fetch_count | 0 | No fees; web fetch is free (only token costs) |
| is_batch_api | false | No 50% discount applied |
| is_fast_mode | false | No 6× multiplier applied |
| cost_usd | auto-calculated | Server computes cost from token counts + model price |
| resolved_model | null | Falls back to the model field for price lookup |

If the model is not found in our price database, cost_usd is left blank. You can always override by sending your own cost_usd value.
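One way to picture the defaults above is a simple dict merge over an incoming record; this is illustrative only, not the server's actual implementation.

```python
# Defaults from the table above; any field the client sends wins.
DEFAULTS = {
    "input_tokens_cached": 0,
    "input_tokens_cache_write": 0,
    "cache_ttl": "5m",
    "web_search_count": 0,
    "web_fetch_count": 0,
    "is_batch_api": False,
    "is_fast_mode": False,
    "resolved_model": None,
}

def with_defaults(record: dict) -> dict:
    # Later entries override earlier ones, so client fields take priority.
    return {**DEFAULTS, **record}
```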

Worked Examples

Real numbers for Anthropic, OpenAI, and Google.

claude-sonnet-4-5 · Anthropic · Cache TTL: 1h

$3.00 / MTok input • $15.00 / MTok output • cache read 0.1× • cache write 2.0× (1h TTL)

Inputs

input_tokens: 12,000
cache_read_input_tokens: 8,000
cache_creation_input_tokens: 2,000
output_tokens: 500
cache_ttl: "1h"
web_search_count: 2

Calculation

```
regularInput   = 12,000 − 8,000 − 2,000 = 2,000
inputCost      = 2,000 × $3.00 / 1M        = $0.006000
cacheReadCost  = 8,000 × $3.00 × 0.1 / 1M  = $0.002400
cacheWriteCost = 2,000 × $3.00 × 2.0 / 1M  = $0.012000
outputCost     = 500 × $15.00 / 1M         = $0.007500

tokenCost = $0.006000 + $0.002400 + $0.012000 + $0.007500 = $0.027900
toolCost  = 2 × $0.01 = $0.020000
totalCost = $0.027900 + $0.020000 = $0.047900
```

Total cost for this call: $0.0479
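The same arithmetic in runnable Python, using the prices listed above:

```python
# Re-deriving the Anthropic example: $3/MTok input, $15/MTok output,
# 0.1x cache read, 2.0x cache write (1h TTL), $0.01 per web search.
M = 1_000_000
regular_input = 12_000 - 8_000 - 2_000     # 2,000 fresh tokens
input_cost    = regular_input * 3.00 / M   # 0.0060
cache_read    = 8_000 * 3.00 * 0.1 / M     # 0.0024
cache_write   = 2_000 * 3.00 * 2.0 / M     # 0.0120
output_cost   = 500 * 15.00 / M            # 0.0075
token_cost = input_cost + cache_read + cache_write + output_cost
total = token_cost + 2 * 0.010             # two web searches
print(f"{total:.4f}")                      # 0.0479
```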
gpt-4.1 · OpenAI · 75% cache discount

$2.00 / MTok input • $8.00 / MTok output • cache read 0.25× (GPT-4.1 family)

Inputs

input_tokens: 50,000
input_tokens_cached: 40,000
output_tokens: 1,000
web_search_count: 1

Calculation

```
regularInput  = 50,000 − 40,000 = 10,000
inputCost     = 10,000 × $2.00 / 1M         = $0.020000
cacheReadCost = 40,000 × $2.00 × 0.25 / 1M  = $0.020000
outputCost    = 1,000 × $8.00 / 1M          = $0.008000

tokenCost = $0.020000 + $0.020000 + $0.008000 = $0.048000
toolCost  = 1 × $0.01 = $0.010000
totalCost = $0.048000 + $0.010000 = $0.058000
```

Total cost for this call: $0.0580
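The same calculation in runnable form, with the prices listed above:

```python
# Re-deriving the GPT-4.1 example: $2/MTok input, $8/MTok output,
# 0.25x cache read, $0.01 per web search.
M = 1_000_000
regular_input = 50_000 - 40_000            # 10,000 fresh tokens
input_cost    = regular_input * 2.00 / M   # 0.0200
cache_read    = 40_000 * 2.00 * 0.25 / M   # 0.0200
output_cost   = 1_000 * 8.00 / M           # 0.0080
token_cost = input_cost + cache_read + output_cost
total = token_cost + 1 * 0.010             # one web search
print(f"{total:.4f}")                      # 0.0580
```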
gemini-2.5-pro · Google · Long Context > 200K

$1.25 / MTok input • $10.00 / MTok output • long context: 2.0× input, 2.0× output

Inputs

input_tokens: 250,000
output_tokens: 2,000

Calculation

```
inputCost  = 250,000 × $1.25 / 1M = $0.312500
outputCost = 2,000 × $10.00 / 1M  = $0.020000

Long context (>200K): 2× input, 2× output
inputCost  = $0.312500 × 2.0 = $0.625000
outputCost = $0.020000 × 2.0 = $0.040000

totalCost = $0.625000 + $0.040000 = $0.665000
```

Total cost for this call: $0.6650
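And the long-context case in runnable form, with the prices listed above:

```python
# Re-deriving the Gemini 2.5 Pro example: $1.25/MTok input, $10/MTok
# output, with the >200K long-context surcharge (2x input, 2x output).
M = 1_000_000
input_cost  = 250_000 * 1.25 / M           # 0.3125
output_cost = 2_000 * 10.00 / M            # 0.0200
input_cost  *= 2.0                         # long context: 0.6250
output_cost *= 2.0                         # long context: 0.0400
total = input_cost + output_cost
print(f"{total:.4f}")                      # 0.6650
```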

Provider Comparison

How pricing features differ across the six providers we track.

| Feature | OpenAI | Anthropic | Google | Groq | Mistral | Cohere |
|---|---|---|---|---|---|---|
| Cache Read Discount | 50–75% | 90% | ~90% | 50% | n/a | n/a |
| Cache Write Premium | None | 1.25–2.0× | Per-hour storage | None | n/a | n/a |
| Batch API | 50% off | 50% off | 50% off | 50% off | n/a | n/a |
| Long Context | n/a | 2× / 1.5× >200K | 2× / 2× >200K (Pro only) | n/a | n/a | n/a |
| Web Search Fee | $0.010 | $0.010 | $0.014 | $0.005 | n/a | n/a |
| Fast Mode | n/a | 6× | n/a | n/a | n/a | n/a |
| Thinking Tokens | Output rate | Output rate | Output rate | n/a | n/a | n/a |

What's the Same

  • All providers use per-token pricing (input + output)
  • Batch API is universally 50% off where available
  • Thinking/reasoning tokens are billed at the output token rate
  • Web fetch is free for all providers

What's Different

  • Cache discounts range from 50% (OpenAI/Groq) to 90% (Anthropic/Google)
  • Only Anthropic charges a cache write premium (1.25–2.0×)
  • Long-context surcharges only apply to Anthropic and Google Pro (>200K tokens; Flash models exempt)
  • Web search fees range from $0.005 (Groq) to $0.014 (Google) per call
  • Fast mode (6×) is an Anthropic-only feature

Minimum Required Fields

To get an accurate auto-calculated cost, you need just these fields.

```json
{
  "provider": "anthropic",
  "model": "claude-sonnet-4-5",
  "input_tokens": 12000,
  "output_tokens": 500,
  "latency_ms": 1200,
  "timestamp": "2026-03-08T10:00:00Z",
  "tags": {
    "task_type": "summarize",
    "feature": "support_summary",
    "route": "POST /api/summary"
  }
}
```

With just these fields, AISpendGuard looks up the model price and calculates the cost automatically. Add optional fields for more accuracy.
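Submitting that record from Python might look like the sketch below; the endpoint URL is a placeholder, since the real ingestion endpoint and auth scheme are not shown in this document.

```python
import json
import urllib.request

# Minimal record from above: token counts, model, and metadata only.
record = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-5",
    "input_tokens": 12000,
    "output_tokens": 500,
    "latency_ms": 1200,
    "timestamp": "2026-03-08T10:00:00Z",
    "tags": {
        "task_type": "summarize",
        "feature": "support_summary",
        "route": "POST /api/summary",
    },
}

req = urllib.request.Request(
    "https://example.invalid/v1/usage",  # placeholder endpoint
    data=json.dumps(record).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment against a real endpoint
```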