API Marketplace

Compare LLM API providers by model, input/output token cost, and features.

Filters
✦ Free Tier Only: Pros & Cons
Pros: No credit card required on most plans · Great for prototyping and learning · Zero cost to get started.
Cons: Limited resources (CPU, RAM, bandwidth) · Often includes usage caps or sleep/spin-up delays · Not suitable for production traffic · May require upgrade without warning.
OpenAI ✓ Verified LLM API
OpenAI API · Text & multimodal models priced per token.

gpt-4o
Flagship multimodal model
In $2.50 Cached $1.250 Out $10.00
varies
per 1M tokens
gpt-4o-mini
Fast and cheap for many workloads
In $0.15 Cached $0.075 Out $0.60
varies
per 1M tokens
gpt-4.1
Strong reasoning and coding
In $3.00 Cached $0.750 Out $12.00
varies
per 1M tokens
gpt-4.1-mini
Budget reasoning and coding
In $0.80 Cached $0.200 Out $3.20
varies
per 1M tokens
o1
Advanced reasoning with chain-of-thought
In $15.00 Cached $7.500 Out $60.00
varies
per 1M tokens
o3-mini
Cost-effective reasoning model
In $1.10 Cached $0.550 Out $4.40
varies
per 1M tokens
gpt-3.5-turbo
Budget workhorse for simple tasks
In $0.50 Out $1.50
varies
per 1M tokens
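Per-1M-token prices like those above translate directly into per-request costs. A minimal sketch of the arithmetic in Python, using the gpt-4o-mini rates from the listing (the helper name is illustrative, not part of any SDK):

```python
# Estimate the cost of a single request from per-1M-token prices.
# Rates below are the gpt-4o-mini figures from the table above.
PRICE_PER_M = {"input": 0.15, "output": 0.60}  # USD per 1M tokens

def request_cost(input_tokens: int, output_tokens: int,
                 prices: dict = PRICE_PER_M) -> float:
    """Return the USD cost of one request."""
    return (input_tokens * prices["input"]
            + output_tokens * prices["output"]) / 1_000_000

# A 2,000-token prompt with a 500-token completion:
cost = request_cost(2_000, 500)
print(f"${cost:.6f}")  # → $0.000600
```

At these rates, a million such requests would cost about $600.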
Anthropic ✓ Verified LLM API
Claude API · Claude models priced per token, plus caching options.

Claude Opus 4.6
Most capable Claude model; complex reasoning and long context
In $5.00 Cached $0.500 Out $25.00
varies
per 1M tokens
Claude Sonnet 4.6
Balanced performance and cost
In $3.00 Cached $0.300 Out $15.00
varies
per 1M tokens
Claude Haiku 4.5
Fastest and cheapest Claude model
In $1.00 Cached $0.100 Out $5.00
varies
per 1M tokens
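The "Cached" column changes the effective input rate: tokens served from the prompt cache bill at the discounted price. A rough sketch of the blended input cost using the Claude Sonnet 4.6 rates above (cache-write surcharges, which some providers bill separately, are ignored here):

```python
# Blended input cost when part of the prompt is served from cache.
# Claude Sonnet 4.6 rates from the table: $3.00/M input, $0.30/M cached.
INPUT_RATE = 3.00    # USD per 1M fresh input tokens
CACHED_RATE = 0.30   # USD per 1M cache-read tokens

def blended_input_cost(total_tokens: int, cache_hit_ratio: float) -> float:
    """Input cost in USD when `cache_hit_ratio` of the prompt hits the cache."""
    cached = total_tokens * cache_hit_ratio
    fresh = total_tokens - cached
    return (fresh * INPUT_RATE + cached * CACHED_RATE) / 1_000_000

print(blended_input_cost(100_000, 0.0))  # no cache hits
print(blended_input_cost(100_000, 0.9))  # 90% cache hits
```

With a 90% cache-hit ratio, the 100k-token prompt's input cost drops from $0.30 to $0.057 per request, an 81% saving.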
Google ✓ Verified LLM API
Gemini Developer API · Gemini models with free tier and paid per-token pricing.

Free Tier
Free-of-charge usage with rate limits
Free tier is quota-limited; see official Gemini pricing page for current limits.
$0
tier
Gemini 2.5 Flash
Latest fast multimodal model — replaces 2.0 Flash
In $2.00 Cached $0.200 Out $12.00
varies
per 1M tokens
Gemini 2.5 Flash-Lite
Most cost-effective Gemini option
In $0.12 Cached $0.013 Out $0.75
varies
per 1M tokens
Gemini 2.0 Flash
Deprecating June 2026 — migrate to 2.5 Flash
In $0.12 Cached $0.013 Out $0.75
varies
per 1M tokens
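Free tiers like the one above are quota-limited, so client code usually wraps calls in retry logic for rate-limit responses. A generic sketch with exponential backoff and jitter; `call_api` and `RateLimitError` are placeholders, not names from any provider SDK:

```python
import random
import time

class RateLimitError(Exception):
    """Placeholder for a provider SDK's rate-limit (HTTP 429) exception."""

def with_backoff(call_api, max_retries: int = 5):
    """Retry `call_api` on rate-limit errors with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call_api()
        except RateLimitError:
            # 1s, 2s, 4s, ... plus up to 1s of random jitter
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError("still rate-limited after all retries")
```

Providers often return a Retry-After header on 429 responses; honoring it when present is preferable to blind backoff.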
xAI ✓ Verified LLM API
Grok API · Grok models with per-token pricing and cached input discounts.

grok-4
Frontier Grok model
In $2.00 Cached $0.750 Out $6.00
varies
per 1M tokens
grok-4.1-fast
Budget-tier fast model; 2M token context
In $0.20 Out $0.50
varies
per 1M tokens
grok-3
Strong reasoning and coding
In $3.00 Cached $0.750 Out $15.00
varies
per 1M tokens
grok-3-mini
Cost-effective reasoning model
In $0.30 Cached $0.075 Out $0.50
varies
per 1M tokens
DeepSeek LLM API
DeepSeek API · High-performance open-weight models with extremely low token pricing.

deepseek-chat (V3)
General-purpose chat model
In $0.14 Cached $0.014 Out $0.28
Pay-as-you-go
per 1M tokens
deepseek-reasoner (R1)
Chain-of-thought reasoning model
In $0.55 Cached $0.140 Out $2.19
Pay-as-you-go
per 1M tokens
Groq LLM API
Groq API · Ultra-fast LLM inference on custom LPU hardware.

Free Tier
Rate-limited free access for testing and prototyping
Free tier has rate limits; see Groq console for current limits.
$0
tier
llama-3.3-70b
Llama 3.3 70B — fast inference, strong quality
In $0.59 Out $0.79
varies
per 1M tokens
llama-3.1-8b
Very cheap, fastest option for simple tasks
In $0.05 Out $0.08
varies
per 1M tokens
mixtral-8x7b
Mixtral MoE — balanced speed and quality
In $0.24 Out $0.24
varies
per 1M tokens
gemma2-9b
Google Gemma 2 9B on Groq LPU
In $0.20 Out $0.20
varies
per 1M tokens
Mistral AI LLM API
Mistral API · European AI with strong multilingual and coding models.

mistral-small-3.1
Efficient general-purpose model
In $0.10 Out $0.30
varies
per 1M tokens
mistral-medium-3
Balanced performance and cost for demanding tasks
In $0.40 Out $1.20
varies
per 1M tokens
mistral-large-2
Top-tier Mistral model for complex tasks
In $2.00 Out $6.00
varies
per 1M tokens
codestral
Specialized for code generation and completion
In $0.10 Out $0.30
varies
per 1M tokens
Together AI LLM API
Together Inference API · Run popular open-source models at competitive prices.

Llama 3.3 70B Turbo
Meta Llama 3.3 70B — fast turbo variant
In $0.88 Out $0.88
varies
per 1M tokens
Llama 3.1 8B Turbo
Cheap and fast for high-volume tasks
In $0.18 Out $0.18
varies
per 1M tokens
Llama 3.1 405B
Largest open-source Llama model
In $3.50 Out $3.50
varies
per 1M tokens
Qwen2.5 72B Instruct
Strong multilingual and coding model
In $1.20 Out $1.20
varies
per 1M tokens
DeepSeek-R1
Open-source reasoning model via Together
In $3.00 Out $7.00
varies
per 1M tokens
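Since every listing is quoted per 1M tokens, a blended rate makes rough cross-provider comparison straightforward. A sketch assuming a 3:1 input-to-output token mix; the prices are copied from the tables above, and the selection is illustrative rather than exhaustive:

```python
# Blended USD per 1M tokens at a 3:1 input:output ratio.
# (input_price, output_price) per 1M tokens, from the listings above.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "claude-haiku-4.5": (1.00, 5.00),
    "gemini-2.5-flash-lite": (0.12, 0.75),
    "grok-3-mini": (0.30, 0.50),
    "deepseek-chat": (0.14, 0.28),
    "llama-3.1-8b (Groq)": (0.05, 0.08),
}

def blended(in_price: float, out_price: float, in_share: float = 0.75) -> float:
    """Weighted per-1M-token rate for a given input share of traffic."""
    return in_price * in_share + out_price * (1 - in_share)

# Cheapest first at this traffic mix:
for model, (i, o) in sorted(PRICES.items(), key=lambda kv: blended(*kv[1])):
    print(f"{model:24s} ${blended(i, o):.4f} per 1M tokens")
```

At this mix, llama-3.1-8b on Groq and deepseek-chat come out cheapest; a different input share can reorder the list.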
CloudMart AI
Find the right LLM API.
Describe your use case, token volume, and budget to get a recommendation. You can also ask general questions like "Grok vs GPT-4o" or "what model is best for coding."