The top of the LLM market has never been this competitive, or this confusing. OpenAI, Anthropic, and Google all shipped new frontier models this year, and all three price them differently enough that "which one is cheapest" has no one-word answer. It depends on what your workload actually does.
Here are the list prices we track, followed by the math for three representative workloads.
The sticker prices
| Model | Input / M tokens | Output / M tokens | Cached input |
|---|---|---|---|
| gpt-5.5 | $5.00 | $30.00 | $0.50 |
| Claude Opus 4.8 | $5.00 | $25.00 | $0.50 |
| Gemini 3.5 Flash | $1.50 | $9.00 | - |
Three things jump out. OpenAI and Anthropic have converged on identical input pricing ($5/M) and identical cached-input pricing ($0.50/M). Anthropic undercuts OpenAI on output by 17% ($25 vs $30). And Google isn't really playing the same game: Gemini 3.5 Flash is a fast frontier-tier model priced at roughly a third of the other two (30% of their input price, and 30% to 36% of their output price).
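For readers who want to check those percentages, here is a minimal sketch that recomputes them from the table. The `PRICES` dictionary and the helper names `undercut` and `ratio` are our own illustrative choices, not part of any provider's API.

```python
# List prices from the table above, in USD per million tokens.
PRICES = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
    "Gemini 3.5 Flash": {"input": 1.50, "output": 9.00},
}


def ratio(model: str, baseline: str, kind: str) -> float:
    """Price of `model` as a fraction of `baseline` for `kind` tokens."""
    return PRICES[model][kind] / PRICES[baseline][kind]


def undercut(model: str, baseline: str, kind: str) -> float:
    """Fraction by which `model` is cheaper than `baseline`."""
    return 1 - ratio(model, baseline, kind)


opus_output_discount = undercut("Claude Opus 4.8", "gpt-5.5", "output")
print(f"Opus 4.8 output discount vs gpt-5.5: {opus_output_discount:.0%}")

for baseline in ("gpt-5.5", "Claude Opus 4.8"):
    for kind in ("input", "output"):
        r = ratio("Gemini 3.5 Flash", baseline, kind)
        print(f"Gemini 3.5 Flash {kind} price vs {baseline}: {r:.0%}")
```

The Gemini output ratio against Opus 4.8 is the one figure that lands above a third, because Anthropic's output price is already the lower of the two.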
Above this tier sit the specialty models: gpt-5.5-pro at $30/$180 and Claude Fable 5 at $10/$50, both aimed at the hardest reasoning work where model quality dominates cost. Most production workloads never need them.
Workload 1: customer support bot
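Say 50,000 conversations a month, averaging 400K input and 80K output tokens per thousand conversations. That comes to about 20M input and 4M output tokens monthly. The system prompt is large and reusable, so assume half the input (10M tokens) is served from cache. Gemini 3.5 Flash has no cached rate in our table, so all 20M of its input is billed at the full price.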
| Model | Input | Cached | Output | Monthly total |
|---|---|---|---|---|
| Gemini 3.5 Flash | $30.00 | - | $36.00 | $66 |
| Claude Opus 4.8 | $50.00 | $5.00 | $100.00 | $155 |
| gpt-5.5 | $50.00 | $5.00 | $120.00 | $175 |
Gemini wins by a wide margin, and honestly, a support bot rarely needs frontier reasoning at all. Drop to Claude Haiku 4.5 ($1/$5) or gpt-5.4-mini ($0.75/$4.50) and this workload costs under $45. The frontier models are here for comparison, not as the recommendation.
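The support-bot arithmetic can be reproduced with a small cost function. This is a sketch under stated assumptions: the `Price` class and `monthly_cost` are hypothetical names, volumes are in millions of tokens, and a model with no listed cached rate is billed at its full input price for every input token. The mid-tier models are costed with no caching at all, since only their input and output prices are given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    input: float                    # USD per million fresh input tokens
    output: float                   # USD per million output tokens
    cached: Optional[float] = None  # USD per million cached tokens, if listed


def monthly_cost(price: Price, input_m: float, output_m: float,
                 cached_share: float = 0.0) -> float:
    """Monthly bill in USD for volumes given in millions of tokens."""
    if price.cached is None:
        # Assumption: no cached rate listed, so all input pays full price.
        cached_m, cached_rate = 0.0, 0.0
    else:
        cached_m, cached_rate = input_m * cached_share, price.cached
    fresh_m = input_m - cached_m
    return fresh_m * price.input + cached_m * cached_rate + output_m * price.output


MODELS = {
    "Gemini 3.5 Flash": Price(1.50, 9.00),
    "Claude Opus 4.8": Price(5.00, 25.00, cached=0.50),
    "gpt-5.5": Price(5.00, 30.00, cached=0.50),
    "Claude Haiku 4.5": Price(1.00, 5.00),
    "gpt-5.4-mini": Price(0.75, 4.50),
}

# Workload 1: 20M input (half cacheable), 4M output.
support_bot = {
    name: monthly_cost(price, input_m=20, output_m=4, cached_share=0.5)
    for name, price in MODELS.items()
}

for name, cost in sorted(support_bot.items(), key=lambda item: item[1]):
    print(f"{name:18s} ${cost:,.2f}")
```

Changing `input_m`, `output_m`, and `cached_share` is enough to rerun the comparison for a different traffic profile.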
Workload 2: coding agent
Agents are output-heavy and context-heavy: think 60M input tokens (45M of them cached repository context) and 15M output tokens a month for a small team's daily driver. Caching keeps the input bill small, so output pricing decides this one.
| Model | Fresh input (15M) | Cached (45M) | Output (15M) | Monthly total |
|---|---|---|---|---|
| Claude Opus 4.8 | $75.00 | $22.50 | $375.00 | $473 |
| gpt-5.5 | $75.00 | $22.50 | $450.00 | $548 |
| Gemini 3.5 Flash | $90.00 (all 60M, no cached rate) | - | $135.00 | $225 |
Gemini is again cheapest on paper, but coding is where the Claude and GPT frontier models earn their premium - long-horizon agent work is their marquee benchmark. Between those two, Anthropic's cheaper output saves about $75/mo at this volume, roughly 14%.
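To see where the coding-agent money goes, the sketch below splits each bill into its components and computes the Opus-versus-gpt-5.5 saving. The `RATES` layout and the `breakdown` helper are illustrative names of ours. As in the table, a model without a listed cached rate is assumed to pay full input price on all 60M input tokens.

```python
# (input, cached input or None, output) in USD per million tokens.
RATES = {
    "Claude Opus 4.8": (5.00, 0.50, 25.00),
    "gpt-5.5": (5.00, 0.50, 30.00),
    "Gemini 3.5 Flash": (1.50, None, 9.00),
}

# Workload 2 volumes, in millions of tokens per month.
FRESH_M, CACHED_M, OUTPUT_M = 15, 45, 15


def breakdown(model: str) -> dict:
    """Per-component monthly cost in USD for the coding-agent workload."""
    input_rate, cached_rate, output_rate = RATES[model]
    if cached_rate is None:
        # Assumption: no cached rate listed, so all input pays full price.
        input_cost = (FRESH_M + CACHED_M) * input_rate
        cached_cost = 0.0
    else:
        input_cost = FRESH_M * input_rate
        cached_cost = CACHED_M * cached_rate
    output_cost = OUTPUT_M * output_rate
    total = input_cost + cached_cost + output_cost
    return {"input": input_cost, "cached": cached_cost,
            "output": output_cost, "total": total}


bills = {model: breakdown(model) for model in RATES}

for model, bill in bills.items():
    output_share = bill["output"] / bill["total"]
    print(f"{model:18s} total ${bill['total']:.2f}, output share {output_share:.0%}")

saving = bills["gpt-5.5"]["total"] - bills["Claude Opus 4.8"]["total"]
saving_pct = saving / bills["gpt-5.5"]["total"]
print(f"Opus 4.8 saves ${saving:.2f}/mo vs gpt-5.5 ({saving_pct:.1%})")
```

For both Opus 4.8 and gpt-5.5, output is the large majority of the bill at these volumes, which is why a $5/M gap in output price shows up so clearly in the totals.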
Workload 3: document analysis pipeline
Heavy input, light output: 100M input tokens of contracts a month, 5M output of extracted summaries. Input price dominates: gpt-5.5 and Opus 4.8 tie at $500 for input, and the $25 difference in output ($150 vs $125) is noise against totals of $650 and $625. Gemini does the same job for $195. If the extraction is structured and simple, DeepSeek v4-flash does it for about $15.
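The same arithmetic for the document pipeline, as a short sketch. It assumes no caching, since each contract is new input, and it leaves out DeepSeek v4-flash because its per-token prices are not listed in this article. `pipeline_cost` is a hypothetical helper name.

```python
# (input, output) list prices in USD per million tokens.
RATES = {
    "gpt-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
}

# Workload 3 volumes, in millions of tokens per month.
INPUT_M, OUTPUT_M = 100, 5


def pipeline_cost(model: str) -> tuple:
    """Return (input_cost, output_cost) in USD, assuming no cached input."""
    input_rate, output_rate = RATES[model]
    return INPUT_M * input_rate, OUTPUT_M * output_rate


totals = {}
for model in RATES:
    input_cost, output_cost = pipeline_cost(model)
    totals[model] = input_cost + output_cost
    input_share = input_cost / totals[model]
    print(f"{model:18s} ${totals[model]:.2f} (input is {input_share:.0%} of the bill)")
```

With input making up roughly three quarters of every bill, the input rate is the number to watch for this kind of pipeline.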
The honest summary
- Cheapest frontier-adjacent option: Gemini 3.5 Flash, in every scenario we ran.
- Cheapest true frontier: Claude Opus 4.8, thanks to $25 output vs OpenAI's $30.
- Where the premium is worth it: agentic coding and long multi-step reasoning.
- Where it isn't: support bots, classification, extraction. Use the mid-tier.
Prices as tracked on June 27, 2026. The LLM API marketplace has the live board across all 8 providers we track, re-checked twice a week.
Not sure which tier your project needs? Describe it to the AI stack planner and it will do this math for your actual workload.