AI API Pricing Comparison: Cost Per Million Tokens
Data Notice: The numerical data presented in this article are sourced from the most current provider data at publication and may reflect prior-period or projected statistics. Confirm all pricing and features with the provider directly.
How We Evaluated: Our editorial team compiled this AI API pricing comparison using published API pricing, cost-per-task calculations, and rate limit data. Rankings reflect cost per million tokens, rate limits, free tier availability, and volume discounts. Last updated: March 2026. See our editorial policy for full methodology.
AI API pricing changes frequently as providers compete on cost and capability. This page compares current pricing across all major providers in a single, easy-to-reference table. Bookmark it and check back regularly for updates.
AI API Input Pricing: Cost per 1M Tokens (March 2026)
Complete Pricing Table (March 2026)
Premium Tier Models
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| Claude Opus 4 | Anthropic | $15.00 | $75.00 | 200K | Strongest reasoning |
| o3 | OpenAI | $10.00 | $40.00 | 200K | + thinking token costs |
| Gemini Ultra | Google | $7.00 | $21.00 | 1M+ | Largest context window |
Mid-Tier Models (Best Value)
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K | Best quality/cost ratio |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | Strong generalist |
| Mistral Large | Mistral | $2.00 | $6.00 | 128K | Good multilingual |
| Gemini Pro | Google | $1.25 | $5.00 | 1M+ | Great value with large context |
Budget Tier Models
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Notes |
|---|---|---|---|---|---|
| o3-mini | OpenAI | $1.10 | $4.40 | 200K | Budget reasoning |
| Claude Haiku 4 | Anthropic | $0.25 | $1.25 | 200K | Fast, very cheap |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | 128K | Budget general purpose |
| Gemini Flash | Google | $0.075 | $0.30 | 1M+ | Cheapest capable model |
All prices as of March 2026. Check provider websites for the latest pricing.
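The per-request cost for any model in the tables above follows directly from its per-million-token rates. A minimal sketch in Python (prices taken from the March 2026 tables above; verify current rates with the provider before relying on them):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in dollars for one request, given per-1M-token prices."""
    return (input_tokens / 1_000_000 * input_price
            + output_tokens / 1_000_000 * output_price)

# Example: a 500-token question with a 300-token answer on Claude Sonnet 4
# ($3.00 input / $15.00 output per 1M tokens):
cost = request_cost(500, 300, 3.00, 15.00)
print(f"${cost:.4f}")  # $0.0060
```

The same function reproduces the task table below: plugging in Opus 4's rates ($15.00 / $75.00) for the same question yields $0.03.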
Cost Per Common Task
| Task | Approximate Tokens | Opus 4 | Sonnet 4 | GPT-4o | Haiku 4 | Flash |
|---|---|---|---|---|---|---|
| Single question + answer | 500 in / 300 out | $0.03 | $0.006 | $0.004 | $0.0005 | $0.0001 |
| Blog post generation | 500 in / 1,500 out | $0.12 | $0.024 | $0.016 | $0.002 | $0.0005 |
| Document summary (20 pages) | 15K in / 500 out | $0.26 | $0.053 | $0.043 | $0.004 | $0.001 |
| Code review (large file) | 5K in / 2K out | $0.23 | $0.045 | $0.033 | $0.004 | $0.001 |
| Full book analysis | 150K in / 2K out | $2.40 | $0.48 | N/A* | $0.04 | $0.01 |
*GPT-4o’s 128K limit means it cannot process a full book in one pass.
Cost Reduction Features
Prompt Caching (Anthropic)
Anthropic offers prompt caching that reduces the cost of repeated context (system prompts, reference documents) by up to 90%. This is significant for applications that reuse the same context across many queries.
- Cache write: 1.25x base input cost
- Cache read: 0.1x base input cost (90% discount)
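Whether caching pays off depends on how often the cached context is reused. A rough sketch under the multipliers above, assuming one cache write on the first request and cache reads on the rest (real billing details such as cache lifetime vary; check provider documentation):

```python
def cached_context_cost(context_tokens: int, requests: int,
                        input_price: float) -> float:
    """Input cost in dollars for a reused context: the first request writes
    the cache (1.25x base input cost), the remaining requests read it (0.1x).
    input_price is per 1M tokens."""
    per_request = context_tokens / 1_000_000 * input_price
    return per_request * 1.25 + (requests - 1) * per_request * 0.10

# 10K-token system prompt on Claude Sonnet 4 ($3.00/1M input), 100 requests:
with_cache = cached_context_cost(10_000, 100, 3.00)      # ~$0.33
without_cache = 100 * 10_000 / 1_000_000 * 3.00          # $3.00
print(f"savings: {1 - with_cache / without_cache:.0%}")  # savings: 89%
```

Even at low reuse counts the cache pays for itself: the 0.25x write premium is recovered by the first 0.9x read discount.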
Batch Processing
Several providers offer discounts for non-time-sensitive batch processing:
| Provider | Batch Discount | Turnaround |
|---|---|---|
| Anthropic | 50% off | Within 24 hours |
| OpenAI | 50% off | Within 24 hours |
| Other providers | Varies | Varies |
Volume Discounts
Enterprise agreements with committed usage can reduce pricing further. Contact providers directly for volume pricing.
Price Trends
AI API pricing has fallen dramatically:
| Year | Cost for GPT-4-class 1M Output Tokens | Reduction |
|---|---|---|
| 2023 | ~$60.00 | Baseline |
| 2024 | ~$15.00 | 75% reduction |
| 2025 | ~$10.00 | 83% reduction |
| 2026 | ~$5.00-$10.00 | 83-92% reduction |
Expect continued price reductions as inference efficiency improves and competition intensifies.
Choosing the Right Tier
Use premium models when:
- Complex reasoning, analysis, or coding is required
- Accuracy on difficult problems justifies the cost
- The cost of errors exceeds the cost of using a better model
Use mid-tier models when:
- You need good quality for general tasks
- You want the best quality-to-cost ratio
- Volume is moderate (hundreds to thousands of queries/day)
Use budget models when:
- Tasks are simple (classification, extraction, routing)
- Volume is very high (millions of queries/day)
- Speed matters more than maximum quality
- You are building a first pass before human review
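The tier guidance above can be condensed into a simple routing heuristic. A sketch (the tier labels, thresholds, and example models are illustrative choices based on this guide, not official API behavior):

```python
def pick_tier(task: str, daily_volume: int, needs_deep_reasoning: bool) -> str:
    """Illustrative model-tier router following the guidance above."""
    if needs_deep_reasoning:
        return "premium"   # e.g. Claude Opus 4, o3
    simple_tasks = {"classification", "extraction", "routing"}
    if task in simple_tasks or daily_volume > 1_000_000:
        return "budget"    # e.g. Haiku 4, GPT-4o mini, Gemini Flash
    return "mid"           # e.g. Sonnet 4, GPT-4o: best quality/cost ratio

print(pick_tier("summarization", 500, False))   # mid
print(pick_tier("classification", 500, False))  # budget
```

In production, a router like this is often paired with a fallback: send the budget model's low-confidence outputs to a mid-tier model for a second pass.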
Key Takeaways
- AI API pricing spans a 200x range from Gemini Flash ($0.075/1M input) to Claude Opus 4 ($15/1M input).
- Mid-tier models (Claude Sonnet 4, GPT-4o) offer the best quality-to-cost ratio for most applications.
- Prompt caching and batch processing can reduce costs by 50-90% for eligible workloads.
- Prices for GPT-4-class output have dropped 83-92% since 2023 and continue falling.
- Output tokens cost more than input tokens (typically 3-5x), so controlling output length saves money.
Next Steps
- Calculate your estimated monthly spend: AI Cost Calculator: Estimate Your Monthly API Spend.
- Understand tokens and pricing in depth: AI Costs Explained: API Pricing, Token Limits, and Hidden Fees.
- Count tokens in your content: Token Counter Tool: Paste Text, See Token Count.
- Compare subscription plans as an alternative to API pricing: ChatGPT Plus vs Claude Pro vs Gemini Advanced: Subscription Comparison.
The comparisons in this guide are for informational purposes and reflect our editorial team’s independent analysis. Capabilities and pricing of AI tools change often — verify the latest details with each platform.