## Cost vs Performance

Comparing model benchmark performance against API pricing (USD per 1M tokens).
| Model | Provider | Input $/1M | Output $/1M | Avg Score | Value Score |
|---|---|---|---|---|---|
| Grok 4.1 | xAI | $0.20 | $0.50 | 70.3 | 254.4 |
| DeepSeek-V3.2 | DeepSeek | $0.27 | $1.10 | 77.2 | 182.8 |
| DeepSeek V3 | DeepSeek | Free/OSS | Free/OSS | 78.9 | 157.9 |
| Gemini 3 Flash | Google | $0.50 | $3.00 | 90.4 | 128.8 |
| DeepSeek-R1 | DeepSeek | $0.55 | $2.19 | 75.3 | 124.8 |
| GPT-5.2 Thinking | OpenAI | $1.75 | $14.00 | 82.7 | 64.8 |
| Gemini 3 Pro | Google | $2.00 | $12.00 | 74.8 | 61.6 |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 78.8 | 60.4 |
| GPT-5.2 | OpenAI | $1.75 | $14.00 | 66.8 | 52.4 |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 73.2 | 48.3 |
| Gemini 3 Deep Think | Google | $4.00 | $18.00 | 45.1 | 32.7 |
| GPT-5.1 | OpenAI | $1.25 | $10.00 | - | - |
| Llama 4 Maverick | Meta | Free/OSS | Free/OSS | - | - |
| Qwen3-235B | Alibaba | $0.50 | $2.00 | - | - |
## How Value Score Is Calculated

Value Score = Avg Score / log(Output $/1M + 1)

Higher is better. Free/OSS models get a bonus multiplier instead, since an output price of $0 would make the denominator log(0 + 1) = 0 and the ratio undefined.
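As a minimal sketch, the formula above can be written in Python. The log base and the size of the free-model bonus multiplier are not specified here, so the natural log and a x2 multiplier are assumptions in this sketch, and its results will not necessarily reproduce the Value Score column exactly:

```python
import math

def value_score(avg_score, output_price, free_multiplier=2.0):
    """Value Score = avg benchmark score / log(output $/1M + 1).

    For free/OSS models (output_price == 0) the denominator log(1) is 0,
    so a flat bonus multiplier is applied instead. The log base (natural
    log here) and the multiplier (2.0 here) are assumptions, not values
    specified by the table.
    """
    if output_price == 0:
        return avg_score * free_multiplier
    return avg_score / math.log(output_price + 1)
```

With this sketch, a free model scoring 78.9 gets 78.9 x 2.0 = 157.8, and at equal benchmark scores a cheaper output price always yields a higher Value Score.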