HumanEval
Coding
How to Run
pip install human-eval && generate_samples MODEL && evaluate_functional_correctness samples.jsonl
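The sample-generation step depends on the model being evaluated, so it can only be sketched here. The sketch below assumes `samples.jsonl` holds one JSON object per line with a `task_id` and a `completion` field, which is the layout `evaluate_functional_correctness` reads. The helper names `build_samples` and `write_jsonl`, the toy prompt, and the stub generator are hypothetical stand-ins for illustration, not part of any package.

```python
import json
from typing import Callable, Dict, List


def build_samples(
    prompts: Dict[str, str],
    generate_completion: Callable[[str], str],
    samples_per_task: int = 1,
) -> List[dict]:
    """Create one record per (task, sample) pair.

    `prompts` maps a task id such as "HumanEval/0" to its prompt text.
    `generate_completion` is a placeholder for the model call (assumption).
    """
    records = []
    for task_id, prompt in prompts.items():
        for _ in range(samples_per_task):
            records.append(
                {"task_id": task_id, "completion": generate_completion(prompt)}
            )
    return records


def write_jsonl(path: str, records: List[dict]) -> None:
    """Write records as JSON Lines: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    # Toy prompt and stub generator, for illustration only.
    toy_prompts = {"HumanEval/0": "def add(a, b):\n"}
    stub_model = lambda prompt: "    return a + b\n"
    write_jsonl("samples.jsonl", build_samples(toy_prompts, stub_model))
```

In a real run the prompts would come from the benchmark's problem set and the stub would be replaced by a call to the model under test; the resulting file is then passed to `evaluate_functional_correctness`.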
Leaderboard
| Rank | Model | Provider | Parameters | Score |
|---|---|---|---|---|
| 1 | DeepSeek V3 | DeepSeek | 671B (37B active) | 91.5% |