Japanese MT-Bench

Japanese

A Japanese version of MT-Bench for evaluating multi-turn conversation.

Metrics
Score (1-10)
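
As an illustration of how a single 1-10 score per model can be derived, the sketch below averages per-turn judge scores. It assumes judgment records stored one JSON object per line with `model`, `turn`, and `score` fields, and that a score of -1 marks a judgment that could not be parsed; these field names and the -1 convention are assumptions modeled on FastChat's judgment files, not something this page specifies. The helper name `average_scores` and the sample records are hypothetical.

```python
import json
from collections import defaultdict


def average_scores(jsonl_lines):
    """Return {model: mean score} from judgment records, one JSON object per line."""
    totals = defaultdict(lambda: [0.0, 0])  # model -> [sum of scores, count]
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        score = record["score"]
        if score == -1:  # assumed marker for a judgment with no parsable score
            continue
        totals[record["model"]][0] += score
        totals[record["model"]][1] += 1
    return {model: total / count for model, (total, count) in totals.items()}


# Hypothetical sample records, not real benchmark results.
sample = [
    '{"model": "model-a", "turn": 1, "score": 8}',
    '{"model": "model-a", "turn": 2, "score": 6}',
    '{"model": "model-b", "turn": 1, "score": -1}',
    '{"model": "model-b", "turn": 2, "score": 5}',
]
print(average_scores(sample))  # {'model-a': 7.0, 'model-b': 5.0}
```

Averaging over every judged turn gives both turns equal weight; a per-turn breakdown would only need the `turn` field added to the grouping key.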

How to Run

Use FastChat with the Japanese MT-Bench dataset to run a GPT-4-judged evaluation.
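
The workflow can be sketched as three steps that use the scripts in FastChat's `fastchat/llm_judge` directory: generate model answers, have GPT-4 judge them, then show the scores. The snippet below only builds and prints the command lines; it does not run them. The benchmark directory name `japanese_mt_bench`, the model path, and the model ID are placeholders assumed for illustration, and the judging step is assumed to need an OpenAI API key in the environment. Check the flags against the FastChat version you have installed.

```python
import shlex


def build_commands(model_path, model_id, bench_name="japanese_mt_bench"):
    """Build the three llm_judge command lines (bench_name is an assumed directory name)."""
    common = ["--bench-name", bench_name]
    return [
        # 1. Generate the model's answers to the benchmark questions.
        ["python", "gen_model_answer.py",
         "--model-path", model_path, "--model-id", model_id, *common],
        # 2. Ask the GPT-4 judge to grade those answers.
        ["python", "gen_judgment.py",
         "--model-list", model_id, "--judge-model", "gpt-4", *common],
        # 3. Print the aggregated scores.
        ["python", "show_result.py", "--model-list", model_id, *common],
    ]


# Placeholder path and ID; substitute your own model.
for command in build_commands("path/to/model", "my-model"):
    print(shlex.join(command))
```

Printing the commands instead of executing them makes it easy to review them first, since the judging step calls a paid API.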

Leaderboard

Rank Model Provider Parameters Score