Japanese version of MT-Bench for multi-turn conversation evaluation.
Use FastChat with the Japanese MT-Bench dataset to run GPT-4-judged evaluation.
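
To make the GPT-4-judged step more concrete, here is a minimal sketch of how per-turn scores could be aggregated once the judge has replied. It assumes the judge ends each verdict with a rating in double brackets (for example `Rating: [[8]]`), which is the convention MT-Bench judge prompts ask for. The helper names `extract_rating` and `average_by_turn` and the sample verdicts are hypothetical and are not part of FastChat.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed verdict format: the judge's reply contains a rating like "[[8]]".
RATING_PATTERN = re.compile(r"\[\[(\d+\.?\d*)\]\]")


def extract_rating(judgment_text):
    """Return the first [[N]] rating as a float, or None if none is found."""
    match = RATING_PATTERN.search(judgment_text)
    return float(match.group(1)) if match else None


def average_by_turn(judgments):
    """Average ratings per turn, skipping verdicts with no parsable rating.

    `judgments` is an iterable of (turn, judgment_text) pairs.
    """
    scores = defaultdict(list)
    for turn, text in judgments:
        rating = extract_rating(text)
        if rating is not None:
            scores[turn].append(rating)
    return {turn: mean(values) for turn, values in scores.items()}


# Hypothetical judge replies, for illustration only.
sample = [
    (1, "The answer is accurate and fluent. Rating: [[8]]"),
    (1, "Some factual errors. Rating: [[6]]"),
    (2, "Ignores the follow-up instruction. Rating: [[3]]"),
    (2, "No rating given."),
]
print(average_by_turn(sample))
```

Reporting first-turn and second-turn averages separately is useful for a multi-turn benchmark, because a model can answer the opening question well and still lose points on the follow-up.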