An Elo-style leaderboard in which millions of crowd votes rank AI chatbots through blind head-to-head “battle mode” comparisons.
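To make the ranking idea concrete, here is a minimal sketch of a classic Elo update driven by pairwise votes. The description above only says "Elo-style", so the K-factor of 32, the starting rating of 1000, the function names, and the model names are all illustrative assumptions, and the real leaderboard may well use a different estimator.

```python
from typing import Dict, Iterable, List, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that a player rated r_a beats one rated r_b (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings: Dict[str, float], winner: str, loser: str,
               k: float = 32.0, initial: float = 1000.0) -> None:
    """Apply one vote in place. k and initial are assumed values, not taken from the source."""
    r_w = ratings.get(winner, initial)
    r_l = ratings.get(loser, initial)
    # The winner gains more when the win was less expected.
    delta = k * (1.0 - expected_score(r_w, r_l))
    ratings[winner] = r_w + delta
    ratings[loser] = r_l - delta


def rank(votes: Iterable[Tuple[str, str]]) -> List[Tuple[str, float]]:
    """Process (winner, loser) votes in order and return models sorted by rating."""
    ratings: Dict[str, float] = {}
    for winner, loser in votes:
        update_elo(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical model names, for illustration only.
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
leaderboard = rank(votes)
```

Because each update adds to the winner exactly what it subtracts from the loser, the total rating is conserved. Sequential Elo also depends on the order in which votes arrive, which is one reason a large-scale leaderboard might prefer a batch estimator.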
VLMEvalKit is an open-source evaluation toolkit for large vision-language models (VLMs/LVLMs). It enables one-command evaluation across many benchmarks, supports generation-based evaluation with optional LLM answer extraction, and provides leaderboards and reproducible pipelines.
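The phrase "generation-based evaluation with optional LLM answer extraction" can be illustrated with a small sketch: the model writes free-form text, a rule-based pass tries to pull out the chosen option, and an LLM is consulted only when the rules are inconclusive. This is not VLMEvalKit's API. The names `extract_choice` and `llm_fallback` are hypothetical, and the matching rule is deliberately simple.

```python
import re
from typing import Callable, Optional


def extract_choice(response: str, options: str = "ABCD",
                   llm_fallback: Optional[Callable[[str], str]] = None) -> Optional[str]:
    """Return the option letter found in a free-form answer, or None if undecided."""
    # Rule-based pass: collect standalone uppercase option letters.
    matches = re.findall(rf"\b([{options}])\b", response)
    if len(set(matches)) == 1:
        return matches[0]
    # Ambiguous or empty: optionally defer to an LLM-backed extractor (hypothetical hook).
    if llm_fallback is not None:
        guess = llm_fallback(response).strip().upper()
        if len(guess) == 1 and guess in options:
            return guess
    return None


direct = extract_choice("The answer is B.")
ambiguous = extract_choice("It is either A or C.")
resolved = extract_choice("It is either A or C.", llm_fallback=lambda text: "c")
```

Keeping the fallback as an injected callable means the rule-based path can be tested without any model access, and the more expensive LLM call happens only for responses the rules cannot settle.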