OpenCompass CompassRank

Objective benchmark leaderboard from the OpenCompass community, scoring LLMs and LVLMs across 100+ datasets in five capability dimensions.

Introduction

Purpose

CompassRank is the public leaderboard of the OpenCompass evaluation suite. OpenCompass provides a reproducible, fully open pipeline that evaluates large language models and multimodal models on more than 70 benchmarks (about 400k questions) covering knowledge, reasoning, coding, mathematics and instruction following.

Key modules

  1. Distributed evaluator – one-command sharding for billion-parameter models.
  2. Diversified paradigms – zero-shot, few-shot and chain-of-thought templates.
  3. Experiment management – YAML configs + real-time result reporting.
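
To make the three prompting paradigms in item 2 concrete, here is a minimal sketch of how a single question could be wrapped as a zero-shot, few-shot or chain-of-thought prompt. The names `Example`, `zero_shot`, `few_shot` and `chain_of_thought` are hypothetical and chosen for illustration only; this is not the OpenCompass template API, and the exact prompt wording is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """A solved demonstration used for few-shot prompting (hypothetical type)."""
    question: str
    answer: str


def zero_shot(question: str) -> str:
    # Zero-shot: the model sees only the question, with no demonstrations.
    return f"Question: {question}\nAnswer:"


def few_shot(question: str, demos: list[Example]) -> str:
    # Few-shot: solved demonstrations are prepended before the real question.
    shots = "".join(
        f"Question: {d.question}\nAnswer: {d.answer}\n\n" for d in demos
    )
    return shots + zero_shot(question)


def chain_of_thought(question: str) -> str:
    # Chain-of-thought: a cue asks the model to reason before answering.
    return f"Question: {question}\nAnswer: Let's think step by step."


if __name__ == "__main__":
    demos = [Example("What is 2 + 2?", "4")]
    print(few_shot("What is 3 + 5?", demos))
```

Keeping each paradigm as a separate function lets the same dataset be scored under different prompting styles without touching the questions themselves.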

Community & openness

All configs, datasets and reports are Apache-2.0 licensed; contributors can add new models or benchmarks via pull request.

Information

Categories