
ABC-130k

Provides 130k+ bimanual teleoperation trajectories for robot imitation learning, recorded on low-cost YAM two-arm rigs and shared as MCAP episodes with subtask annotations, training code, and checkpoints.

Introduction

Large-scale, high-fidelity bimanual datasets are rare; ABC-130k fills that gap by releasing 130K+ teleoperated episodes across ~200 complex manipulation tasks recorded on inexpensive YAM two-arm rigs. That scale and fidelity let labs train and evaluate behavior-cloning policies and study scaling laws without building their own extensive data-collection infrastructure.

What Sets It Apart
  • Size and task diversity: 130K+ trajectories covering ~200 manipulation tasks, enough to train larger behavior-cloning models and to evaluate generalization across many subtasks, which smaller datasets cannot support.
  • Low-cost, repeatable embodiment: data is collected on YAM two-arm stations, which lowers the barrier to reproducing experiments and lets groups compare policies on a common hardware proxy rather than on expensive proprietary platforms.
  • Raw-fidelity MCAP recordings plus subtask annotations: full sensor and control streams are preserved, and subtask labels are kept separate so they can be revised. This supports both end-to-end imitation learning and supervised classifiers (for example, success detection) trained on curated labels, and it lets researchers iterate on labels without touching the raw episodes.
  • Accompanying assets and permissive license: training code, checkpoints, and Apache-2.0 licensing make it straightforward to reproduce baseline results and to use the data in downstream research and prototyping.
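
The separation of raw episodes from revisable subtask labels described in the list above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SubtaskSpan` schema, the nanosecond timestamps, and the label names are hypothetical and do not reflect the dataset's actual annotation format. The idea shown is only that labels live in a sidecar structure and index into an untouched timestamp stream.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtaskSpan:
    """One labeled interval inside an episode (hypothetical schema)."""
    label: str
    start_ns: int  # inclusive start time
    end_ns: int    # exclusive end time


def slice_by_subtask(timestamps_ns, spans):
    """Return (label, start_index, stop_index) for each span.

    timestamps_ns must be sorted ascending; the index range is half-open.
    """
    segments = []
    for span in spans:
        lo = bisect_left(timestamps_ns, span.start_ns)
        hi = bisect_left(timestamps_ns, span.end_ns)
        segments.append((span.label, lo, hi))
    return segments


# Stand-in for a raw episode stream; relabeling never modifies it.
episode_timestamps_ns = [0, 100, 200, 300, 400, 500]

# Two label revisions for the same episode (both invented for illustration).
labels_v1 = [SubtaskSpan("reach", 0, 200), SubtaskSpan("grasp", 200, 600)]
labels_v2 = [SubtaskSpan("reach", 0, 300), SubtaskSpan("grasp", 300, 600)]

segments_v1 = slice_by_subtask(episode_timestamps_ns, labels_v1)
segments_v2 = slice_by_subtask(episode_timestamps_ns, labels_v2)
```

Revising a label boundary changes only the sidecar spans, so the same raw recording can be re-segmented under each label version without being rewritten.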

Who It's For and Trade-offs

Great fit if you need large-scale, real-world bimanual demonstrations to train or benchmark behavior-cloning policies and imitation-learning baselines, or to study scaling laws for robotics datasets. It also suits groups that want a reproducible benchmark on a low-cost rig.

Look elsewhere if you need a dataset tailored to a different robot morphology (for example, different gripper types, or an embodiment different enough that transferring from YAM data would take significant effort), if you lack the storage and compute to handle multi-terabyte recordings, or if you require curated semantic labels beyond subtask annotations, such as object-level segmentation or per-frame semantic masks.

Information

  • Website: huggingface.co
  • Organizations: XDOF, UC Berkeley, Carnegie Mellon University, Massachusetts Institute of Technology, Amazon
  • Published date: 2026/06/15

Categories