COLOSSEUM
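Simulation benchmark for robotic manipulation. Diverse tasks evaluated under systematic environmental perturbations.
Overview
COLOSSEUM is a simulation benchmark for robotic manipulation, built on RLBench, in which diverse tasks are evaluated under systematic environmental perturbations. It is used to evaluate the generalization and robustness of vision-language-action (VLA) models and other manipulation policies. BridgeVLA reports a 64% success rate on it.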
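A success rate like the one quoted above is usually the fraction of evaluation episodes that succeed, averaged over tasks. The sketch below shows one way to compute it. The function name `success_rates`, the task names, and the episode outcomes are all hypothetical and made up for illustration; they are not COLOSSEUM's evaluation code or data, and the benchmark's own aggregation protocol may differ.

```python
from collections import defaultdict


def success_rates(episodes):
    """Return (per-task success rate, unweighted mean over tasks).

    `episodes` is an iterable of (task_name, succeeded) pairs.
    This is an assumed, simplified protocol, not the benchmark's own.
    """
    counts = defaultdict(lambda: [0, 0])  # task -> [successes, total episodes]
    for task, succeeded in episodes:
        counts[task][0] += int(succeeded)
        counts[task][1] += 1
    if not counts:
        raise ValueError("no episodes given")
    per_task = {task: s / n for task, (s, n) in counts.items()}
    # Average over tasks so that each task counts equally.
    overall = sum(per_task.values()) / len(per_task)
    return per_task, overall


# Made-up outcomes for illustration only; not real benchmark results.
episodes = [
    ("task_a", True),
    ("task_a", False),
    ("task_b", True),
    ("task_b", True),
]
per_task, overall = success_rates(episodes)
print(per_task, overall)
```

Averaging per task first, rather than pooling all episodes, keeps a task with many episodes from dominating the headline number; whether a given paper does this is something to check in its evaluation section.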
Related
- BridgeVLA — 64% on COLOSSEUM