CALVIN

Composing Actions from Language and Vision — long-horizon, language-conditioned manipulation.

Overview

CALVIN is a simulation-based benchmark that evaluates language-conditioned manipulation over long horizons: agents must compose multiple skills in sequence, each specified by a natural language instruction. VLM-based policies such as RoboFlamingo have reported strong performance on it.
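
To make the long-horizon setting concrete, here is a minimal sketch of how results on chained instructions could be scored. It assumes each rollout is a fixed-length chain of instructions attempted in order, that a rollout ends at the first failed subtask, and that results are reported as the fraction of rollouts completing at least k subtasks plus the average number completed. The names evaluate_chain, summarize, and attempt_subtask are hypothetical and are not taken from the CALVIN codebase.

```python
from typing import Callable, List, Sequence, Tuple


def evaluate_chain(
    instructions: Sequence[str],
    attempt_subtask: Callable[[str], bool],
) -> int:
    """Count how many instructions are completed in a row.

    attempt_subtask is a hypothetical callback that runs the policy on one
    instruction and returns True on success. The chain stops at the first
    failure (an assumption of this sketch).
    """
    completed = 0
    for instruction in instructions:
        if not attempt_subtask(instruction):
            break
        completed += 1
    return completed


def summarize(
    completed_counts: Sequence[int],
    chain_length: int,
) -> Tuple[List[float], float]:
    """Return per-k success rates and the average number of completed subtasks.

    success_rates[k - 1] is the fraction of rollouts that completed at least
    k subtasks in a row, for k = 1..chain_length.
    """
    n = len(completed_counts)
    if n == 0:
        raise ValueError("need at least one rollout")
    success_rates = [
        sum(count >= k for count in completed_counts) / n
        for k in range(1, chain_length + 1)
    ]
    average_length = sum(completed_counts) / n
    return success_rates, average_length
```

In use, evaluate_chain would be called once per sampled instruction chain and the resulting counts passed to summarize. A policy that is strong on single skills but weak at composing them shows a high first success rate that drops off quickly as k grows.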

Official Links