CALVIN
Composing Actions from Language and Vision — long-horizon, language-conditioned manipulation.
Overview
CALVIN is a simulation-based benchmark for language-conditioned manipulation over long horizons. Agents must complete several skills in sequence, each specified by a natural-language instruction. VLM-based policies such as RoboFlamingo have reported strong results on it.
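To make the long-horizon idea concrete, here is a minimal sketch of how a chain of instructions could be scored, assuming a chain counts only the instructions completed before the first failure. It does not use the CALVIN codebase: the function names, the toy `toy_attempt` policy, and the instruction strings are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List, Sequence


def completed_in_a_row(instructions: Sequence[str],
                       attempt: Callable[[str], bool]) -> int:
    """Count the instructions completed before the first failure."""
    done = 0
    for instruction in instructions:
        if not attempt(instruction):
            break  # the chain ends at the first failed instruction
        done += 1
    return done


def chain_success_rates(counts: Sequence[int], chain_len: int) -> List[float]:
    """Fraction of chains with at least k completed instructions, k = 1..chain_len."""
    return [sum(c >= k for c in counts) / len(counts)
            for k in range(1, chain_len + 1)]


# Hypothetical stand-in for a policy rollout: succeeds only on known skills.
known_skills = {"open the drawer", "turn on the light"}


def toy_attempt(instruction: str) -> bool:
    return instruction in known_skills


# Illustrative instruction chains (not taken from the dataset).
chains = [
    ["open the drawer", "push the block left", "turn on the light"],
    ["turn on the light", "open the drawer", "push the block left"],
]

counts = [completed_in_a_row(chain, toy_attempt) for chain in chains]
rates = chain_success_rates(counts, chain_len=3)
avg_len = sum(counts) / len(counts)

print(counts, rates, avg_len)
```

In a real evaluation, `toy_attempt` would be replaced by a rollout of the policy in the simulator followed by a task-success check. The scoring functions stay the same regardless of how each attempt is judged.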
Official Links
- github.com/mees/calvin — Code & dataset