Open-Source Robotics Learning Datasets
A curated catalog of open-source datasets for robot manipulation, imitation learning, and reinforcement learning, with links to official sources.
Real-World Manipulation Data
Datasets with in-the-wild robot interactions and long-horizon tasks.
Collection: Benchmark-Centric Datasets
Suites designed for reproducible evaluation and cross-paper comparison.
Collection: Cross-Robot Ecosystems
Shared formats and multi-embodiment data for foundation model training.
High-Intent Dataset Guides
Teleoperation datasets
Operator demos, retries, and bootstrapping workflows.
Dataset Guide: Contact-rich datasets
Tactile, force, and failure-heavy manipulation signals.
Industry Guide: Warehouse datasets
SKU variation, exception handling, and throughput context.
Industry Guide: Lab automation datasets
Repeatable protocols and benchmarkable workflows.
Pilot Guide: Humanoid datasets
Deployment-oriented data choices for humanoid teams.
OpenArm Guide: OpenArm datasets
Collection and packaging workflows around OpenArm.
Datasets for Robot Learning

DROID
76K trajectories, 350 hours, 86 tasks. In-the-wild manipulation from 50 collectors across 564 scenes. Available via TensorFlow Datasets and Hugging Face.
BridgeData V2
60K trajectories, 24 environments, 13 manipulation skills. Low-cost WidowX robot. Natural language labels, multi-task learning.
Open X-Embodiment
1M+ episodes, 22 robot types, 500+ skills. Unified RLDS format. RT-X models. 33 institutions.
ALOHA
Bimanual teleoperation. ALOHA-Cosmos-Policy, baseline datasets. HDF5, Hugging Face. Open hardware.
LIBERO
130 tasks, about 6.5K human demos (50 per task). Lifelong learning benchmark. Spatial, object, goal suites. robosuite simulation.
RoboNet
15M frames, 7 robot platforms, including Sawyer, Franka, Baxter, Fetch, and WidowX. Multi-robot transfer.
RoboMimic & MimicGen
Framework + datasets. MimicGen: 50K demos generated from 200 human demos. Simulation + real. MIT license.
LeRobot
Standardized format + hub. DROID-100, ALOHA, SO-100. PyTorch, streaming. "ImageNet of robotics."
Models & Tools You Can Pair
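For tooling that consumes the RLDS-format data listed above (Open X-Embodiment), it may help to picture the layout: an episode holds an ordered sequence of steps, and each step carries an observation, an action, and a few bookkeeping flags. The sketch below mimics that nesting in plain Python with no TensorFlow dependency. The step keys follow the RLDS convention; the observation contents, the 7-dimensional action, and the `episode_metadata` field are illustrative assumptions and are not taken from any specific dataset in this catalog.

```python
from typing import Any, Dict, List, Tuple


def make_episode(num_steps: int) -> Dict[str, Any]:
    """Build one synthetic episode shaped like an RLDS episode (assumed contents)."""
    steps: List[Dict[str, Any]] = []
    for t in range(num_steps):
        last = t == num_steps - 1
        steps.append({
            # Placeholder observation: a 1x3 "image" and a 7-D state vector.
            "observation": {"image": [[0, 0, 0]], "state": [0.0] * 7},
            # Placeholder 7-D action (dimension chosen for illustration only).
            "action": [0.0] * 7,
            "reward": 1.0 if last else 0.0,
            "discount": 1.0,
            "is_first": t == 0,
            "is_last": last,
            "is_terminal": last,
        })
    # Hypothetical metadata field; real datasets define their own.
    return {"episode_metadata": {"episode_id": "demo-0"}, "steps": steps}


def to_observation_action_pairs(episode: Dict[str, Any]) -> List[Tuple[Any, Any]]:
    """Flatten an episode into (observation, action) pairs, one per step."""
    return [(s["observation"], s["action"]) for s in episode["steps"]]


episode = make_episode(5)
pairs = to_observation_action_pairs(episode)
print(len(episode["steps"]), len(pairs))
```

Flattening to (observation, action) pairs is one common starting point for imitation learning; whether the final step's action should be kept depends on how a given dataset was recorded, so check its documentation before training on it.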
Research-Ready Curation
We highlight scale, format, and access details needed for quick evaluation.
Cross-Stack Compatibility
Datasets are mapped to practical model and tool ecosystems.
Deployment Context
Dataset choices are linked with real robot execution constraints.
Scale-up Path
When open data is not enough, we support custom collection pipelines.
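As a rough illustration of comparing entries by scale and format, the sketch below stores a few of the catalog's own figures in a small Python structure, filters by format, and derives an average trajectory length. The `DatasetEntry` class and helper names are hypothetical, the numbers are the rounded values quoted above (so derived results are approximate), and formats are left empty where the catalog does not state one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical record for one catalog entry."""
    name: str
    formats: Tuple[str, ...] = ()
    trajectories: Optional[int] = None  # rounded count as listed in the catalog
    hours: Optional[float] = None       # total recorded hours, if listed


# Values copied from the catalog text; missing details stay unset.
CATALOG: List[DatasetEntry] = [
    DatasetEntry("DROID", ("TensorFlow Datasets", "Hugging Face"), 76_000, 350.0),
    DatasetEntry("BridgeData V2", (), 60_000),
    DatasetEntry("Open X-Embodiment", ("RLDS",), 1_000_000),
    DatasetEntry("ALOHA", ("HDF5", "Hugging Face")),
]


def with_format(catalog: List[DatasetEntry], fmt: str) -> List[str]:
    """Return the names of entries that list the given format."""
    return [entry.name for entry in catalog if fmt in entry.formats]


def mean_trajectory_seconds(entry: DatasetEntry) -> Optional[float]:
    """Average trajectory duration in seconds, or None if data is missing."""
    if entry.trajectories is None or entry.hours is None:
        return None
    return entry.hours * 3600 / entry.trajectories


hf_names = with_format(CATALOG, "Hugging Face")
droid_seconds = mean_trajectory_seconds(CATALOG[0])
print(hf_names)
print(round(droid_seconds, 1))
```

With the rounded DROID figures (350 hours over 76K trajectories), the average comes out to roughly 16.6 seconds per trajectory, which gives a quick sense of episode length before downloading anything.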