Manipulation Datasets
Manipulation is the largest category of robot learning data. This page covers curated open-source and custom manipulation datasets spanning tabletop pick-and-place, contact-rich assembly, deformable objects, and tool use: the range of tasks a manipulation policy needs to see in order to generalize.
Key Open-Source Manipulation Datasets
| Dataset | Episodes | Robot(s) | Format |
|---|---|---|---|
| DROID | 76K trajectories | Franka Emika Panda | RLDS, HDF5 |
| BridgeData v2 | 60K+ | WidowX-250 | RLDS, HDF5 |
| Open X-Embodiment | 1M+ | 22 robot types | RLDS |
| RoboMimic | Varies per task | Franka (sim + real) | HDF5 |
| MimicGen | 50K+ (generated) | Franka (sim) | HDF5 |
| LIBERO | 6.5K demos (50 per task), 130 tasks | Franka Panda (robosuite sim) | HDF5 |
| OpenArm Datasets | Growing collection | OpenArm | LeRobot, HDF5 |
Choosing the Right Dataset
For vision-language-action (VLA) pretraining, start with Open X-Embodiment or DROID. Open X-Embodiment covers the broadest range of embodiments and tasks, and DROID adds a wide range of scenes and environments on a single arm. For single-task fine-tuning, BridgeData v2 and RoboMimic provide cleaner, more focused task suites. For data augmentation, MimicGen can expand a small seed set of human demonstrations by roughly 100x through automated trajectory generation in simulation.
If you need data on a specific robot or task that does not exist in any open dataset, SVRC can collect custom manipulation datasets using our teleoperation infrastructure. We operate OpenArm, Franka, UR, and xArm collection stations.
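The selection guidance above can be written down as a small lookup. This is only an illustrative sketch: the use-case keys and the function names `recommend` and `augmented_size` are hypothetical, and the default expansion factor of 100 is the rough figure quoted above, not a guarantee for any particular task.

```python
# Hypothetical lookup table. The keys and function names are illustrative;
# the dataset names and use cases come from the guidance in this section.
RECOMMENDATIONS = {
    "vla_pretraining": ["Open X-Embodiment", "DROID"],
    "single_task_finetuning": ["BridgeData v2", "RoboMimic"],
    "data_augmentation": ["MimicGen"],
}


def recommend(use_case: str) -> list[str]:
    """Return suggested open datasets, or an empty list if none fit."""
    return RECOMMENDATIONS.get(use_case, [])


def augmented_size(seed_demos: int, expansion_factor: int = 100) -> int:
    """Estimate dataset size after MimicGen-style expansion of a seed set.

    The default factor is an assumption taken from the rough 100x figure.
    """
    return seed_demos * expansion_factor


if __name__ == "__main__":
    print(recommend("vla_pretraining"))
    # An empty result is the signal to consider custom data collection.
    print(recommend("underwater_welding") or "no open dataset: consider custom collection")
    print(augmented_size(10))
```

An empty result corresponds to the case where no open dataset covers the robot or task, which is where custom collection comes in.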
Format Compatibility
Most modern manipulation datasets are available in multiple formats. Use our dataset comparison tools to understand schema differences. The Fearless Data Platform can convert between HDF5, RLDS, and LeRobot formats automatically.
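To give a feel for what a schema difference looks like, the sketch below converts one episode from a columnar layout (one array per field, as is common in per-demo HDF5 groups) into a list of per-step records using the standard RLDS step fields (`observation`, `action`, `reward`, `is_first`, `is_last`, `is_terminal`). It is not the platform's converter: the input layout, the key names `obs`, `actions`, and `rewards`, and the function name `episode_to_rlds_steps` are assumptions for illustration, and plain Python lists stand in for arrays so the example has no dependencies.

```python
def episode_to_rlds_steps(episode: dict) -> list[dict]:
    """Convert one columnar episode into a list of RLDS-style step dicts.

    Assumed input layout (hypothetical, loosely modeled on per-demo HDF5 groups):
        {"obs": {name: [T values]}, "actions": [T values], "rewards": [T values]}
    """
    actions = episode["actions"]
    num_steps = len(actions)
    # Missing rewards default to zero for every step.
    rewards = episode.get("rewards", [0.0] * num_steps)

    # Every column must describe the same number of time steps.
    if len(rewards) != num_steps:
        raise ValueError(f"rewards has {len(rewards)} steps, expected {num_steps}")
    for name, column in episode["obs"].items():
        if len(column) != num_steps:
            raise ValueError(f"obs[{name!r}] has {len(column)} steps, expected {num_steps}")

    steps = []
    for t in range(num_steps):
        last = t == num_steps - 1
        steps.append({
            "observation": {name: col[t] for name, col in episode["obs"].items()},
            "action": actions[t],
            "reward": rewards[t],
            "is_first": t == 0,
            "is_last": last,
            # Assumption: each demonstration ends in a terminal state.
            # Use False here for episodes that were merely truncated.
            "is_terminal": last,
        })
    return steps


if __name__ == "__main__":
    demo = {
        "obs": {"gripper": [0.0, 0.5, 1.0], "joint_0": [0.1, 0.2, 0.3]},
        "actions": [[0.0, 1.0], [0.1, 1.0], [0.2, 0.0]],
        "rewards": [0.0, 0.0, 1.0],
    }
    for step in episode_to_rlds_steps(demo):
        print(step)
```

The main design point is the transpose from per-field columns to per-step records. Real converters also have to handle image encoding, language annotations, and per-dataset metadata, none of which is shown here.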