RH20T: Contact-Rich Multi-Modal Manipulation Dataset
110,000+ contact-rich manipulation sequences across 7 robot configurations, with unusually broad sensor modality coverage for an open robotics dataset.
Key Stats
| Metric | Value |
|---|---|
| Episodes | 110,000+ contact-rich manipulation sequences |
| Tasks | 147 (48 RLBench + 29 MetaWorld + 70 custom) |
| Robot configs | 7 (6-7 DOF arms), one with fingertip tactile sensing |
| Size | ~5 TB (640x360) / ~40 TB (native 1280x720) |
| Modalities | RGB, depth, binocular IR, joint angles, joint torques, Cartesian pose, 6D force-torque, audio, fingertip tactile (200 Hz) |
| License | CC-BY-SA-4.0 (scenes 0001-0005) / CC-BY-NC-4.0 (scenes 0006-0010) |
What makes RH20T unique
RH20T stands out for its breadth of sensor modalities. Most manipulation datasets provide only RGB images and joint positions; RH20T adds 6D force-torque sensing, fingertip tactile data at 200 Hz, audio, depth, binocular IR, and joint torques. This makes it well suited to research on contact-rich tasks where visual information alone is insufficient, such as plug insertion, surface wiping, or fine assembly.
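Working with streams sampled at different rates usually means pairing each camera frame with the closest high-rate sensor sample. The following is a minimal sketch of nearest-timestamp matching, not RH20T's own tooling. The function name `nearest_indices`, the 10 Hz camera rate, and the synthetic timestamps are assumptions for illustration; only the 200 Hz tactile rate comes from the table above.

```python
from bisect import bisect_left


def nearest_indices(sensor_ts, frame_ts):
    """For each frame timestamp, return the index of the closest sensor sample.

    Timestamps are in seconds. sensor_ts must be non-empty and sorted ascending.
    Ties go to the earlier sample.
    """
    if not sensor_ts:
        raise ValueError("sensor_ts must not be empty")
    indices = []
    for t in frame_ts:
        pos = bisect_left(sensor_ts, t)
        if pos == 0:
            # Frame precedes the first sensor sample.
            indices.append(0)
        elif pos == len(sensor_ts):
            # Frame follows the last sensor sample.
            indices.append(len(sensor_ts) - 1)
        else:
            before, after = sensor_ts[pos - 1], sensor_ts[pos]
            indices.append(pos if after - t < t - before else pos - 1)
    return indices


# Synthetic example: 2 s of 200 Hz tactile samples, hypothetical 10 Hz camera.
tactile_ts = [i / 200 for i in range(400)]
camera_ts = [i / 10 for i in range(20)]
idx = nearest_indices(tactile_ts, camera_ts)
```

With real recordings the two clocks rarely line up exactly, so it is worth also checking the gap between each frame and its matched sample, and dropping pairs where that gap is too large.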
The dataset covers 147 tasks drawn from RLBench (48), MetaWorld (29), and custom scenarios (70), collected across 7 different robot configurations. The custom tasks emphasize contact-rich interactions that require force feedback to succeed.
License warning
RH20T has a split license. Scenes 0001-0005 use CC-BY-SA-4.0, which allows commercial use provided you give attribution and share derivatives under the same license. Scenes 0006-0010 use CC-BY-NC-4.0, which prohibits commercial use entirely. Check which scenes you need before building on the data.
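The split can be encoded as a small guard in a data-loading pipeline. This is a sketch under the scene ranges stated above; the helper names `scene_license` and `commercially_usable` are hypothetical and not part of any RH20T tooling.

```python
def scene_license(scene_id):
    """Return the license for a scene number (int, or zero-padded string such as "0003")."""
    n = int(scene_id)
    if 1 <= n <= 5:
        return "CC-BY-SA-4.0"
    if 6 <= n <= 10:
        return "CC-BY-NC-4.0"
    raise ValueError(f"unknown scene id: {scene_id!r}")


def commercially_usable(scene_ids):
    """Keep only the scenes whose license permits commercial use."""
    return [s for s in scene_ids if scene_license(s) == "CC-BY-SA-4.0"]


all_scenes = [f"{n:04d}" for n in range(1, 11)]
allowed = commercially_usable(all_scenes)
```

Raising on an unknown scene id, rather than defaulting to either license, keeps a mislabeled scene from silently entering a commercial training set.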
Access
Related datasets
- DROID -- large-scale real-world manipulation (fewer modalities but more scenes)
- Touch-and-Go Tactile -- another tactile-focused dataset
- GelSight Tactile -- vision-based tactile sensing data
- Open X-Embodiment -- cross-embodiment pretraining corpus