Assembly101: Multi-View Assembly and Disassembly Dataset

513 hours of procedural assembly from 12 synchronized cameras. 1M+ fine-grained action annotations for manufacturing robotics.

CC-BY-NC (Non-Commercial) | MP4 + Annotations | 12 Cameras (incl. Egocentric)

Key Stats

Metric         Value
Duration       513 hours
Participants   53
Videos         4,321
Camera views   12 synchronized (8 fixed + 4 egocentric)
Annotations    100K+ coarse and 1M+ fine-grained action labels, plus hand pose
Size           ~1.5 TB
License        CC-BY-NC-4.0
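
To put the Key Stats figures in perspective, the short sketch below derives a few rough per-video and per-hour numbers from them. It is only back-of-the-envelope arithmetic: the function name derived_stats is made up for illustration, the session estimate assumes every recording yields one video per camera view, and the storage figure assumes 1 TB = 1000 GB and the approximate ~1.5 TB total.

```python
# Figures copied from the Key Stats table.
TOTAL_HOURS = 513
NUM_VIDEOS = 4321
NUM_VIEWS = 12
APPROX_SIZE_TB = 1.5  # the table gives this as an approximation


def derived_stats():
    """Return rough derived figures (hypothetical helper, not part of the dataset)."""
    # Average length of one video, in minutes.
    avg_minutes_per_video = TOTAL_HOURS * 60 / NUM_VIDEOS
    # Assumption: each recording session produces one video per camera view.
    approx_sessions = NUM_VIDEOS / NUM_VIEWS
    # Assumption: 1 TB = 1000 GB.
    gb_per_hour = APPROX_SIZE_TB * 1000 / TOTAL_HOURS
    return {
        "avg_minutes_per_video": avg_minutes_per_video,
        "approx_sessions": approx_sessions,
        "gb_per_hour": gb_per_hour,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions a typical video runs about seven minutes, and the 4,321 videos correspond to roughly 360 multi-view recordings. Both are estimates and may not match the official per-recording counts.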

What is Assembly101?

Assembly101 records procedural assembly tasks end to end. Participants assemble and disassemble take-apart toys while being recorded from 12 synchronized cameras -- 8 fixed external views and 4 head-mounted egocentric cameras. The multi-view setup captures hand-object interactions from many angles at once, which reduces occlusion and supports 3D analysis.

The annotations are layered: 100K+ coarse action labels, 1M+ fine-grained temporal segments, and 3D hand pose estimates. This makes Assembly101 well suited to manufacturing robotics research, where robots need to understand assembly sequences, detect errors, and plan corrective actions.
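
The relationship between coarse labels and fine-grained temporal segments can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the Segment class, its field names, the frame-based boundaries, and the example labels are hypothetical and do not reflect the dataset's actual annotation files, which should be checked on the project page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A labeled temporal segment (hypothetical schema, not the dataset's own)."""
    start_frame: int
    end_frame: int  # inclusive
    label: str


def fine_segments_within(coarse, fine_segments):
    """Return the fine-grained segments that lie fully inside a coarse segment."""
    return [
        seg
        for seg in fine_segments
        if seg.start_frame >= coarse.start_frame
        and seg.end_frame <= coarse.end_frame
    ]


# Made-up example: one coarse step and three fine-grained actions.
coarse_step = Segment(100, 400, "attach wheel")
fine_actions = [
    Segment(100, 180, "pick up wheel"),
    Segment(181, 390, "screw wheel"),
    Segment(401, 450, "pick up cabin"),  # belongs to the next coarse step
]

inside = fine_segments_within(coarse_step, fine_actions)
print([seg.label for seg in inside])
```

Grouping fine-grained actions under their coarse step in this way is one simple route to sequence-level reasoning, for example checking whether an expected action is missing from a step.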

Access

Note: CC-BY-NC-4.0 license. Free for research, not for commercial use.

Visit Project Page

Related datasets

  • EPIC-KITCHENS -- egocentric kitchen activities
  • Ego4D -- massive egocentric dataset from Meta
  • RH20T -- multi-modal manipulation with force/tactile

Manufacturing automation data

We collect custom assembly and disassembly data for manufacturing robotics applications.