Custom Robot Teleoperation Data Collection

Specify the robot, the task list, the modalities, and the scale. We collect, label, QA, and deliver imitation learning data in the format your training stack expects. 24-hour teleoperation coverage at our San Francisco lab, from $0.50 per episode for simple tasks, or $500 per task episode for novel embodiments on customer-shipped robots.

Trusted at scale

10M+
episodes delivered to date

50+
trained teleoperators on staff

ISO
quality management system

7+
robot embodiments in rotation

Ready-to-License Dataset Packs

If you need imitation learning data today, start with one of our curated dataset packs. Each pack ships cleaned, labeled, and ready for OpenVLA / Octo / Diffusion Policy fine-tuning.

Bimanual Manipulation Pack

20,000 dual-arm episodes across 50 tabletop tasks. ALOHA 2 and bimanual Panda coverage, RGB + wrist cameras, language-annotated.

From $5,000

Humanoid Locomotion Pack

50,000 locomotion trajectories on Unitree G1/H1 across 30 terrains. Joint torques, IMU, full-body keypoints, RGB-D.

From $10,000

Dexterous Grasping Pack

10,000 multi-finger grasp trajectories on Shadow Hand and LEAP Hand across 200 objects. Tactile + force-torque + RGB-D.

From $5,000

VLA Evaluation Pack

2,000 held-out evaluation rollouts on Franka and WidowX, distribution-matched to BridgeData V2 and LIBERO. Ideal for benchmarking a new VLA without contaminating your training set.

From $3,000

Commission a Custom Campaign

Every custom project follows a four-step flow. Typical total turnaround is 2-6 weeks depending on scale.

Spec. You share a task brief — robot, objects, scenes, modalities, success criteria, delivery format. We return a detailed data collection plan and a fixed quote within 24 hours.
Pilot. We collect ~100 episodes against your spec, train a sanity-check Diffusion Policy on the pilot data, and deliver raw episodes plus a validation report. You approve, tweak, or kill the spec.
Scale. Once the pilot is approved, we run 24-hour teleoperation coverage to your target episode count with continuous QA and weekly drop deliveries.
Delivery. The final dataset ships in RLDS, Hugging Face Datasets, LeRobot, or a custom schema, together with a CHANGELOG, per-episode metadata, and a reproducible training config.

Data Quality Standards

Every episode goes through a three-stage quality pipeline: automatic heuristic filters (action clipping, sensor dropout, episode length), manual review by a second teleoperator, and a final policy-based sanity check. Fewer than 0.5% of delivered episodes require post-hoc rework. See our full data services page for protocol details, inter-rater agreement numbers, and sample validation reports.
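As a rough illustration of the first stage, the automatic heuristic filters could be sketched in Python as below. The Episode structure, the function names, and every threshold are assumptions chosen for illustration, not production values; actions are assumed to be normalized to [-1, 1].

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Episode:
    """Hypothetical minimal episode record used only for this sketch."""
    actions: List[Sequence[float]]  # one action vector per timestep, assumed in [-1, 1]
    timestamps: List[float]         # sensor timestamps in seconds, one per timestep


def clipped_fraction(ep: Episode, limit: float = 1.0, tol: float = 1e-6) -> float:
    """Fraction of action values sitting at (or beyond) the assumed actuator limit."""
    values = [abs(v) for step in ep.actions for v in step]
    if not values:
        return 0.0
    return sum(v >= limit - tol for v in values) / len(values)


def max_sensor_gap(ep: Episode) -> float:
    """Largest gap in seconds between consecutive sensor timestamps."""
    gaps = [b - a for a, b in zip(ep.timestamps, ep.timestamps[1:])]
    return max(gaps, default=0.0)


def heuristic_flags(
    ep: Episode,
    min_steps: int = 20,          # illustrative threshold
    max_steps: int = 2000,        # illustrative threshold
    max_clip_frac: float = 0.05,  # illustrative threshold
    max_gap_s: float = 0.2,       # illustrative threshold
) -> List[str]:
    """Return the names of the heuristic checks this episode fails."""
    flags = []
    if not (min_steps <= len(ep.actions) <= max_steps):
        flags.append("episode_length")
    if clipped_fraction(ep) > max_clip_frac:
        flags.append("action_clipping")
    if max_sensor_gap(ep) > max_gap_s:
        flags.append("sensor_dropout")
    return flags
```

In a flow like the one described, an episode with an empty flag list would move on to manual review, while a flagged episode would be re-collected or inspected first.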

Licensing & Formats

You choose the delivery format:

RLDS / TFDS. Drop-in compatible with the Open X-Embodiment mix and the Octo / OpenVLA training scripts.
Hugging Face Datasets. Parquet-backed, streamable, versioned, ready for a private HF repo.
LeRobot. The Hugging Face LeRobot schema for community-friendly sharing.
Custom schema. Your own HDF5 / Zarr / ROS bag layout — we match whatever your existing pipeline expects.

All custom-collected data is licensed to you exclusively by default. Non-exclusive licensing (to help offset collection cost) is available on request at a discount.

Pricing Guide

Task complexity	Price per episode	Typical use
Simple pick-and-place, 5s episodes	$0.50 - $1	VLA pretraining volume
Mid-complexity manipulation, 15-30s	$1 - $3	Standard imitation learning
Long-horizon, bimanual, or contact-rich	$3 - $5	Production policy fine-tuning
Humanoid / dexterous, full body	$5 - $15	Whole-body policy training
Novel embodiment (customer-shipped robot)	$500 per task episode	Bespoke R&D collection

Volume discounts kick in above 10,000 episodes. Full project budgets typically range from $5,000 for a pilot to $50,000+ for a production campaign.
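For a quick back-of-the-envelope budget, the per-episode ranges in the table can be multiplied by a target episode count. The Python sketch below does only that; the tier keys and the estimate_budget helper are hypothetical names, and volume discounts are deliberately not modeled because the discount rate is not listed here.

```python
# Per-episode price ranges (USD) copied from the pricing table above.
# The dictionary keys are hypothetical labels for the table rows.
PRICE_BANDS = {
    "simple_pick_and_place": (0.50, 1.00),
    "mid_complexity": (1.00, 3.00),
    "long_horizon_bimanual_contact_rich": (3.00, 5.00),
    "humanoid_dexterous": (5.00, 15.00),
    "novel_embodiment": (500.00, 500.00),
}


def estimate_budget(tier: str, episodes: int) -> tuple:
    """Return (low, high) list-price estimates in USD, before any volume discount."""
    if episodes < 0:
        raise ValueError("episodes must be non-negative")
    low, high = PRICE_BANDS[tier]
    return episodes * low, episodes * high


# Example: 20,000 mid-complexity episodes
low, high = estimate_budget("mid_complexity", 20_000)
print(f"${low:,.0f} to ${high:,.0f}")  # prints "$20,000 to $60,000"
```

Treat the result as a list-price envelope only; the fixed quote returned after the Spec step is the authoritative number.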

Customer Logos & Case Studies

We work with leading robot foundation model labs, humanoid startups, and academic research groups. Full case studies are available under NDA on request. Representative outcomes:

Collected 120,000 bimanual episodes for a Series-A humanoid startup in 8 weeks.
Delivered a 30,000-episode BridgeData-style WidowX pack for a VLA research group at a top-5 university.
Ran a 6-week dexterous grasping campaign on a customer-shipped robot hand, producing 8,000 tactile-annotated episodes.

Start with open data instead

Not sure you need custom collection yet? Start with one of the open datasets below — if you still need more coverage afterwards, come back here.

Open X-Embodiment: 1M+ cross-robot trajectories for VLA pretraining
BridgeData V2: 60K WidowX demos with language
DROID: 76K Franka episodes across 564 scenes
LIBERO: about 6.5K lifelong learning demos across 130 simulated tasks
CALVIN: long-horizon language-conditioned sim
ALOHA: bimanual real-world teleoperation
Robomimic: canonical BC benchmark
RoboNet: cross-robot video prediction
MimicGen: synthetic data augmentation

See the full datasets hub for an up-to-date directory.

Request a Quote

Tell us what you need. We respond with a fixed quote and a data collection plan within 24 hours.

Ready to collect data at scale?

Rent a robot to prototype in-house, browse our ready-to-license packs, or commission a fully custom collection campaign tuned to your target distribution.
