Custom Robot Teleoperation Data Collection

Spec the robot, the task list, the modalities, and the scale. We collect, label, QA, and deliver imitation learning data in the format your training stack expects. 24-hour teleoperation coverage at our Mountain View lab, from $0.50 per episode for simple tasks ($500 per task episode for customer-shipped novel embodiments).

Trusted at scale

  • 10M+ episodes delivered to date
  • 50+ trained teleoperators on staff
  • ISO quality management system
  • 7+ robot embodiments in rotation

Ready-to-License Dataset Packs

If you need imitation learning data today, start with one of our curated dataset packs. Each pack ships cleaned, labeled, and ready for OpenVLA / Octo / Diffusion Policy fine-tuning.

Bimanual Manipulation Pack

20,000 dual-arm episodes across 50 tabletop tasks. ALOHA 2 and bimanual Panda coverage, RGB + wrist cameras, language-annotated.

From $5,000

Humanoid Locomotion Pack

50,000 locomotion trajectories on Unitree G1/H1 across 30 terrains. Joint torques, IMU, full-body keypoints, RGB-D.

From $10,000

Dexterous Grasping Pack

10,000 multi-finger grasp trajectories on Shadow Hand and LEAP Hand across 200 objects. Tactile + force-torque + RGB-D.

From $5,000

VLA Evaluation Pack

2,000 held-out evaluation rollouts on Franka and WidowX, distribution-matched to BridgeData V2 and LIBERO. Ideal for benchmarking a new VLA without contaminating your training set.

From $3,000

Commission a Custom Campaign

Every custom project follows a four-step flow. Typical total turnaround is 2-6 weeks depending on scale.

  1. Spec. You share a task brief — robot, objects, scenes, modalities, success criteria, delivery format. We return a detailed data collection plan and a fixed quote within 24 hours.
  2. Pilot. We collect ~100 episodes against your spec, train a sanity-check Diffusion Policy on the pilot data, and deliver raw episodes plus a validation report. You approve, tweak, or kill the spec.
  3. Scale. Once the pilot is approved, we run 24-hour teleoperation coverage to your target episode count with continuous QA and weekly drop deliveries.
  4. Delivery. The final dataset ships in RLDS, Hugging Face Datasets, LeRobot, or a custom schema, accompanied by a CHANGELOG, per-episode metadata, and a reproducible training config.

Data Quality Standards

Every episode goes through a three-stage quality pipeline: automatic heuristic filters (action clipping, sensor dropout, episode length), manual review by a second teleoperator, and a final policy-based sanity check. Fewer than 0.5% of delivered episodes require post-hoc rework. See our full data services page for protocol details, inter-rater agreement numbers, and sample validation reports.
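
As a rough illustration of the first stage, the sketch below flags episodes on the three heuristics named above. It is not our production filter. The `Episode` layout, the assumption that actions are normalized to [-1, 1], and every threshold value are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Episode:
    # Hypothetical layout: one action vector per timestep, assumed normalized to [-1, 1].
    actions: List[Sequence[float]]
    # Sensor frame timestamps in seconds, one per timestep.
    timestamps: List[float]


def clipped_fraction(ep: Episode, limit: float = 1.0, tol: float = 1e-6) -> float:
    """Fraction of timesteps where any action dimension sits at the limit."""
    if not ep.actions:
        return 1.0
    hits = sum(1 for a in ep.actions if any(abs(x) >= limit - tol for x in a))
    return hits / len(ep.actions)


def max_frame_gap(ep: Episode) -> float:
    """Largest gap in seconds between consecutive sensor frames."""
    gaps = [b - a for a, b in zip(ep.timestamps, ep.timestamps[1:])]
    return max(gaps, default=0.0)


def heuristic_flags(
    ep: Episode,
    *,
    min_steps: int = 20,        # illustrative threshold
    max_steps: int = 2000,      # illustrative threshold
    max_clipped: float = 0.05,  # illustrative threshold
    max_gap_s: float = 0.2,     # illustrative threshold
) -> List[str]:
    """Return the names of the heuristics this episode fails (empty list = pass)."""
    flags = []
    n = len(ep.actions)
    if n < min_steps or n > max_steps:
        flags.append("episode_length")
    if clipped_fraction(ep) > max_clipped:
        flags.append("action_clipping")
    if max_frame_gap(ep) > max_gap_s:
        flags.append("sensor_dropout")
    return flags
```

An episode that returns an empty list would move on to manual review; anything flagged would be held back or re-collected, depending on the spec.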

Licensing & Formats

You choose the delivery format:

  • RLDS / TFDS. Drop-in compatible with the Open X-Embodiment mix and the Octo / OpenVLA training scripts.
  • Hugging Face Datasets. Parquet-backed, streamable, versioned, ready for a private Hugging Face repo.
  • LeRobot. The Hugging Face LeRobot schema for community-friendly sharing.
  • Custom schema. Your own HDF5 / Zarr / ROS bag layout — we match whatever your existing pipeline expects.
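
To give a feel for what an RLDS-style episode looks like, here is a minimal sketch that turns one teleoperated trajectory into a list of step records. The step keys (`observation`, `action`, `reward`, `discount`, `is_first`, `is_last`, `is_terminal`) follow the RLDS step convention. The function name, the sparse success reward, and the constant discount are assumptions made for this example, not a description of our export code.

```python
from typing import Any, Dict, List, Sequence


def to_rlds_style_steps(
    observations: Sequence[Any],
    actions: Sequence[Any],
    success: bool,
) -> List[Dict[str, Any]]:
    """Build RLDS-style step dicts from aligned observation/action sequences."""
    if len(observations) != len(actions):
        raise ValueError("observations and actions must have the same length")
    if not observations:
        raise ValueError("episode must contain at least one step")

    last = len(observations) - 1
    steps = []
    for t, (obs, act) in enumerate(zip(observations, actions)):
        steps.append(
            {
                "observation": obs,
                "action": act,
                # Assumption: sparse reward, 1.0 only on the final step of a successful episode.
                "reward": 1.0 if (success and t == last) else 0.0,
                # Assumption: undiscounted demonstrations.
                "discount": 1.0,
                "is_first": t == 0,
                "is_last": t == last,
                # Assumption: only successful episodes end in a terminal state.
                "is_terminal": t == last and success,
            }
        )
    return steps
```

In a real export the observations would be dictionaries of camera images and proprioceptive state, and the steps would be written through a TFDS builder rather than kept as Python lists.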

All custom-collected data is licensed to you exclusively by default. Non-exclusive licensing (to help offset collection cost) is available on request at a discount.

Pricing Guide

Task complexity                            | Price per episode     | Typical use
Simple pick-and-place, 5s episodes         | $0.50 - $1            | VLA pretraining volume
Mid-complexity manipulation, 15-30s        | $1 - $3               | Standard imitation learning
Long-horizon, bimanual, or contact-rich    | $3 - $5               | Production policy fine-tuning
Humanoid / dexterous, full body            | $5 - $15              | Whole-body policy training
Novel embodiment (customer-shipped robot)  | $500 per task episode | Bespoke R&D collection

Volume discounts kick in above 10,000 episodes. Full project budgets typically range from $5,000 for a pilot to $50,000+ for a production campaign.
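
For a rough budget range before requesting a quote, the per-episode prices above can be multiplied out as in the sketch below. The tier keys and function name are made up for this example. The discount rate is not published on this page, so it is left as a parameter that defaults to zero, and the novel-embodiment tier is omitted because it is priced per task episode.

```python
from typing import Dict, Tuple

# Per-episode price ranges in USD, copied from the pricing guide above.
PRICE_RANGES_USD: Dict[str, Tuple[float, float]] = {
    "simple": (0.50, 1.00),
    "mid": (1.00, 3.00),
    "long_horizon": (3.00, 5.00),
    "humanoid_dexterous": (5.00, 15.00),
}

# Volume discounts apply above this episode count.
VOLUME_DISCOUNT_THRESHOLD = 10_000


def estimate_budget(
    tier: str,
    episodes: int,
    volume_discount: float = 0.0,  # assumption: actual rate comes from your quote
) -> Tuple[float, float]:
    """Return a (low, high) budget estimate in USD for one complexity tier."""
    if episodes < 0:
        raise ValueError("episodes must be non-negative")
    if not 0.0 <= volume_discount < 1.0:
        raise ValueError("volume_discount must be in [0, 1)")
    low, high = PRICE_RANGES_USD[tier]
    factor = 1.0 - volume_discount if episodes > VOLUME_DISCOUNT_THRESHOLD else 1.0
    return (low * episodes * factor, high * episodes * factor)


if __name__ == "__main__":
    # 20,000 mid-complexity episodes with no discount applied.
    print(estimate_budget("mid", 20_000))  # (20000.0, 60000.0)
```

The result is only a bracket. The fixed quote you receive depends on the full task brief.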

Customer Logos & Case Studies

We work with leading robot foundation model labs, humanoid startups, and academic research groups. Full case studies are available under NDA on request. Representative outcomes:

  • Collected 120,000 bimanual episodes for a Series-A humanoid startup in 8 weeks.
  • Delivered a 30,000-episode BridgeData-style WidowX pack for a VLA research group at a top-5 university.
  • Ran a 6-week dexterous grasping campaign on a customer-shipped robot hand, producing 8,000 tactile-annotated episodes.

Start with open data instead

Not sure you need custom collection yet? Start with one of the open datasets below — if you still need more coverage afterwards, come back here.

  • Open X-Embodiment — 1M+ cross-robot trajectories for VLA pretraining
  • BridgeData V2 — 60K WidowX demos with language
  • DROID — 76K Franka episodes across 564 scenes
  • LIBERO — 6.5K lifelong learning demos (130 tasks, 50 demos each)
  • CALVIN — long-horizon language-conditioned sim
  • ALOHA — bimanual real-world teleoperation
  • Robomimic — canonical BC benchmark
  • RoboNet — cross-robot video prediction
  • MimicGen — synthetic data augmentation

See the full datasets hub for an up-to-date directory.

Request a Quote

Tell us what you need. We respond with a fixed quote and a data collection plan within 24 hours.

Ready to collect data at scale?

Rent a robot to prototype in-house, browse our ready-to-license packs, or commission a fully custom collection campaign tuned to your target distribution.