Define Your Teleoperation Use Case First
Teleoperation hardware that is perfect for one task is terrible for another. Before evaluating products, answer four questions that will eliminate most options:
Data collection vs. live remote operation? For training data, you optimize for operator comfort, demo quality, and throughput. For live operation (remote inspection, hazardous environments), you optimize for low latency and operator situational awareness. These requirements pull toward different hardware.
Bimanual or single-arm? Bimanual tasks (folding laundry, assembling boxes, opening jars) require either two coordinated operators, a single operator with dual controllers, or leader-follower hardware designed for bimanual use. Most cheap VR setups handle bimanual awkwardly. See our bimanual setup guide for details.
How much dexterity does your task require? Reaching into a bin and grasping a uniform object needs 6-DOF arm control and a simple gripper. Folding a shirt requires finger-level control. Be honest: most commercial manipulation tasks do not need gloves — a good parallel jaw gripper and a quality leader arm are faster to set up and produce higher-quality data.
What is your demo collection throughput target? To train a functional ACT policy on a single task, you typically need 50–200 demonstrations. At 2 minutes per demo and a 70% success rate, that is roughly 2.5–9.5 hours of operator time. At $150–$500/hour for a skilled operator, operator time dominates total cost, so a setup that raises throughput pays for itself quickly.
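The throughput arithmetic above is worth making explicit, since it drives the build-vs-buy decision for the rest of this guide. A back-of-envelope sketch using the article's numbers (demo length, success rate, and hourly rate are the inputs; nothing here is provider-specific):

```python
def operator_hours(demos_needed, minutes_per_demo=2.0, success_rate=0.7):
    """Total operator time in hours, counting the failed attempts
    needed to reach the target number of successful demos."""
    expected_attempts = demos_needed / success_rate
    return expected_attempts * minutes_per_demo / 60.0

def collection_cost(demos_needed, hourly_rate,
                    minutes_per_demo=2.0, success_rate=0.7):
    """Operator labor cost for a demo-collection run."""
    return operator_hours(demos_needed, minutes_per_demo, success_rate) * hourly_rate
```

For example, 200 demos at 2 min/demo and 70% success is about 9.5 operator-hours; at $150/hour that is roughly $1,400 in labor before any hardware or setup time.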
Leader-Follower Systems
Leader-follower is the gold standard for arm teleoperation data quality. The operator holds a physical leader arm with the same kinematic structure as the follower robot, providing proprioceptive feedback. Data quality is consistently higher than VR-based systems for precision tasks.
| System | Price | Arms | DOF | ACT Compatible | Latency | Notes |
|---|---|---|---|---|---|---|
| ALOHA (Trossen) | $32,000 | Bimanual (2+2) | 6 per arm | Yes (native) | <5 ms | Reference platform for ACT paper; WidowX leaders + ViperX followers |
| Low-Cost ALOHA (DIY) | $20,000 | Bimanual | 6 per arm | Yes | <5 ms | Community build; requires assembly time |
| xArm Leader-Follower | $12,000 | Single arm | 6 | Yes (with adapter) | <10 ms | UFactory gravity-compensated leader arm |
| SO-100 (Lerobot) | $2,500 | Single arm | 6 | Yes (native) | <5 ms | Low-cost 3D-printed leader; growing community |
| GELLO (Custom) | $3,000 | Single arm | 7 | Yes | <5 ms | DIY gravity-comp leader, Franka-compatible |
The ALOHA system remains the most battle-tested leader-follower platform for imitation learning research. Its WidowX-250 leader arms use the same Dynamixel servo family as the ViperX-300 followers, providing near-transparent kinematic mapping. The $32K price includes 4 arms, mounting hardware, and the Interbotix SDK.
For single-arm tasks on a budget, the SO-100 from LeRobot is remarkable value at $2,500. It requires 3D printing and assembly but produces data in the standard LeRobot dataset format, compatible with ACT and Diffusion Policy training pipelines.
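The reason leader-follower data quality is so high is also why the control loop is so simple: each tick just reads the leader's joint angles and commands the follower to match, clamped to the follower's joint limits. A minimal sketch of that per-tick step — the actual servo read/write calls are hardware-specific (e.g. the Dynamixel SDK on ALOHA and SO-100) and are deliberately omitted:

```python
def mirror_step(leader_joints, joint_limits):
    """One teleop tick: clamp leader joint angles (radians) to the
    follower's joint limits and return the follower command.

    leader_joints: joint angles read from the leader arm this tick.
    joint_limits:  list of (lo, hi) tuples, one per follower joint.
    """
    if len(leader_joints) != len(joint_limits):
        raise ValueError("leader/follower DOF mismatch")
    return [min(max(q, lo), hi)
            for q, (lo, hi) in zip(leader_joints, joint_limits)]
```

In a real system this runs at the servo bus rate (hundreds of Hz), which is what keeps leader-follower latency under the 5 ms figures in the table above.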
VR-Based Teleoperation Systems
VR teleoperation trades proprioceptive fidelity for lower hardware cost and flexibility across robot types. A VR system can control arms, humanoids, mobile manipulators, and novel platforms without buying a matched leader arm for each.
Meta Quest 3 ($500) is the dominant choice. The controller tracking at 6-DOF with <20 ms controller latency is sufficient for most manipulation tasks. The key requirement is end-to-end system latency under 150 ms — beyond that, operators cannot coordinate movements accurately. Sources of latency: camera capture (33 ms at 30 fps), video encode (5–15 ms), network (5–50 ms depending on setup), decode (5–10 ms), display (<20 ms on Quest 3). Keep your system local to achieve <100 ms total.
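The latency budget above is simple enough to check in code before you build anything. A sketch using the stage figures from this paragraph (the numbers are the article's upper bounds, not measurements of any particular rig — substitute your own):

```python
# Per-stage latencies in ms, taken from the budget above (upper bounds).
LATENCY_BUDGET_MS = {
    "camera_capture": 33,  # one frame interval at 30 fps
    "video_encode": 15,
    "network": 50,         # WAN worst case; ~5 ms on a local wired link
    "decode": 10,
    "display": 20,         # Quest 3 worst case
}

def total_latency_ms(stages=LATENCY_BUDGET_MS):
    """End-to-end glass-to-glass latency estimate."""
    return sum(stages.values())

def within_budget(stages=LATENCY_BUDGET_MS, budget_ms=150):
    """True if the pipeline stays under the operator-coordination ceiling."""
    return total_latency_ms(stages) <= budget_ms
```

With the worst-case numbers the total is 128 ms — under the 150 ms ceiling but with little margin. Swapping the network stage for a local link (~5 ms) drops the total to 83 ms, which is why the article recommends keeping the system local.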
Common VR teleop software frameworks: AnyTeleop (open source, supports UR/xArm/Franka, ROS2-native), RoboTeleop (commercial, lower setup friction), OpenArm VR module (SVRC open-source, paired with OpenArm 101).
Limitations of VR teleop: no haptic feedback for contact forces means operators cannot feel grasp quality; operators tire faster without proprioceptive confirmation; wrist tracking loses accuracy at extreme joint angles; network dependency makes it unsuitable for high-precision tasks requiring <5 ms latency.
Glove Controllers for Dexterous Tasks
| Glove | Price (per hand) | DOF | Force Feedback | Robot Hand Compatible | Setup Time | Best For |
|---|---|---|---|---|---|---|
| SenseGlove Nova 2 | $4,000 | 5 finger + 6-DOF wrist | Yes (5-DOF force) | Shadow, Inspire, Unitree Dex | 30 min | Research, imitation learning |
| HaptX G1 | $10,000 | 5 finger + wrist | Yes (microfluidic) | Shadow Hand | 1–2 hr | Highest fidelity haptics |
| Inspire Hand Glove | $3,000 | 5 finger | No | Inspire RH56 (native) | 15 min | Cost-effective Inspire pairing |
| ROKOKO Smartglove | $500 | Finger position | No | Position mapping only | 5 min | Motion capture, animation |
| Dexmo (Dexta Robotics) | $6,000 | 11-DOF per hand | Yes (5-DOF) | Multiple | 45 min | Full-hand force feedback |
Gloves are justified only if your robot hand can execute the same dexterity. Pairing a $4K SenseGlove with a parallel jaw gripper is wasteful — use a leader arm instead. Gloves make sense when you have a dexterous hand (Shadow, Inspire RH56, Unitree Dex3, Leap Hand) and tasks requiring finger-level control: in-hand manipulation, garment handling, small parts assembly. See our dedicated glove teleoperation guide for setup details.
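Most glove-to-hand pairings outside the native ones in the table require a retargeting step: the glove reports per-finger flexion, which must be mapped onto each robot-hand joint's range. A minimal linear-retargeting sketch — the function name and normalized-flexion interface are illustrative assumptions, since each glove and hand ships its own SDK and calibration routine:

```python
def retarget_finger(flexion, joint_range):
    """Map a normalized glove flexion reading in [0, 1] onto a
    robot-hand joint range (lo, hi), in radians.

    Hypothetical interface: real glove SDKs differ in units and
    calibration, so treat this as the shape of the mapping only.
    """
    lo, hi = joint_range
    f = min(max(flexion, 0.0), 1.0)  # clamp noisy sensor readings
    return lo + f * (hi - lo)
```

Per-finger calibration (recording each operator's open and closed poses) typically replaces the raw [0, 1] assumption in practice.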
Full-Service vs. DIY: Cost Comparison
| Approach | Cost per Demo | Setup Time | Data Quality | Turnaround | Data Format | IP Ownership |
|---|---|---|---|---|---|---|
| SVRC Full-Service | $25–$80 | None | High (QA reviewed) | 1–2 weeks | HDF5 / LeRobot / RLDS | 100% yours |
| DIY (researcher) | $5–$15 (labor only) | 4–12 weeks | Variable | Ongoing | Custom | 100% yours |
| Academic crowdsource | $8–$20 | 2–4 weeks setup | Variable | 3–6 weeks | Variable | Depends on agreement |
| Scale AI Robotics | $40–$150 | Contract process | High | 4–8 weeks | Custom | Verify contract carefully |
DIY is cheapest per-demo if you already have hardware, trained operators, and bandwidth. Full-service is cheapest when you factor in setup time and opportunity cost — a $50/demo rate for 200 demos ($10K) is often less expensive than 10 weeks of a senior researcher's time building and operating a teleop setup.
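The break-even argument above is easy to run with your own numbers. A sketch that compares the two totals; the weekly researcher cost in the example is a hypothetical figure, not a rate quoted anywhere in this guide:

```python
def diy_total_cost(n_demos, per_demo_labor, setup_weeks, weekly_researcher_cost):
    """DIY = per-demo operator labor plus the researcher time
    spent building and maintaining the teleop rig."""
    return n_demos * per_demo_labor + setup_weeks * weekly_researcher_cost

def full_service_total_cost(n_demos, per_demo_rate):
    """Full-service = flat per-demo rate, no setup cost."""
    return n_demos * per_demo_rate
```

At 200 demos, $50/demo full-service is $10K; DIY at $10/demo in operator labor only beats that if the rig takes well under a couple of weeks of (hypothetically) $3K/week researcher time to stand up.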
Provider Evaluation Checklist
- What robots do they operate? A provider running only one arm type gives you limited generalization data.
- What is their success/rejection rate? Demand the number. A 20% rejection rate means you pay for failed demos. SVRC targets <10% rejection with QA review on every episode.
- Do they provide sample data before commitment? Any credible provider can share 5–10 sample episodes in your target format before you sign.
- What data format do they deliver? HDF5, LeRobot Parquet, and RLDS are the standard formats. Proprietary formats are a red flag — you will be locked in.
- Who owns the data? Your contract must state that all collected data is your IP and the provider may not use it to train their own models. Non-negotiable.
- Can they handle your specific task? Ask for a task feasibility assessment. Some tasks require specialized hardware or workspace setup that not all providers support.