Build Guide · 2026

Cost of Building a Teleoperation Rig: Full Breakdown

Three teleop architectures, three price points. Bimanual ALOHA lands near USD 22,000 in parts, a VR-based setup can fit under USD 8,000, and a glove-based dexterous rig sits between the two. Here is the complete itemized cost breakdown, assembly time, and the turnkey alternative when you do not want to build from parts.

TL;DR. Bimanual ALOHA-style rig: ~USD 20k–25k in parts, 40–80 engineer-hours of build time. VR teleop (Quest 3 + commercial arm bridge): USD 3k–8k of new hardware beyond the arm. Glove-based dexterous teleop (Manus/Rokoko): USD 8k–15k. Add USD 10k–15k for a Mobile ALOHA base. If build time is the constraint, a turnkey rig from SVRC saves months.

Three teleoperation architectures

Before you spend anything, pick the architecture that fits your research agenda. The three popular options as of early 2026 are:

  • Leader-follower bimanual (ALOHA). Two small leader arms driven by hand, each joint-copied to a matching follower arm on a shared task surface. Best for high-fidelity bimanual manipulation data collection.
  • VR controller-based. A Meta Quest 3 (or similar HMD) provides hand pose and button events, mapped to the end-effector of a commercial arm such as a Franka FR3, UR5e or OpenArm. Good for unimanual workflows and immersive operator feedback.
  • Glove-based dexterous. A Manus or Rokoko motion-capture glove streams finger-level pose to a dexterous hand (ORCA, Allegro, Inspire). Required when your research depends on individual-finger control.

Most labs end up with two of these on the same bench, not one. Plan for that.

Option A: Bimanual ALOHA-style rig ($20k–$25k)

ALOHA (A Low-cost Open-source Hardware System for Bimanual Teleoperation) has become the default data-collection architecture for imitation learning in 2024–2026. A faithful build uses four Trossen WidowX 250 arms — two as leaders, two as followers — and four USB cameras on a shared task surface.

Itemized breakdown — bimanual ALOHA rig

| Component | Qty | Unit cost (USD) | Subtotal |
| --- | --- | --- | --- |
| WidowX 250 leader arm | 2 | $4,000 | $8,000 |
| WidowX 250 follower arm (with gripper) | 2 | $4,000 | $8,000 |
| Logitech Brio 4K USB camera | 4 | $200 | $800 |
| Intel RealSense D435 (optional, depth) | 1 | $450 | $450 |
| Workstation PC (RTX 4090, 64 GB RAM) | 1 | $3,500–$5,000 | $3,500–$5,000 |
| Powered USB 3 hub (10+ port) | 1 | $80 | $80 |
| 80/20 aluminum extrusion frame + mounting plates | 1 set | $400–$800 | $400–$800 |
| Task-surface table (36×24 in) | 1 | $150–$400 | $150–$400 |
| Cabling, zip ties, Velcro, labels | 1 kit | $120 | $120 |
| Foot pedal (episode start/stop) | 1 | $60 | $60 |

Cost summary — ALOHA rig

  • Low: $20,000 (parts only, self-build)
  • Typical: $22,500 (full build + workstation)
  • High: $28,000 (turnkey + depth camera)
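
The itemized rows can be totaled mechanically, which makes it easy to re-run the budget when a price changes. The sketch below copies quantities and unit prices from the breakdown above; the tuple layout and the `bom_total` helper are illustrative names, not part of any ALOHA tooling. By this arithmetic the required rows sum to roughly USD 21,100–23,300 (about USD 450 more with the optional depth camera), so treat the rounded summary figures as approximate.

```python
# Line items from the ALOHA breakdown: (name, qty, unit_low, unit_high, optional).
BOM = [
    ("WidowX 250 leader arm", 2, 4000, 4000, False),
    ("WidowX 250 follower arm (with gripper)", 2, 4000, 4000, False),
    ("Logitech Brio 4K USB camera", 4, 200, 200, False),
    ("Intel RealSense D435 (depth)", 1, 450, 450, True),
    ("Workstation PC", 1, 3500, 5000, False),
    ("Powered USB 3 hub", 1, 80, 80, False),
    ("80/20 frame + mounting plates", 1, 400, 800, False),
    ("Task-surface table", 1, 150, 400, False),
    ("Cabling kit", 1, 120, 120, False),
    ("Foot pedal", 1, 60, 60, False),
]


def bom_total(items, include_optional):
    """Return (low, high) totals in USD, optionally skipping optional rows."""
    kept = [row for row in items if include_optional or not row[4]]
    low = sum(qty * unit_low for _, qty, unit_low, _, _ in kept)
    high = sum(qty * unit_high for _, qty, _, unit_high, _ in kept)
    return low, high


if __name__ == "__main__":
    print("required only:", bom_total(BOM, include_optional=False))
    print("with depth camera:", bom_total(BOM, include_optional=True))
```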

Lead time for WidowX arms is typically 4–8 weeks from the manufacturer; SVRC carries stock on several configurations and can often ship faster. See our bimanual teleop hardware setup guide for the reference wiring diagram. Plan 40–80 hands-on engineer hours for the build, most of which is frame assembly, calibration and cable management.

Option B: VR-based teleoperation ($3k–$8k beyond the arm)

The cheapest path to teleoperation in 2026 is a Meta Quest 3 ($500) wired (or streamed) to a workstation running an open-source VR-to-arm bridge. You pay very little for the teleoperation hardware itself; the arm and workstation are where the cost lives.

Itemized breakdown — VR teleop rig

| Component | Qty | Unit cost (USD) | Subtotal |
| --- | --- | --- | --- |
| Meta Quest 3 headset | 1 | $500 | $500 |
| SteamVR base stations (optional, for pose anchoring) | 2 | $150 | $300 |
| HTC Vive trackers (optional) | 2 | $130 | $260 |
| Workstation PC | 1 | $2,500–$4,000 | $2,500–$4,000 |
| Arm mount + task surface + adapters | 1 | $300–$800 | $300–$800 |
| USB cameras (Brio, 2x) | 2 | $200 | $400 |
| Networking / router for low-latency link | 1 | $100–$300 | $100–$300 |

Cost summary: VR teleop rig

  • Low: $3,000 (Quest 3 + minimal setup)
  • Typical: $5,500 (with trackers + workstation)
  • High: $8,000 (full SteamVR + dual cameras)

Add this on top of the arm cost: a Franka FR3 at ~USD 30k, a UR5e at ~USD 25k, or an OpenArm. See our open-source vs commercial platforms TCO guide for the comparison.

Option C: Glove-based dexterous teleoperation ($8k–$15k)

If your research agenda requires individual-finger control (dexterous in-hand manipulation, tool use, or whole-hand grasping studies), a glove is the only one of the three options that delivers it. Popular 2026 gloves are Manus Quantum Metagloves (USD 2,500–5,000 per pair depending on configuration) and Rokoko Smartgloves (USD 2,500–4,000 per pair). These stream fingertip and joint pose at sub-10 ms latency.

Itemized breakdown — glove-based dexterous rig

| Component | Qty | Unit cost (USD) | Subtotal |
| --- | --- | --- | --- |
| Manus or Rokoko glove (pair) | 1 | $2,500–$5,000 | $2,500–$5,000 |
| Dexterous robot hand (Allegro, Inspire, ORCA) | 1 pair | $3,000–$8,000 | $3,000–$8,000 |
| SteamVR base stations for wrist pose | 2 | $150 | $300 |
| Workstation PC | 1 | $2,500–$4,000 | $2,500–$4,000 |
| USB/PoE cameras | 3 | $200 | $600 |
| Wrist-mount adapters and cable harness | 1 | $200–$500 | $200–$500 |

Cost summary: glove-based dexterous rig

  • Low: $8,500 (entry glove + one hand)
  • Typical: $12,000 (full bimanual dexterous)
  • High: $15,000 (premium glove + Allegro hands)

See our glove-based dexterous teleoperation guide for a walkthrough on calibration and latency budgets.

Mobile ALOHA add-on ($10k–$15k on top)

When the research task requires mobility (opening a fridge, moving between stations, navigating a home kitchen), the Mobile ALOHA add-on mounts the bimanual rig on a small wheeled base. The reference base is the AgileX Tracer, available through SVRC for approximately USD 10,000–15,000 depending on configuration.

| Component | Unit cost (USD) |
| --- | --- |
| AgileX Tracer mobile base | $10,000–$15,000 |
| Additional LiDAR for base (2D or 3D) | $800–$3,500 |
| Onboard battery extension | $500–$1,500 |
| Re-mount kit for ALOHA arms onto base | $400–$1,200 |

Our Mobile ALOHA setup guide covers the mechanical and electrical integration; Mobile ALOHA at the SVRC Store is the turnkey option.

Step-by-step: build a bimanual ALOHA rig

  1. Order parts. Place the WidowX arms order first; it has the longest lead time. Order cameras and workstation in parallel.
  2. Build the frame. Cut and assemble 80/20 extrusion for a shared task surface with mount plates for followers and a separate operator bench.
  3. Mount arms and cameras. Install two follower arms facing forward on the task surface, two leader arms on the operator bench, and three to four cameras (2 wrist + 1 front + 1 top).
  4. Wire and calibrate. Connect each arm’s USB cable to a powered hub on the workstation. Calibrate joint zero positions. Confirm that end-to-end latency on the leader-follower loop stays under 20 ms.
  5. Run first data session. Record 30 episodes of a simple task. Export to LeRobot or HDF5. Train a small ACT policy and deploy it back to validate the loop.
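
The latency check in step 4 can be scripted once you have paired timestamps for a leader motion and the matching follower response. The sketch below is a minimal, hypothetical helper: how you capture the timestamps (for example, from your own teleoperation log) is an assumption, and `latency_report` is an illustrative name, not part of LeRobot or the ALOHA reference code. It reports the median and a nearest-rank 95th percentile against the 20 ms target.

```python
import math
import statistics


def latency_report(leader_ts, follower_ts, budget_ms=20.0):
    """Summarize leader-to-follower delay from paired timestamps (seconds)."""
    if len(leader_ts) != len(follower_ts) or len(leader_ts) < 2:
        raise ValueError("need at least two paired samples")
    delays_ms = sorted(
        (follower - leader) * 1000.0
        for leader, follower in zip(leader_ts, follower_ts)
    )
    # Nearest-rank 95th percentile: smallest delay covering 95% of samples.
    rank = math.ceil(0.95 * len(delays_ms))
    p95 = delays_ms[rank - 1]
    return {
        "median_ms": statistics.median(delays_ms),
        "p95_ms": p95,
        "within_budget": p95 <= budget_ms,
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    leader = [0.0, 0.1, 0.2, 0.3]
    follower = [0.012, 0.115, 0.214, 0.318]
    print(latency_report(leader, follower))
```

Judging the loop on a high percentile rather than the mean is a design choice: occasional slow frames are what operators actually feel.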

See our camera setup for teleoperation and data collection guides for detail.

Build vs buy vs lease

Our rough rule: if the team has one ROS-literate integrator with 80 free hours, build. If not, buy turnkey. A turnkey SVRC ALOHA typically ships at a 20–30 percent premium over raw parts but saves 60–120 hours of integration. At a USD 150/hr loaded engineer rate, those hours are worth USD 9,000–18,000, which is more than the premium on a typical build.
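
To test the rule against your own numbers, the comparison reduces to one subtraction. The sketch below assumes the USD 22,500 "typical" parts figure as the base for the premium; `turnkey_net_saving` is an illustrative helper, and the result ignores lead time and support, which usually also favor turnkey.

```python
def turnkey_net_saving(parts_cost, premium_pct, hours_saved, hourly_rate):
    """Labor value saved minus the turnkey premium, in USD.

    A positive result means turnkey is cheaper once engineer time is priced in.
    """
    premium = parts_cost * premium_pct / 100
    labor_value = hours_saved * hourly_rate
    return labor_value - premium


if __name__ == "__main__":
    # Least favorable case for turnkey: highest premium, fewest hours saved.
    print(turnkey_net_saving(22_500, 30, 60, 150))
    # Most favorable case: lowest premium, most hours saved.
    print(turnkey_net_saving(22_500, 20, 120, 150))
```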

For short projects or for evaluation before commitment, leasing an ALOHA from SVRC starts around USD 1,500/month and includes support. For custom teleoperation data collection as a service, see data services.
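
A quick way to compare leasing with buying is to ask after how many months the lease payments would equal the purchase price. This is a rough sketch only: it assumes the USD 1,500/month starting rate and the USD 22,500 typical build cost quoted in this guide, and it ignores resale value, included support, and financing. `lease_break_even_months` is an illustrative name.

```python
def lease_break_even_months(purchase_price, monthly_lease):
    """Months of leasing after which cumulative payments equal the purchase price."""
    if monthly_lease <= 0:
        raise ValueError("monthly_lease must be positive")
    return purchase_price / monthly_lease


if __name__ == "__main__":
    print(lease_break_even_months(22_500, 1_500))
```

Projects shorter than the break-even point lean toward leasing; longer ones lean toward owning.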

Latency, throughput and where the rig limits your research

After the hardware is on the bench, the single most important quality metric for a teleoperation rig is end-to-end latency. Leader-follower rigs like ALOHA can hit 10–20 ms round-trip from operator joint motion to follower response, which feels immediate and enables subtle manipulation. VR-based rigs with a USB-tethered Quest 3 typically hit 20–35 ms. Glove-based rigs hit 15–25 ms on the hand pose alone but wrist pose from SteamVR adds another 10 ms. Above roughly 40 ms, fine manipulation tasks begin to suffer.
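
The figures above can be turned into a simple headroom check against the roughly 40 ms point where fine manipulation starts to degrade. The sketch assumes that glove hand-pose latency and SteamVR wrist-pose latency add in series, which is the simplest reading of "adds another 10 ms"; the dictionary keys and the `headroom_ms` helper are illustrative.

```python
# (best_ms, worst_ms) per architecture, taken from the paragraph above.
LATENCY_MS = {
    "leader-follower (ALOHA)": (10, 20),
    "VR, USB-tethered Quest 3": (20, 35),
    # Assumption: hand-pose and wrist-pose delays are additive.
    "glove + SteamVR wrist pose": (15 + 10, 25 + 10),
}
FINE_MANIPULATION_LIMIT_MS = 40


def headroom_ms(ranges, limit):
    """Worst-case margin (ms) left under the limit for each architecture."""
    return {name: limit - worst for name, (_, worst) in ranges.items()}


if __name__ == "__main__":
    margins = headroom_ms(LATENCY_MS, FINE_MANIPULATION_LIMIT_MS)
    for name, margin in margins.items():
        print(f"{name}: {margin} ms of headroom")
```

Under these assumptions the leader-follower rig keeps a wide margin, while the VR and glove rigs leave only a few milliseconds for network or rendering overhead.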

A second limit is throughput: how many episodes a single operator can collect in an hour. In practice, ALOHA operators typically produce 30–60 episodes per hour for short tasks (pick and place) and 10–20 for longer tasks (folding, tool use). At 40 episodes/hour averaged across a 6-hour shift, one operator generates 240 episodes per day. A thousand-episode dataset (typical for a small VLA fine-tune) takes about four to five operator-days. This is the right framing for budgeting data-collection programs; see our data collection guide for details.
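
The same arithmetic works for any task mix. The sketch below uses the 6-hour shift from the paragraph above as a default; `operator_days` is an illustrative helper, and real schedules will lose some time to resets and failed episodes.

```python
import math


def operator_days(target_episodes, episodes_per_hour, hours_per_shift=6):
    """Return (episodes per day, operator-days needed) for one operator."""
    per_day = episodes_per_hour * hours_per_shift
    return per_day, target_episodes / per_day


if __name__ == "__main__":
    per_day, days = operator_days(1000, 40)
    print(per_day, "episodes/day;", math.ceil(days), "shifts to reach 1000")
    # Longer tasks at the low end of the 10-20 episodes/hour range.
    print(operator_days(1000, 10))
```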

A third limit is operator ergonomics. A badly arranged leader-follower bench will burn out operators in a few hours. Our recommended layout puts leader arms on an adjustable-height bench, followers at a matched height on the task surface, and a chair with good lumbar support; budget USD 600–1,200 for operator furniture, and you will not regret it.

Common mistakes that blow the budget

  • Under-speccing the workstation. A cheap PC without enough USB bandwidth will drop frames and silently corrupt your training data. Always use a powered USB 3 hub per arm and test at full camera frame rate before collecting real episodes.
  • Buying one-off cameras. Mixed camera brands cause synchronization headaches. Buy four identical cameras from the same manufacturing batch.
  • Skipping the frame. 80/20 extrusion feels optional but without it the arms drift relative to each other and the task surface across weeks of use.
  • Ignoring lead time. Orders of four or more WidowX arms can take 8–12 weeks to arrive. Place the order first and build the rest of the rig in parallel.
  • Hand-rolling software. Use LeRobot or the ALOHA reference code rather than writing your own teleoperation bridge — the community implementations handle edge cases you do not want to rediscover.
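
For the USB bandwidth point in the list above, a back-of-envelope estimate shows why camera streams are the usual bottleneck. The sketch assumes four cameras streaming uncompressed YUY2 video (2 bytes per pixel) at 640×480 and 30 fps; these settings are illustrative, and cameras configured for MJPEG will use less. The bus figures are nominal signaling rates, so usable throughput is lower in practice.

```python
USB2_MB_PER_S = 480 / 8    # USB 2.0 high-speed: 480 Mbit/s nominal
USB3_MB_PER_S = 5000 / 8   # USB 3.x Gen 1: 5 Gbit/s nominal


def raw_stream_mb_per_s(width, height, fps, bytes_per_pixel=2):
    """Uncompressed video bandwidth for one camera, in MB/s (1 MB = 1e6 bytes)."""
    return width * height * fps * bytes_per_pixel / 1e6


def total_mb_per_s(num_cameras, width, height, fps, bytes_per_pixel=2):
    """Combined bandwidth when all cameras share one host controller."""
    return num_cameras * raw_stream_mb_per_s(width, height, fps, bytes_per_pixel)


if __name__ == "__main__":
    total = total_mb_per_s(4, 640, 480, 30)
    print(f"4 cameras: {total:.1f} MB/s")
    print("fits USB 2.0:", total < USB2_MB_PER_S)
    print("fits USB 3 Gen 1:", total < USB3_MB_PER_S)
```

Even at this modest resolution, four raw streams exceed what a single USB 2.0 link can carry, which is why the full-frame-rate test belongs before the first real collection session.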

Cost per collected episode: a useful TCO frame

Divide your total rig TCO by the expected number of collected episodes to get cost per episode. For a USD 22,500 ALOHA build amortized over 5,000 episodes (a realistic first-year output), the hardware cost per episode is USD 4.50. Add the operator cost (at USD 30/hr loaded and 40 episodes/hr, USD 0.75/episode) and total cost per episode lands at about USD 5.25. For an SO-100-based micro-rig, cost per episode can fall below USD 2. For a Mobile ALOHA build, it can rise above USD 15. This framing is what the larger AI labs use internally to budget data programs; use it to sanity-check any teleop purchase before signing the PO.
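
The calculation is small enough to keep in a script next to the purchase request. The sketch below reproduces the ALOHA example from the paragraph above; `cost_per_episode` is an illustrative helper, and it deliberately leaves out maintenance, floor space, and failed episodes.

```python
def cost_per_episode(rig_cost, episodes, operator_rate, episodes_per_hour):
    """Return (hardware, operator, total) cost per episode in USD."""
    hardware = rig_cost / episodes
    operator = operator_rate / episodes_per_hour
    return hardware, operator, hardware + operator


if __name__ == "__main__":
    hardware, operator, total = cost_per_episode(22_500, 5_000, 30, 40)
    print(f"hardware {hardware:.2f} + operator {operator:.2f} = {total:.2f} USD")
```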

Frequently asked questions

How much does a bimanual teleoperation rig cost to build?

About USD 20,000–25,000 in parts for an ALOHA-style rig. VR rigs can be done for USD 3,000–8,000 beyond the arm; glove rigs run USD 8,000–15,000.

Is it cheaper to buy a pre-built rig?

Turnkey typically runs a 20–30% premium but saves 60–120 engineer-hours. At USD 150/hr it breaks even almost immediately.

What is Mobile ALOHA and what does it add?

Mobile ALOHA adds an AgileX Tracer mobile base under the bimanual rig. It costs USD 10k–15k on top of a standard ALOHA.

What cameras should I use?

Three to four Logitech Brio 4K USB cameras ($200 each) for standard work; add a RealSense D435 for depth-aware policies.

How long does it take to build?

40–80 engineer hours of hands-on time plus 4–8 weeks of arm lead time. Order arms first.

Can I use VR controllers for teleoperation?

Yes. A Quest 3 with an open-source bridge is the cheapest path, but it gives up some dexterity compared with leader-follower setups.

What is the minimum workstation?

For data capture: RTX 4060, 32 GB RAM. For training on the same box: RTX 4090 or 5090, 64+ GB RAM.

Next steps

Ready to buy a turnkey rig? ALOHA at the SVRC Store. Want a mobile variant? Mobile ALOHA. Need someone else to run the data collection? SVRC teleop data services. Hedging your commitment? Lease a rig first. Related reading: research humanoid cost, open-source vs commercial TCO, leasing vs buying ROI.