Why Cost Tracking Matters
Per useful data point, robot demonstration data is the most expensive input to machine learning in any AI domain. Yet most teams building robot learning systems do not track the true all-in cost of their demonstrations: they estimate operator time, ignore overhead, and end up 3-5× over budget. This post builds a complete cost model from first principles.
Cost Component 1: Operator Labor
Operator labor is the largest variable cost. For simple pick-place tasks (L1), an experienced operator can complete a demonstration in 2-4 minutes including setup and reset; varied-pose pick-place (L2) takes 4-8 minutes. For contact-rich precision assembly (L3), 8-15 minutes per demonstration is typical. For dexterous manipulation (L4), 15-25 minutes.
Fully loaded operator cost (wages plus employer overhead, benefits, and management) typically runs $25-45/hr. At $30/hr, the labor cost per demonstration is:
| Task Type | Demo Duration | Cost at $30/hr |
|---|---|---|
| L1: Simple pick-place | 2–4 min | $1.00–2.00 |
| L2: Varied-pose pick-place | 4–8 min | $2.00–4.00 |
| L3: Contact-rich assembly | 8–15 min | $4.00–7.50 |
| L4: Dexterous manipulation | 15–25 min | $7.50–12.50 |
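The per-demo figures in the table are simply duration multiplied by hourly rate. A minimal Python sketch, assuming the duration ranges from the table; the function and dictionary names are illustrative and not taken from any existing tooling:

```python
# Duration ranges in minutes, copied from the table above.
DEMO_MINUTES = {
    "L1": (2, 4),
    "L2": (4, 8),
    "L3": (8, 15),
    "L4": (15, 25),
}


def labor_cost(minutes: float, hourly_rate: float = 30.0) -> float:
    """Operator labor cost in dollars for one demo of the given duration."""
    return minutes * hourly_rate / 60.0


def labor_cost_range(level: str, hourly_rate: float = 30.0) -> tuple[float, float]:
    """Low and high labor cost per demo for a task level."""
    low_min, high_min = DEMO_MINUTES[level]
    return labor_cost(low_min, hourly_rate), labor_cost(high_min, hourly_rate)


for level in DEMO_MINUTES:
    low, high = labor_cost_range(level)
    print(f"{level}: ${low:.2f}-${high:.2f} per demo at $30/hr")
```

Passing `hourly_rate=45.0` shows the top of the fully loaded range: an L3 demo then costs $6.00-11.25 in labor.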
Cost Component 2: QA and Annotation
Raw demonstrations require QA (checking success, smoothness, completeness) and sometimes annotation (labeling subtask boundaries, object states, grasp types). At minimum, every demonstration requires 30-60 seconds of QA review — longer for failed episodes that need to be flagged.
At $15-25/hr for a QA reviewer, that works out to $0.12-0.42/demo for a quick pass-fail review and $0.50-2.50/demo for detailed annotation with subtask labels. For datasets requiring rich annotation (grasp strategy labels, object state tracking), annotation cost can equal or exceed operator cost.
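The quick-review range can be reproduced from the stated review time and reviewer rate. The sketch below is a back-of-envelope check; `review_cost` and `implied_minutes` are hypothetical helper names, and the annotation times it derives are inferred rather than stated in this post:

```python
def review_cost(review_seconds: float, hourly_rate: float) -> float:
    """QA cost in dollars for one demo, given review time and reviewer rate."""
    return review_seconds * hourly_rate / 3600.0


def implied_minutes(cost_per_demo: float, hourly_rate: float) -> float:
    """Review time in minutes implied by a per-demo cost at a given rate."""
    return cost_per_demo / hourly_rate * 60.0


# Bounds from the text: 30-60 s of review at $15-25/hr.
quick_low = review_cost(30, 15.0)
quick_high = review_cost(60, 25.0)
print(f"quick review: ${quick_low:.3f}-${quick_high:.3f} per demo")

# Inferred, not stated in the text: time behind the $0.50-2.50 annotation range.
print(f"annotation: {implied_minutes(0.50, 15.0):.1f}-"
      f"{implied_minutes(2.50, 25.0):.1f} min per demo")
```

Under these assumptions, the detailed-annotation range corresponds to roughly 2 minutes per demo at $15/hr up to roughly 6 minutes at $25/hr.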
Cost Component 3: Hardware Amortization
A $50,000 robot arm (Franka Panda, new) amortized over 3 years costs about $45/day. If a lab runs 2 collection sessions per day at 20 demos per session, that is $45 / 40 = $1.12/demo in hardware amortization. For an OpenArm 101 ($8,000), the figure is $0.18/demo under the same assumptions. Wrist cameras ($200-800), depth cameras ($200-600), and the teleoperation system ($1,000-10,000 for glove + controller) add another $0.05-0.50/demo.
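The amortization arithmetic can be written as a small function, which makes it easy to test other utilization levels. This sketch assumes straight-line amortization over calendar days (365 per year), which is what the $45/day figure implies; the function name and parameters are illustrative. Because the text rounds the daily cost down to $45, the unrounded result for the $50,000 arm comes out a couple of cents higher.

```python
def hardware_cost_per_demo(
    price: float,
    years: float = 3.0,
    sessions_per_day: int = 2,
    demos_per_session: int = 20,
    days_per_year: int = 365,  # assumption: amortize over calendar days
) -> float:
    """Straight-line hardware amortization in dollars per demonstration."""
    daily_cost = price / (years * days_per_year)
    demos_per_day = sessions_per_day * demos_per_session
    return daily_cost / demos_per_day


for name, price in [("$50,000 arm", 50_000), ("$8,000 arm", 8_000)]:
    print(f"{name}: ${hardware_cost_per_demo(price):.2f}/demo")
```

Utilization matters as much as purchase price: if the arm only collects data on about 250 working days per year (`days_per_year=250`), the $50,000 arm rises to about $1.67/demo.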
Cost Component 4: Software and Infrastructure
Cloud storage for raw video + sensor data: roughly $0.02/GB/month. A single 5-minute demonstration with 3 cameras at 30fps 720p generates about 2GB of raw data, or $0.04/month/demo. At a 12-month retention period: $0.48/demo in storage alone. Add compute for policy training: $0.05-0.20/demo amortized over the full training run.
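The storage figure is a product of three numbers, so it scales directly with dataset size and retention period. A minimal sketch, assuming flat per-GB pricing with no tiering, egress, or replication charges; the function names are hypothetical:

```python
def storage_cost_per_demo(
    gb_per_demo: float = 2.0,
    price_per_gb_month: float = 0.02,
    retention_months: int = 12,
) -> float:
    """Storage cost in dollars for keeping one demo for the retention period."""
    return gb_per_demo * price_per_gb_month * retention_months


def dataset_storage_cost(num_demos: int, **kwargs) -> float:
    """Total storage cost in dollars for a dataset of num_demos demos."""
    return num_demos * storage_cost_per_demo(**kwargs)


print(f"per demo: ${storage_cost_per_demo():.2f}")
print(f"500 demos, 12 months: ${dataset_storage_cost(500):.2f}")
```

With the defaults above, a 500-demo dataset costs about $240 to retain for a year; halving the retention period or the per-demo data volume halves that figure.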
Cost Component 5: Task Design and Setup
Task design (creating the task specification, building or sourcing prop objects, calibrating the workspace, writing the success classifier) is a one-time cost per task that amortizes across all demonstrations. A typical L2 task requires 8-16 hours of engineering setup (roughly $200-600 at $25-40/hr). For a 500-demo collection, this adds about $0.40-1.20/demo.
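Because setup is a fixed cost, its per-demo share depends heavily on collection size. The sketch below uses the hour and rate ranges from the paragraph above; the function name and the alternative dataset sizes (100 and 2,000) are illustrative assumptions:

```python
def setup_cost_per_demo(setup_hours: float, hourly_rate: float, num_demos: int) -> float:
    """One-time task design cost in dollars, spread across all demos."""
    if num_demos <= 0:
        raise ValueError("num_demos must be positive")
    return setup_hours * hourly_rate / num_demos


for n in (100, 500, 2000):
    low = setup_cost_per_demo(8, 25.0, n)    # 8 h at $25/hr
    high = setup_cost_per_demo(16, 40.0, n)  # 16 h at $40/hr
    print(f"{n} demos: ${low:.2f}-${high:.2f} per demo")
```

At 500 demos the unrounded upper bound is $1.28, slightly above the rounded figure in the text. At 100 demos the same setup costs $2.00-6.40 per demo, which is why small pilot collections look disproportionately expensive.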
Total Cost Summary
| Cost Component | Simple Task (L2) | Complex Task (L3) |
|---|---|---|
| Operator labor | $2.00–4.00 | $5.00–10.00 |
| QA and annotation | $0.50–1.50 | $1.00–3.00 |
| Hardware amortization | $0.20–1.50 | $0.20–1.50 |
| Storage and compute | $0.50–1.00 | $0.50–1.00 |
| Task design (amortized) | $0.40–1.20 | $0.80–2.00 |
| Total per demo | $3.60–9.20 | $7.50–17.50 |
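The totals are the column sums of the component ranges, and multiplying by the number of demos gives a project budget range. A minimal sketch using the values from the table; the dictionary keys and function names are illustrative:

```python
# (low, high) cost per demo in dollars, copied from the summary table.
COST_RANGES = {
    "simple_L2": {
        "operator_labor": (2.00, 4.00),
        "qa_annotation": (0.50, 1.50),
        "hardware": (0.20, 1.50),
        "storage_compute": (0.50, 1.00),
        "task_design": (0.40, 1.20),
    },
    "complex_L3": {
        "operator_labor": (5.00, 10.00),
        "qa_annotation": (1.00, 3.00),
        "hardware": (0.20, 1.50),
        "storage_compute": (0.50, 1.00),
        "task_design": (0.80, 2.00),
    },
}


def total_per_demo(task: str) -> tuple[float, float]:
    """Sum the low and high ends of every cost component for a task."""
    components = COST_RANGES[task].values()
    return sum(lo for lo, _ in components), sum(hi for _, hi in components)


def project_budget(task: str, num_demos: int) -> tuple[float, float]:
    """Low and high all-in budget in dollars for a collection of num_demos."""
    low, high = total_per_demo(task)
    return low * num_demos, high * num_demos


for task in COST_RANGES:
    low, high = project_budget(task, 500)
    print(f"{task}: ${low:,.0f}-${high:,.0f} for 500 demos")
```

For a 500-demo collection, this gives roughly $1,800-4,600 for the L2 task and $3,750-8,750 for the L3 task.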
How SVRC Reduces Cost
SVRC reduces demonstration cost through three mechanisms: (1) volume pricing, where spreading hardware and task setup across multiple clients lowers the amortized cost per demo; (2) trained operators, who complete tasks 30-50% faster than new operators and with fewer failed attempts, reducing labor cost per successful demonstration; and (3) shared infrastructure, where storage, QA tooling, and annotation pipelines are costs shared across clients rather than per-project overhead.
The result: SVRC data collection costs are typically 3-5× lower than in-house collection for labs without an established teleoperation infrastructure. For details on current pricing, see data services.