Why Cost Tracking Matters

Per useful data point, robot demonstration data is the most expensive input in all of machine learning. Yet most teams building robot learning systems do not track the true all-in cost of their demonstrations -- they estimate operator time, ignore overhead, and end up 3-5x over budget. This post builds a complete cost model from first principles.

Understanding cost per demonstration is not just an accounting exercise. It determines whether to build your data collection capability in-house or outsource it, whether to invest in simulation to reduce real-data requirements, how to prioritize data collection across multiple tasks, and when to stop collecting and start deploying. The difference between a $5/demo and a $20/demo cost structure is the difference between a 500-demo dataset costing $2,500 and costing $10,000 -- budget-breaking for academic labs and margin-destroying for startups.

Task Complexity Categories

Not all demonstrations are equal. Task complexity is the single largest determinant of per-demonstration cost. We use a four-level taxonomy based on SVRC's operational experience across hundreds of data collection campaigns:

| Level | Description | Examples | Typical Demo Count Needed |
|---|---|---|---|
| L1: Simple pick-place | Single object, fixed pose, no contact precision | Pick cup from center, place on tray | 50-100 |
| L2: Varied pick-place | Multiple objects, varied poses, moderate precision | Sort 5 object types into bins | 200-500 |
| L3: Contact-rich assembly | Multi-step, force-sensitive, tight tolerances | Peg-in-hole, connector mating, stacking | 500-1000 |
| L4: Dexterous manipulation | Multi-finger, deformable objects, long horizon | Cloth folding, cable routing, in-hand rotation | 1000-5000 |

Cost Component 1: Operator Labor

Operator labor is the largest variable cost. For simple pick-place tasks (L1-L2), an experienced operator can complete a demonstration in 2-4 minutes including setup and reset. For precision assembly (L3), 5-10 minutes per demonstration is typical. For dexterous manipulation (L4), 10-20 minutes.

At a fully-loaded operator cost of $25-45/hr (including employer overhead, benefits, and management):

| Task Type | Demo Duration | Cost at $30/hr | Failure Rate | Effective Cost (incl. failures) |
|---|---|---|---|---|
| L1: Simple pick-place | 2-4 min | $1.00-2.00 | 5-10% | $1.10-2.20 |
| L2: Varied-pose pick-place | 4-8 min | $2.00-4.00 | 10-20% | $2.40-5.00 |
| L3: Contact-rich assembly | 8-15 min | $4.00-7.50 | 15-30% | $5.30-10.70 |
| L4: Dexterous manipulation | 15-25 min | $7.50-12.50 | 25-40% | $12.50-20.80 |

The "Effective Cost" column is critical and often overlooked. Failure rate -- the percentage of attempts that must be discarded because the operator failed the task, the robot faulted, or the recording was corrupted -- is a direct multiplier on labor cost. For L3/L4 tasks, 25-40% failure rates are common even with experienced operators, which means you pay for 130-170 attempts to get 100 successful demonstrations.
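The failure-rate multiplier can be sketched as a small calculation (the function names and example inputs are ours; rates come from the table above):

```python
def effective_cost_per_demo(minutes_per_demo, hourly_rate, failure_rate):
    """Labor cost per *successful* demo, inflated for discarded attempts."""
    base = (minutes_per_demo / 60.0) * hourly_rate
    # Each success requires 1 / (1 - failure_rate) attempts on average.
    return base / (1.0 - failure_rate)

def attempts_needed(successes, failure_rate):
    """Expected attempts to bank a target number of successful demos."""
    return successes / (1.0 - failure_rate)

# L3 assembly: 10 min at $30/hr with a 25% failure rate
print(effective_cost_per_demo(10, 30, 0.25))  # ~6.67 -> $6.67 per usable demo
print(attempts_needed(100, 0.25))             # ~133 attempts for 100 demos
```

At a 40% failure rate the same formula gives roughly 167 attempts, matching the 130-170 range quoted above.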

Operator Learning Curves: Demos 1-10 vs. 50-100 vs. 200+

Operator efficiency is not constant. SVRC tracks operator throughput across thousands of collection sessions, and the learning curve follows a consistent pattern:

  • Demos 1-10 (familiarization): Operators complete tasks at 40-60% of peak throughput. Failure rates are 2-3x the steady-state rate. Demonstrations tend to be slower, less smooth, and more variable in strategy. These early demos are often low quality and may need to be excluded from training sets. Budget 2x the steady-state cost per demo for this phase.
  • Demos 10-50 (skill building): Throughput rises to 70-85% of peak. Failure rates decline to 1.5x steady state. Operators develop consistent strategies and their demonstrations become smoother. Data quality is acceptable for training.
  • Demos 50-100 (proficiency): Operators reach steady-state throughput. Failure rates stabilize. Demonstration quality is consistent and suitable for production datasets.
  • Demos 200+ (expertise): Experienced operators on familiar tasks achieve peak throughput with minimal cognitive load. However, fatigue effects become the dominant performance factor for sessions longer than 45-60 minutes. The quality advantage of expert operators over novices on complex tasks (L3/L4) is substantial: 30-50% faster completion with 40-60% fewer failures.

The practical implication: if you are building an in-house data collection capability, plan for a 2-4 week ramp period before operators reach production efficiency. The demos collected during ramp-up are usable but cost more per successful episode. SVRC's operators are already past this curve, which is a significant component of the cost advantage of outsourced collection.

Batch Efficiency: The Hidden Multiplier

Demo cost drops significantly when collected in larger batches due to fixed overhead amortization:

| Batch Size | Amortized Setup Cost per Demo | Effective Cost Multiplier vs. Batch of 500 |
|---|---|---|
| 20 demos | $15-25 | 3-5x |
| 50 demos | $8-12 | 1.8-2.5x |
| 200 demos | $5-8 | 1.2-1.5x |
| 500+ demos | $4-6 | 1.0x (baseline) |

The fixed costs that drive this curve: task design and specification (4-16 hours of engineering), workspace setup and calibration (2-4 hours), operator training on the specific task (1-4 hours), and QA pipeline configuration (2-4 hours). These costs are identical whether you collect 20 or 500 demonstrations.
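Under the stated setup-hour ranges, the amortization curve falls out of a few lines. A minimal sketch, assuming a $35/hr engineering rate (our midpoint; helper name is ours):

```python
# One-time per-task overhead, in hours (ranges from the text above).
FIXED_HOURS_LO = 4 + 2 + 1 + 2    # design, setup, operator training, QA config
FIXED_HOURS_HI = 16 + 4 + 4 + 4
RATE = 35  # $/hr, an assumed engineering rate

def setup_cost_per_demo(batch_size):
    """Range of one-time setup cost carried by each demo in the batch."""
    return (FIXED_HOURS_LO * RATE / batch_size,
            FIXED_HOURS_HI * RATE / batch_size)

print(setup_cost_per_demo(20))   # (15.75, 49.0) -> overhead dominates small batches
print(setup_cost_per_demo(500))  # (0.63, 1.96)  -> negligible at scale
```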

Cost Component 2: QA and Annotation

Raw demonstrations require QA (checking success, smoothness, completeness) and sometimes annotation (labeling subtask boundaries, object states, grasp types). At minimum, every demonstration requires 30-60 seconds of QA review -- longer for failed episodes that need to be flagged.

At $15-25/hr for a QA reviewer: $0.12-0.42/demo for quick pass-fail review, $0.50-2.50/demo for detailed annotation with subtask labels. For datasets requiring rich annotation (grasp strategy labels, object state tracking, language instruction writing), annotation cost can equal or exceed operator cost. See our annotation challenges guide for detailed annotation cost breakdowns by type.

Automated annotation tools reduce but do not eliminate human review cost. A success classifier (trained on 100+ labeled examples) can auto-label 85-92% of episodes correctly, reducing human QA review to borderline cases. SVRC's automated QA pipeline cuts human review time by approximately 60% for standard manipulation tasks.

Cost Component 3: Hardware Amortization

A $50,000 robot arm amortized over 3 years works out to roughly $45/day. If a lab runs 2 collection sessions per day at 20 demos per session, that is $45 / 40 = $1.12/demo in hardware amortization. For an OpenArm 101 ($4,500), the same assumptions give about $0.10/demo. Wrist cameras ($200-800), depth cameras ($200-600), and the teleoperation system ($1,000-10,000 for glove + controller) add another $0.05-0.50/demo.

| Hardware Item | Purchase Price | Amortization (3yr, 40 demos/day) | Cost/Demo |
|---|---|---|---|
| OpenArm 101 | $4,500 | $4.11/day | $0.10 |
| Franka Emika Panda (used) | $25,000 | $22.83/day | $0.57 |
| Franka Emika Panda (new) | $40,000 | $36.53/day | $0.91 |
| 2x RealSense D435i cameras | $600 | $0.55/day | $0.01 |
| Leader arm (teleop) | $2,000-8,000 | $1.83-7.31/day | $0.05-0.18 |
| F/T sensor (ATI Mini45) | $5,000 | $4.57/day | $0.11 |
| GPU workstation (RTX 4090) | $4,000 | $3.65/day | $0.09 |
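The straight-line amortization behind these figures can be reproduced directly (helper name is ours; defaults match the table's assumptions):

```python
def amortized_cost_per_demo(price, years=3, demos_per_day=40):
    """Straight-line amortization: purchase price spread over daily demo volume.
    Returns (cost per day, cost per demo)."""
    per_day = price / (years * 365)
    return per_day, per_day / demos_per_day

# OpenArm 101: ~($4.11/day, $0.10/demo); new Franka: ~($36.53/day, $0.91/demo)
per_day, per_demo = amortized_cost_per_demo(4500)
```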

Cost Component 4: Software and Infrastructure

Cloud storage for raw video + sensor data runs roughly $0.02/GB/month. A single 5-minute demonstration with 3 cameras at 30fps 720p generates about 2GB of raw data, or $0.04/month/demo. Over a 12-month retention period, that is $0.48/demo in storage alone. Add compute for policy training: $0.05-0.20/demo when the training run's cost is spread across the dataset.
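The storage arithmetic is easy to parameterize. A minimal sketch (helper name and defaults are ours, taken from the figures above):

```python
def storage_cost_per_demo(gb_per_demo=2.0, price_per_gb_month=0.02,
                          retention_months=12):
    """Cloud storage cost attributable to one demo over its retention window."""
    return gb_per_demo * price_per_gb_month * retention_months

print(storage_cost_per_demo())                    # 2 GB, 12 months -> ~$0.48
print(storage_cost_per_demo(retention_months=3))  # shorter retention -> ~$0.12
```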

Additional infrastructure costs that teams often miss: network bandwidth for uploading data to cloud training servers ($0.01-0.05/demo), backup storage ($0.01-0.02/demo), and the engineering time to maintain the recording, storage, and retrieval pipeline. If you are using the SVRC data platform, these infrastructure costs are bundled into the service.

Cost Component 5: Task Design and Setup

Task design (creating the task specification, building or sourcing prop objects, calibrating the workspace, writing the success classifier) is a one-time cost per task that amortizes across all demonstrations. A typical L2 task requires 8-16 hours of engineering setup ($200-600 at $25-40/hr). For a 500-demo collection, this adds $0.40-1.20/demo.

Complex tasks (L3/L4) require proportionally more design effort: specifying force thresholds, designing fixtures, sourcing multiple object variants for diversity, and iterating on the task specification to achieve a consistent success definition. Budget 20-40 hours ($500-1,600) for L3 task design and 40-80 hours ($1,000-3,200) for L4.

Total Cost Summary

| Cost Component | Simple Task (L2, 500 demos) | Complex Task (L3, 500 demos) |
|---|---|---|
| Operator labor (incl. failures) | $2.40-5.00 | $5.30-10.70 |
| QA and annotation | $0.50-1.50 | $1.00-3.00 |
| Hardware amortization | $0.20-1.50 | $0.30-1.80 |
| Storage and compute | $0.50-1.00 | $0.50-1.00 |
| Task design (amortized) | $0.40-1.20 | $1.00-3.20 |
| Total per demo | $4.00-10.20 | $8.10-19.70 |
| Total dataset cost (500 demos) | $2,000-5,100 | $4,050-9,850 |
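As a sanity check, the totals follow from summing the component ranges. The values below are the L2 column of the summary table; the dict keys and helper name are ours:

```python
L2_PER_DEMO = {  # (low, high) $ per demo, from the summary table above
    "operator_labor":  (2.40, 5.00),
    "qa_annotation":   (0.50, 1.50),
    "hardware":        (0.20, 1.50),
    "storage_compute": (0.50, 1.00),
    "task_design":     (0.40, 1.20),
}

def dataset_cost(components, n_demos):
    """Total dataset cost range given per-demo component (low, high) ranges."""
    lo = sum(v[0] for v in components.values()) * n_demos
    hi = sum(v[1] for v in components.values()) * n_demos
    return lo, hi

lo, hi = dataset_cost(L2_PER_DEMO, 500)  # ~($2,000, $5,100)
```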

Cost Component 6: Compute for Training and Evaluation

Training compute is often treated as negligible compared to data collection, and for most practical robot learning projects this is correct. However, at scale or with larger models, compute costs become material and should be tracked.

| Training Scenario | GPU Required | Time | Cloud Cost (on-demand) | Cost/Demo (500-demo dataset) |
|---|---|---|---|---|
| ACT, single task | 1x RTX 3090/4090 | 2-4 hrs | $5-10 | $0.01-0.02 |
| Diffusion Policy, single task | 1x RTX 3090/4090 | 4-8 hrs | $10-20 | $0.02-0.04 |
| Octo fine-tune | 1x A100 80GB | 6-12 hrs | $20-40 | $0.04-0.08 |
| OpenVLA fine-tune (7B) | 4x A100 80GB | 12-48 hrs | $200-800 | $0.40-1.60 |
| Hyperparameter sweep (10 runs) | 1x RTX 4090 | 20-40 hrs | $50-100 | $0.10-0.20 |

The data-to-compute cost ratio reinforces the strategic point: for ACT and Diffusion Policy, compute is < 1% of total project cost. Even for VLA fine-tuning, compute is 5-15% of total cost. The overwhelming majority of your budget should be allocated to data collection quality and diversity, not to larger models or more training iterations. See our scaling laws analysis for the performance implications of this cost structure.

Storage and Data Management Costs

Robot demonstration data generates significant storage volume. Each episode contains multi-camera video (3-5 cameras at 640x480, 30fps), joint state (14-30 DOF at 50Hz), and optional F/T sensor readings (6D at 50Hz). The storage cost per episode and per campaign scales as follows:

| Data Component | Per Episode (30s) | 500 Episodes | Monthly Cloud Storage |
|---|---|---|---|
| 2 cameras (H.264 compressed) | 30-60 MB | 15-30 GB | $0.35-0.70 |
| 4 cameras (H.264 compressed) | 60-120 MB | 30-60 GB | $0.70-1.40 |
| Raw images (uncompressed, for training) | 200-400 MB | 100-200 GB | $2.30-4.60 |
| Proprioception + F/T | 0.5-2 MB | 0.25-1 GB | $0.01-0.02 |
| Total (4 cameras + state) | 60-122 MB | 30-61 GB | $0.70-1.42 |

Cloud storage costs (at $0.023/GB/month for S3 Standard) are negligible for most projects. The real storage cost is data transfer bandwidth during training: downloading 60 GB from cloud storage for every training run can take 30-60 minutes on a typical internet connection. For teams running frequent hyperparameter sweeps (10+ training runs per iteration), consider local NVMe storage ($100-200 for 2TB) to eliminate this bottleneck.

SVRC's data platform stores all collected datasets on fast-access cloud storage with CDN caching. Datasets can be downloaded in HDF5 or RLDS format, and large datasets (>100 GB) are available for direct-attached access via our training cluster to avoid download overhead entirely.

SVRC vs. DIY: Detailed Cost Comparison

To make the build-vs-buy decision concrete, here is a side-by-side cost comparison for a typical 500-demo L2 task dataset.

| Cost Item | DIY (first project) | DIY (nth project) | SVRC Service |
|---|---|---|---|
| Hardware purchase | $8,000-55,000 | $0 (already owned) | $0 (included) |
| Pipeline development | $25,000-65,000 | $0 (already built) | $0 (included) |
| Operator training | $2,000-5,000 | $500-1,000 | $0 (trained operators) |
| Task design + calibration | $400-1,200 | $400-1,200 | Included in campaign |
| Data collection (500 demos) | $2,500-5,000 | $2,000-4,000 | $6,000-8,000 |
| QA and annotation | $500-1,500 | $500-1,500 | Included |
| Time to first data | 8-16 weeks | 1-2 weeks | 2-4 weeks |
| Total project cost | $38,400-132,700 | $3,400-7,700 | $6,000-8,000 |

The economics are clear: for a first project, outsourcing saves $30,000-125,000 and 6-14 weeks. For subsequent projects where the DIY infrastructure already exists, in-house collection becomes cost-competitive. The breakeven is typically at 3-5 data collection campaigns. SVRC's $2,500 pilot is designed specifically for teams validating whether imitation learning works for their task before committing to infrastructure investment.
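The breakeven logic can be made explicit. The inputs below are illustrative midpoints we chose, not the table's exact figures (which span wide ranges), and the helper name is ours:

```python
def diy_breakeven_campaigns(fixed_setup, diy_per_campaign, service_per_campaign):
    """Campaigns after which cumulative DIY cost undercuts outsourcing.
    Only finite when the service costs more per campaign than marginal DIY."""
    savings = service_per_campaign - diy_per_campaign
    if savings <= 0:
        return float("inf")  # the DIY setup cost is never recovered
    return fixed_setup / savings

# Illustrative: $8k one-time setup, $5k marginal DIY campaign, $7k service
print(diy_breakeven_campaigns(8_000, 5_000, 7_000))  # 4.0 campaigns
```

With these midpoints the crossover lands in the 3-5 campaign range cited above; cheaper hardware or pricier service shifts it earlier.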

Reducing Cost: Practical Strategies

Based on SVRC's experience across hundreds of campaigns, these strategies have the highest impact on reducing per-demo cost:

  • Use foundation models to reduce demo requirements (30-50% cost reduction). Fine-tuning OpenVLA or Octo requires 100-200 demos to reach the same performance as training ACT from scratch on 500 demos. At $5-10/demo, using a foundation model saves $1,500-3,000 in data collection cost -- more than offsetting the higher compute cost of VLA fine-tuning.
  • Invest in operator training before scaling collection (20-30% cost reduction). A single day of structured operator training on your specific task reduces per-demo time by 25-35% and failure rate by 30-50%. This front-loaded investment pays for itself within the first 50 demos.
  • Automate QA with a trained success classifier (15-25% cost reduction). Train a binary classifier on 100+ labeled episodes to auto-filter obvious failures. Human reviewers then only examine borderline cases (typically 10-20% of episodes). SVRC provides pre-trained success classifiers for common task types.
  • Batch similar tasks (10-20% cost reduction). Collecting data for 5 related tasks in a single campaign (e.g., pick 5 different objects) shares the setup, calibration, and operator training costs across tasks, reducing per-task overhead by 60-80%.
  • Use data augmentation to amplify real demos (variable savings). Color jitter, random crop, and spatial augmentation effectively multiply your dataset diversity without collecting additional real data. These augmentations are free and should always be applied. Expected impact: equivalent to collecting 20-40% more real data.

Build vs. Buy: When to Outsource Data Collection

The decision to build in-house data collection capability vs. outsourcing to a service like SVRC depends on your timeline, scale, and team composition.

Build in-house when: you plan to collect data continuously for 6+ months, your tasks require deep domain expertise that is hard to transfer, you have existing robotics engineering staff, or your data involves proprietary objects/processes that cannot leave your facility. In-house becomes cost-effective at approximately 2,000+ demos per year when the fixed costs (hardware, operator training, pipeline development) are spread across enough volume.

Outsource when: you need data in the next 2-8 weeks (SVRC typical turnaround: 2-4 weeks for a 500-demo campaign), you are collecting for the first time and do not want to invest in infrastructure before validating your approach, your team is ML/research-focused without teleop operations experience, or you need to scale up quickly for a specific project. SVRC's data collection service starts at $2,500 for a pilot campaign (50-100 demos) and $8,000 for a full campaign (200-500 demos).

Hybrid model (most popular): Many SVRC clients start with outsourced data collection for their first 1-2 tasks, then transition to in-house collection for subsequent tasks using the protocols and tooling established during the outsourced phase. SVRC supports this transition explicitly: all collection protocols, QA rubrics, operator training materials, and pipeline configurations are delivered to the client as part of every engagement. This "teach a team to fish" approach means the outsourced engagement has lasting value beyond the immediate dataset deliverable.

The hidden cost of building in-house: most teams underestimate the engineering time to build a reliable recording and QA pipeline. Count on 2-4 person-months of software engineering to build data recording, synchronization, storage, QA review UI, and export tooling from scratch. At $150-200K/yr fully-loaded engineering cost, that is $25,000-65,000 in pipeline development before collecting a single demonstration.

Amortization Across Multiple Tasks: The Economics of Shared Infrastructure

One of the least appreciated cost reduction strategies is amortizing infrastructure across multiple tasks collected on the same hardware. When a team collects data for 10 tasks on the same robot over 3 months, the per-task cost drops dramatically compared to 10 separate single-task projects.

| Fixed Cost Item | 1 Task (500 demos) | 5 Tasks (2,500 demos total) | 10 Tasks (5,000 demos total) |
|---|---|---|---|
| Hardware amortization per demo | $0.80-1.50 | $0.16-0.30 | $0.08-0.15 |
| Pipeline development per demo | $5.00-13.00 | $1.00-2.60 | $0.50-1.30 |
| Operator training per demo | $0.40-1.00 | $0.16-0.40 | $0.08-0.20 |
| Total fixed cost per demo | $6.20-15.50 | $1.32-3.30 | $0.66-1.65 |

The 10-task scenario reduces fixed costs per demo by nearly 10x compared to the single-task scenario. Combined with variable cost improvements from operator expertise (they get faster across tasks), the total cost per demo in a mature multi-task collection operation can be as low as $3-6 for L2 tasks -- approaching the theoretical floor set by operator labor alone.
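The amortization reduces to one division. A simplified sketch (helper name is ours; it fully amortizes everything, whereas the table keeps operator training partly per-task, hence its slightly higher $0.66 floor):

```python
def fixed_cost_per_demo(total_fixed, n_tasks, demos_per_task=500):
    """Spread one-time infrastructure cost across every demo collected on it."""
    return total_fixed / (n_tasks * demos_per_task)

# ~$3,100 total fixed cost (low end of the table's 1-task column x 500 demos)
print(fixed_cost_per_demo(3_100, 1))   # 6.2  -> $6.20/demo for one task
print(fixed_cost_per_demo(3_100, 10))  # 0.62 -> $0.62/demo across ten tasks
```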

This amortization effect is the economic reason why centralized data collection services exist. SVRC amortizes its hardware, pipeline, and operator training costs across dozens of concurrent client projects, achieving per-demo costs that no single lab can match on its own for the first 2-3 projects.

Cost Modeling for Foundation Model Fine-Tuning

The rise of foundation models (Octo, OpenVLA) changes the cost equation for data collection. Because foundation models require fewer task-specific demonstrations, the per-project data collection budget shrinks, but the compute and infrastructure costs shift. Here is how the total cost changes.

| Approach | Demos Needed | Data Cost | Compute Cost | Total Cost | Target Success Rate |
|---|---|---|---|---|---|
| ACT from scratch (L2 task) | 400-500 | $2,400-5,000 | $5-15 | $2,405-5,015 | 88-93% |
| Octo fine-tune (L2 task) | 150-200 | $900-2,000 | $20-40 | $920-2,040 | 86-92% |
| OpenVLA fine-tune (L2 task) | 100-200 | $600-2,000 | $200-800 | $800-2,800 | 88-95% |
| ACT from scratch (L3 task) | 800-1000 | $6,400-20,000 | $10-30 | $6,410-20,030 | 80-88% |

The key insight: foundation model fine-tuning reduces total project cost by 40-60%, primarily by cutting the demo count, not the per-demo cost. The increase in compute (from $5-15 for ACT to $200-800 for OpenVLA) is small compared to the data collection savings ($1,500-3,000 less in operator time). This is why SVRC recommends foundation model fine-tuning as the default approach for new projects: it produces better results at lower total cost.

ROI Calculator: When Does Robot Learning Pay for Itself?

For teams building a business case, the return on investment from robot learning depends on what the robot replaces and how reliably it performs the task.

| Deployment Scenario | Data Collection Investment | Manual Labor Replaced (per year) | Payback Period |
|---|---|---|---|
| Simple bin picking, 1 shift/day | $3,000-5,000 | $25,000-40,000 | 2-3 months |
| Multi-SKU kitting, 2 shifts/day | $8,000-15,000 | $50,000-80,000 | 2-4 months |
| Precision assembly, 1 shift | $15,000-30,000 | $35,000-55,000 | 5-10 months |
| Lab sample handling, 24/7 | $5,000-10,000 | $80,000-120,000 | 1-2 months |

These estimates assume the robot achieves 90%+ success rate on the deployment task (requiring human intervention for the remaining 10%). For high-volume, repetitive tasks (bin picking, sample handling), the ROI is compelling even at modest success rates because the volume of manual labor replaced is large. For low-volume, high-precision tasks (custom assembly), the ROI depends more on whether the robot can handle the task at all -- the binary threshold matters more than marginal improvement in success rate.
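The payback arithmetic, with the 90% success-rate assumption made explicit (helper name and the example figures are ours, drawn from the bin-picking row's ranges):

```python
def payback_months(investment, annual_labor_replaced, success_rate=0.90):
    """Months to recoup the data collection investment, crediting only the
    share of work the robot handles autonomously."""
    monthly_savings = annual_labor_replaced * success_rate / 12.0
    return investment / monthly_savings

# Simple bin picking: $4k invested against $30k/yr of labor replaced
print(round(payback_months(4_000, 30_000), 1))  # 1.8 months
```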

Seasonal and Project-Based Cost Optimization

Data collection costs are not constant over time. Understanding the temporal dynamics allows teams to optimize when and how they collect data.

  • Batch large collections during off-peak periods. SVRC offers lower rates for scheduled collection during weeks without competing client campaigns (typically January-February and July-August). Planning collection 4-6 weeks ahead can reduce costs by 10-15%.
  • Front-load diversity, back-load volume. The first 100 demos should maximize diversity (many objects, many positions, varied lighting). The next 100-400 can focus on filling gaps identified during initial policy training. This progressive approach typically reaches deployment-grade performance 20-30% faster than uniform collection, saving calendar time and reducing total demos needed.
  • Amortize task design across similar tasks. If you have 5 pick-and-place variants (different objects on the same workspace), design the task specification once and collect all 5 variants in a single campaign. The task design cost ($200-600 per task) drops to $40-120 per task when shared across 5 variants.
  • Reuse workspace setups for diversity. Moving objects to new positions between demos is free; moving the camera or changing the table takes 15-30 minutes. Design your diversity protocol to vary what is cheap to vary (object positions, object instances) more frequently than what is expensive (camera angles, lighting, workspace layout). A well-designed protocol achieves D>200 diversity score without any workspace reconfiguration.

Hidden Costs Most Teams Miss

Beyond the direct cost components, several indirect costs frequently surprise teams building their first data collection operation.

  • Object procurement ($200-2,000 per task). Diverse training data requires diverse objects. A pick-and-place task with 20 different cup instances requires sourcing 20 different cups that span the geometric, material, and visual variety you want the policy to handle. For specialized domains (medical devices, electronics components), procurement costs can exceed $1,000.
  • Workspace reset time (10-30% of total session time). Between demonstrations, the workspace must be reset to a valid starting configuration. For simple tasks, reset takes 5-10 seconds. For multi-object scenes, reset takes 30-60 seconds. This "dead time" between demos is pure overhead that operators cannot avoid. Factor it into throughput estimates.
  • Data quality iteration (2-5 collection-training cycles per task). Rarely does the first batch of collected data produce an acceptable policy. The typical workflow involves: collect 100 demos, train, evaluate, identify failure modes, refine collection protocol, collect 100-200 more targeted demos, retrain. Budget 2-5 iterations, each adding variable cost and 1-2 weeks of calendar time.
  • Hardware downtime (5-15% of planned collection time). Servo overheating, camera disconnections, USB bus resets, and gripper calibration drift cause unplanned downtime. For ALOHA-class systems, budget 10-15% of planned collection time for troubleshooting. For industrial arms (Franka, UR), budget 5-8%. This downtime is not free -- operators are idle or debugging, and the facility is occupied.

Cost Optimization Strategies: Getting More Value per Dollar

Regardless of whether you collect data in-house or through a service, these strategies reduce the effective cost per useful demonstration.

1. Batch by task similarity. Schedule collection of related tasks in the same session. Picking mugs and picking bottles use similar workspace setups and operator skills. Switching between related tasks within a session avoids the workspace reconfiguration overhead that can consume 30-60 minutes between task types. SVRC groups client tasks by workspace similarity to minimize changeover time.

2. Use progressive collection. Do not commit to 500 demonstrations on day one. Collect 50, train, evaluate. If the policy reaches 80% success, the remaining 450 demos may not be needed. The scaling laws evidence shows diminishing returns past 200-300 demos for single-task BC. Collect in batches of 50-100 and evaluate between batches.

3. Maximize diversity per demo. Each demonstration should cover a unique combination of object position, orientation, and lighting. Operators who vary the setup systematically between demos produce datasets where 200 diverse demos outperform 1,000 homogeneous demos. Add diversity checklists to the operator protocol: "move object to position 3, rotate 90 degrees, adjust lamp to position B."

4. Invest in operator training. A 4-hour operator training session (cost: $200-400) reduces the failure rate by 30-50% and increases throughput by 20-30%. This pays for itself within the first 200 demos collected. Training should cover: smooth teleoperation technique, consistent task completion strategy, systematic diversity variation, and quality self-assessment.

5. Automate QA where possible. Automated success/failure classifiers (see annotation challenges) can pre-filter episodes, reducing the human annotation burden by 50-70%. At $2-5 per manually annotated episode, this saves $1-3 per demo in annotation cost.

Total Cost of Ownership: DIY vs. SVRC Comparison

| Cost Category | DIY (First Project) | DIY (Fifth Project) | SVRC Service |
|---|---|---|---|
| Hardware (1 arm + teleop) | $5,000-25,000 | $0 (already owned) | Included |
| Pipeline setup (software) | $2,000-6,000 | $200-500 | Included |
| 500 demos (L2 task) | $5,000-10,000 | $3,000-5,000 | $4,000-6,000 |
| Annotation + QA | $1,000-2,500 | $500-1,000 | Included |
| Calendar time to data | 6-12 weeks | 2-4 weeks | 1-3 weeks |
| Total project cost | $13,000-43,500 | $3,700-6,500 | $4,000-6,000 |

Time value of money: The calendar time difference is often more important than the dollar difference. A team that gets data in 2 weeks (via SVRC) rather than 12 weeks (DIY first project) can start training policies 10 weeks sooner. For startups and funded research projects where time-to-result directly affects funding decisions, the faster path has value beyond the direct cost comparison.

The crossover point: for your first 1-3 projects, SVRC is 2-7x cheaper than DIY because you avoid the upfront hardware and pipeline costs. By your 5th project, DIY costs approach SVRC rates. The ongoing advantage of SVRC at scale is operator expertise (faster collection, fewer failures) and infrastructure maintenance (camera calibration, software updates, hardware repairs are our responsibility, not yours).

Data Collection Infrastructure: Build vs. Buy Decision Framework

The build-vs-buy decision for data collection infrastructure depends on three factors: expected project volume (number of tasks over the next 12 months), team robotics expertise, and timeline pressure. This decision tree reflects the patterns we see across SVRC clients.

| Scenario | Expected Tasks (12 mo) | Team Background | Recommended Approach | Expected Total Cost |
|---|---|---|---|---|
| First pilot project | 1 | ML/AI, not robotics | Full outsource to SVRC | $2,500-8,000 |
| Growing research lab | 3-5 | Robotics PhD students | Hybrid: buy hardware, outsource first task, DIY subsequent | $15,000-30,000 |
| Production deployment | 5-10 | Robotics engineering team | Build in-house with SVRC consulting for pipeline setup | $30,000-60,000 |
| Enterprise scale | 10+ | Dedicated robotics team | Build full in-house capability, SVRC as overflow capacity | $60,000-150,000 |

The hybrid model (outsource the first task, build in-house for subsequent tasks) is the most common pattern among SVRC clients. The first outsourced project serves a dual purpose: it produces immediately usable training data, and it provides a reference implementation of the collection protocol, QA pipeline, and annotation workflow that the team can replicate internally for future projects.

Cost per Demo by Robot Platform

The robot hardware significantly affects collection throughput and therefore per-demo cost. Faster, more reliable hardware reduces operator time per demo. Here are typical throughput rates and resulting cost estimates across common platforms.

| Platform | Demos/Hour (L2) | Failure Rate | Effective Cost/Demo | Notes |
|---|---|---|---|---|
| ALOHA (bimanual teleop) | 15-25 | 15-25% | $5-10 | Great for bimanual tasks; servo overheating limits long sessions |
| OpenArm 101 (SVRC leader/follower) | 20-35 | 10-15% | $4-8 | High throughput single-arm; $4,500 hardware cost |
| Franka Research 3 (VR teleop) | 10-18 | 8-12% | $8-15 | High precision; slower teleop due to VR latency |
| UR5e (SpaceMouse teleop) | 12-20 | 5-10% | $6-12 | Very reliable; SpaceMouse learning curve adds startup cost |
| DK1 bimanual (SVRC) | 12-20 | 12-18% | $7-12 | Bimanual with higher payload than ALOHA; good for dual-arm assembly |

The OpenArm 101 achieves the lowest per-demo cost for single-arm tasks because of its high throughput leader-follower teleop interface and low hardware cost ($4,500). For bimanual tasks, ALOHA remains the most cost-effective platform despite servo reliability issues. SVRC maintains multiple platforms and routes each client project to the platform that minimizes per-demo cost for their specific task type.

Long-Term Data Asset Management

Robot demonstration data is a depreciating asset. Its value decreases over time as hardware changes, task specifications evolve, and better collection protocols produce higher-quality data. Effective data asset management extends the useful life of collected datasets.

  • Version all datasets with collection metadata. Every dataset should record: robot model and firmware version, camera model and calibration parameters, gripper type and configuration, operator ID, collection date, and protocol version. When a future policy fails, this metadata allows you to identify whether the training data is stale or incompatible.
  • Re-evaluate existing datasets quarterly. Train a fresh policy on your oldest dataset and evaluate on current hardware. If success rate has dropped more than 10% from the original evaluation, the dataset needs refreshing. Common causes: camera calibration drift, gripper wear changing grasp dynamics, or workspace layout changes.
  • Maintain a "golden set" of 50-100 high-quality demonstrations per task. These are your reference demonstrations: perfectly executed, diverse, and thoroughly annotated. Use them as the seed for every training run and as the evaluation benchmark for data quality. New collection batches should match or exceed the golden set's quality metrics.
  • Budget 10-20% of annual data costs for maintenance. Plan for periodic recollection to refresh datasets that have degraded, fill diversity gaps identified during deployment, and adapt to new object variants or task modifications. This maintenance budget is cheaper than full recollection and keeps your data assets production-ready.

How SVRC Reduces Cost

SVRC reduces demonstration cost through three mechanisms: (1) volume pricing -- sharing infrastructure across multiple clients reduces hardware and setup amortization, (2) trained operators -- SVRC operators complete tasks 30-50% faster than new operators with fewer failed attempts, reducing labor cost per successful demonstration, (3) shared infrastructure -- storage, QA tooling, and annotation pipelines are shared costs across clients rather than per-project overhead.

The result: SVRC data collection costs are typically 2-4x lower than in-house collection for labs without an established teleoperation infrastructure. For details on current pricing, see data services.

International Data Collection: Cost Considerations for Distributed Teams

Teams increasingly operate with distributed workforces, where robot hardware is in one location and operators or engineers are in others. This introduces specific cost factors.

  • Remote operator labor arbitrage. Skilled teleoperation operators in the US cost $25-45/hour. In regions with lower cost of living (Eastern Europe, Southeast Asia, India), equally skilled operators cost $8-20/hour. However, remote operators face higher latency (50-200ms additional RTT), which reduces throughput by 15-30% and increases the failure rate on precision tasks. The net cost savings are 20-40% for L1/L2 tasks (where latency impact is small) and near-zero for L3/L4 tasks (where latency severely degrades quality).
  • Shipping hardware internationally. Sending a complete ALOHA or OpenArm setup internationally costs $500-2,000 in shipping and takes 1-3 weeks for customs clearance. Robot arms containing lithium batteries or servos may require hazardous goods declarations. Consider SVRC's hardware leasing program as an alternative to international shipping for short-term projects.
  • Time zone advantages. With operators in 3 time zones (US Pacific, European, Asian), a single robot station can operate 18-24 hours per day rather than 8 hours. This triples data collection throughput without additional hardware. SVRC uses distributed operators for large-volume campaigns to compress collection timelines from weeks to days.
  • Data transfer and storage. A 500-episode dataset (30-60 GB) transfers in 2-4 hours on a 100 Mbps connection. For teams with slower internet or data sovereignty requirements, physical drive shipping (overnight express with an encrypted NVMe drive) is faster and more secure for datasets above 200 GB.

Related Reading

Imitation Learning Guide · Operator Fatigue and Ergonomics · Annotation Challenges · Scaling Laws · LeRobot Getting Started · Data Services