The Contact Problem
State-of-the-art RGB-D perception systems achieve 2–5mm localization accuracy on objects in a scene. This sounds impressive until you consider what precision assembly requires: peg-in-hole insertion tolerances are typically ±0.5mm, USB-A connector insertion is ±0.3mm, and SIM card slots require ±0.1mm. The gap between what vision can resolve and what contact tasks require is roughly an order of magnitude, and it is structural rather than incidental: better camera resolution narrows it but does not close it.
The core issue is that contact itself changes the system state in ways that are invisible to a camera. When a gripper finger makes contact with a surface, forces and deformations occur at a sub-millimeter scale. The camera sees a gripper approaching an object; it cannot see whether the contact is stable, slipping, or about to cause the object to rotate out of the grasp.
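To make the gap concrete, the short sketch below divides the quoted vision error range by each quoted tolerance. The numbers come straight from the text above; the function and variable names are illustrative only.

```python
# Vision localization error range quoted in the text, in millimetres.
VISION_ERROR_MM = (2.0, 5.0)

# Insertion tolerances quoted in the text, in millimetres.
TOLERANCES_MM = {
    "peg-in-hole": 0.5,
    "USB-A connector": 0.3,
    "SIM card slot": 0.1,
}


def error_to_tolerance_ratio(error_mm: float, tolerance_mm: float) -> float:
    """How many times larger the sensing error is than the allowed tolerance."""
    if tolerance_mm <= 0:
        raise ValueError("tolerance must be positive")
    return error_mm / tolerance_mm


for task, tol in TOLERANCES_MM.items():
    best = error_to_tolerance_ratio(VISION_ERROR_MM[0], tol)
    worst = error_to_tolerance_ratio(VISION_ERROR_MM[1], tol)
    print(f"{task}: vision error is {best:.1f}x to {worst:.1f}x the tolerance")
```

Even in the best case (2mm error on a ±0.5mm peg-in-hole task) the vision error is about four times the tolerance; for a SIM card slot at the 5mm end of the range it is about fifty times.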
Contact Mechanics Fundamentals
Three concepts from contact mechanics are essential for understanding why force sensing matters:
- Friction Cone: A contact force must lie within the friction cone to avoid slip. The cone half-angle is arctan(μ), where μ is the coefficient of friction. For rubber gripper pads on steel (common in industrial settings), μ ≈ 0.3. For silicone on glass, μ ≈ 0.5. While the contact sticks, the measured force stays inside the cone; a force approaching the cone boundary signals imminent slip, which is information that no camera can provide.
- Force Closure vs. Form Closure: A form closure grasp constrains the object by geometry alone (e.g., gripping a cylinder in a matched V-groove). A force closure grasp constrains the object by applying contact forces from multiple directions such that their friction cones together prevent any rigid body motion. Force sensors let you verify you've achieved force closure; vision cannot.
- Coulomb Friction Model: The standard model for predicting slip: |F_tangential| ≤ μ × F_normal. Measuring F_normal at the contact point lets you compute the maximum tangential force before slip, which is the fundamental constraint in dexterous manipulation.
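The friction cone and Coulomb relations above can be sketched in a few lines of Python. This is a minimal illustration, assuming a single point contact with known μ and force readings already resolved into normal and tangential components; the function names and the `slip_margin` convention are hypothetical, not taken from any sensor vendor's API.

```python
import math


def cone_half_angle(mu: float) -> float:
    """Friction cone half-angle in radians: arctan(mu)."""
    return math.atan(mu)


def max_tangential_force(mu: float, f_normal: float) -> float:
    """Coulomb limit: largest tangential force (N) the contact holds before slip.

    A non-positive normal force means no contact, so the limit is zero.
    """
    return mu * max(f_normal, 0.0)


def slip_margin(mu: float, f_normal: float, f_tangential: float) -> float:
    """Fraction of the friction budget still unused (illustrative convention).

    1.0 -> no tangential load
    0.0 -> force sits on the cone boundary (slip is imminent)
    <0  -> the Coulomb model predicts the contact is slipping
    """
    limit = max_tangential_force(mu, f_normal)
    if limit == 0.0:
        return float("-inf")  # no normal load, so nothing resists sliding
    return 1.0 - abs(f_tangential) / limit


# Example using the mu = 0.3 value quoted above and a 10 N normal force.
mu = 0.3
print(f"half-angle: {math.degrees(cone_half_angle(mu)):.1f} deg")
print(f"max tangential force at 10 N normal: {max_tangential_force(mu, 10.0):.2f} N")
print(f"margin with 1.5 N tangential load: {slip_margin(mu, 10.0, 1.5):.2f}")
```

With μ = 0.3 the cone half-angle is about 16.7 degrees, so a 10 N squeeze supports at most about 3 N of tangential load. A controller could monitor a margin like this and increase grip force before it reaches zero.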
Where Vision Fails in Practice
Three specific failure modes dominate in real manipulation systems:
- Occlusion During Approach: As the gripper approaches an object for grasping, the fingers occlude the contact surface. The camera's view of exactly where contact will occur disappears in the last 2–5cm of approach — the most critical phase for precise placement.
- Pixel Noise at Close Range: Depth sensors lose accuracy below 10–15cm working distance due to structured light interference and IR reflections from the gripper itself. Wrist-mounted cameras at typical manipulation distances therefore work at the edge of their reliable range.
- Depth Shadow on Grasp Surface: The gripper structure casts depth shadows on the grasp target surface, creating missing data in the point cloud precisely where you need the best localization.
Force Sensing Approaches and Tradeoffs
| Approach | What It Measures | Precision | Cost | Best For |
|---|---|---|---|---|
| Wrist 6-axis F/T (e.g., ATI Mini45) | Full contact wrench at wrist | ±0.05N / ±0.5Nmm | $3K–8K | Insertion tasks, assembly, grinding |
| Joint torque estimation | Estimated external torques via dynamics | ±0.5Nm | $0 (software) | Collision detection, rough contact sensing |
| Fingertip F/T (e.g., Bota SensONE) | Force at fingertip | ±0.01N | $2K–5K per finger | Delicate grasping, surface following |
| Tactile array (e.g., XELA uSkin) | Spatial pressure distribution | 3mm spatial resolution | $800–3K | Slip detection, contact localization |
Performance Impact: The Numbers
The performance difference between force-enabled and vision-only manipulation is large and well-documented in robotics literature. For peg-in-hole insertion with 0.5mm clearance: force/torque-guided approaches achieve 89% success rate vs. 62% for vision-only, based on ATI-sponsored benchmarks on a Franka Research 3 arm. The delta is larger for tighter tolerances.
Cable insertion, arguably the most economically important manipulation task for electronics manufacturing, shows a similar gap: 94% success with wrist F/T control vs. 71% vision-only. The difference compounds in production. If failed insertions are retried until they succeed and attempts are independent, a 71% success rate requires about 1.41 attempts per insertion on average; at 94%, it's about 1.06. At 10,000 insertions per day, that is roughly 14,100 attempts versus 10,600, which translates directly to throughput capacity.
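The attempts-per-insertion figures follow from treating each attempt as an independent trial with success probability p, so the expected number of attempts until success is 1/p (the mean of a geometric distribution). The independence assumption is ours; real retries are often correlated, so treat this as a rough sketch rather than a production model.

```python
def expected_attempts(success_rate: float) -> float:
    """Mean attempts until first success, assuming independent retries."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


def daily_attempts(success_rate: float, insertions_per_day: int) -> float:
    """Expected total attempts needed to complete a day's insertions."""
    return insertions_per_day * expected_attempts(success_rate)


INSERTIONS_PER_DAY = 10_000  # figure quoted in the text

for label, rate in (("vision-only", 0.71), ("wrist F/T", 0.94)):
    per_insertion = expected_attempts(rate)
    per_day = daily_attempts(rate, INSERTIONS_PER_DAY)
    print(f"{label}: {per_insertion:.2f} attempts/insertion, {per_day:,.0f} attempts/day")
```

Under these assumptions the vision-only cell spends roughly 3,400 more attempts per day than the force-guided one to complete the same 10,000 insertions.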
For manipulation tasks where tolerances exceed 5mm and compliance is not required, vision-only approaches are often sufficient. But for anything involving constrained fit, sliding contact, or delicate force limits, force sensing is not optional; it is the enabling technology.
SVRC stocks a range of force-enabled manipulation hardware including wrist F/T sensors and tactile arrays. Browse the hardware catalog for current availability and integration support.