Robot Kinematics for ML Engineers: The Essentials

Why ML Engineers Need Kinematics

When you train a robot learning policy, you must choose an action space: what numbers your network outputs and what they mean to the robot. That choice — joint angles vs. Cartesian deltas vs. end-effector poses — directly affects how generalizable your policy is, whether it can cause hardware damage, and how easy the learning problem is.

Even if you never write a kinematics solver, you will spend time debugging why your policy's outputs are causing the robot to move in unexpected ways. Understanding the geometry saves hours of confused staring at RViz.

Forward Kinematics

Forward kinematics (FK) answers: given the robot's joint angles, where is the end-effector?

Formally: input is the joint angle vector q ∈ ℝⁿ (one angle per joint); output is the end-effector pose T ∈ SE(3), a 4×4 homogeneous transform encoding both position (x, y, z) and orientation (3×3 rotation matrix).

FK is computed by multiplying a chain of per-joint transforms, each derived from the Denavit-Hartenberg (DH) parameters of the arm. This is a deterministic, fast, closed-form computation — there is exactly one answer for any given q. Most robot SDKs expose FK as a single function call.

Inverse Kinematics

Inverse kinematics (IK) answers the reverse question: given a desired end-effector pose, what joint angles achieve it?

This is harder for three reasons:

Non-uniqueness: Multiple joint configurations often reach the same Cartesian pose. A 7-DOF arm has infinitely many solutions for most poses.
Nonlinearity: The FK equations are nonlinear (they involve trigonometric functions of joint angles), so IK cannot be solved by simple matrix inversion.
No solution: Some desired poses lie outside the reachable workspace — IK must detect and handle this gracefully.

IK is solved either analytically (closed-form formula derived for specific arm geometries — fast but arm-specific) or numerically (iterative Jacobian-based methods — general purpose but slower and may converge to local minima). ROS2's KDL and IKFast solvers, and the Python ikpy library, handle this for you.

The Jacobian Matrix

The Jacobian matrix J(q) is perhaps the most practically important concept for ML engineers working on robot policies.

It describes how small changes in joint angles translate to small changes in end-effector Cartesian position and orientation:

ẋ = J(q) · q̇

where ẋ ∈ ℝ⁶ is the end-effector velocity (3 translational + 3 rotational) and q̇ ∈ ℝⁿ is the joint velocity vector. The Jacobian has shape 6×n.

In practice, the Jacobian appears in two contexts:

Servo-mode teleoperation: The operator sends a desired Cartesian velocity (e.g., "move 5 mm/s in the +x direction"). The controller uses the pseudoinverse of J to compute the required joint velocities: q̇ = J⁺ · ẋ.
Policy action spaces: Cartesian delta actions are implicitly using the Jacobian — the robot controller converts your (Δx, Δy, Δz) commands to joint motion internally.

Singularities

A singularity is a joint configuration where the Jacobian loses full rank — meaning the robot cannot move in some Cartesian directions regardless of how it moves its joints. Common singularities: fully extended arm (elbow straight), shoulder and wrist axes aligned.

In practice, singularities cause two problems: (1) IK solutions become numerically unstable near singularities, producing large sudden joint motions; (2) Jacobian pseudoinverse produces arbitrarily large joint velocities as the robot approaches singularity.

Rule of thumb: Keep at least 30° away from known singular configurations in your workspace design and policy training environments.

Action Space Choices for Robot Learning

Action Space	Intuitive?	Generalizable?	Needs IK?	Typical Use
Absolute joint angles (q)	No	Low (arm-specific)	No	ALOHA, ACT baseline
Absolute Cartesian pose (x,y,z,R)	Yes	Moderate	Yes	Manipulation research
Cartesian delta (Δx,Δy,Δz,Δ rotation)	Yes	High	Yes (incremental)	Most modern IL policies
Relative to object pose	Yes	Very high	Yes	Generalizable grasping
Waypoints (sparse)	Yes	High	Yes	Long-horizon tasks

For most ML practitioners starting out: use Cartesian delta actions. They are intuitive (the network outputs "move left 2 mm"), they generalize better than absolute joint angles (not tied to a specific arm configuration), and modern robot controllers handle the IK internally. Switch to joint-space actions only when you need maximum speed or are working with arms that have poor IK solvers.

To learn more about arm hardware and how to get started, see the OpenArm documentation.