Definition
Reward shaping adds supplementary reward terms to the base sparse reward (e.g., task success/failure) to provide denser learning signals. In robotic manipulation, shaped rewards might include distance to target, progress toward a subgoal, or contact establishment. Good reward shaping dramatically accelerates learning but risks creating reward hacking: behaviors that maximize the shaped reward without achieving the true objective. Potential-based reward shaping, which adds a term of the form F(s, s') = γΦ(s') − Φ(s) for some potential function Φ over states, provably preserves the optimal policy of the original MDP. Automatic reward design using language models is an active research direction.
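The potential-based form can be sketched in a few lines. The snippet below is a minimal illustration, not a production reward function: the reaching task, the target position, and the choice of negative distance-to-target as the potential Φ are all hypothetical assumptions for the example. Because the shaping term telescopes over a trajectory, it rewards net progress toward the target without changing which policy is optimal.

```python
import numpy as np

def shaped_reward(base_reward, potential, s, s_next, gamma=0.99):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).

    Adding this term preserves the optimal policy of the
    original MDP for any choice of potential function Phi.
    """
    return base_reward + gamma * potential(s_next) - potential(s)

# Hypothetical reaching task: the potential is the negative
# Euclidean distance from the end-effector to a fixed target.
target = np.array([0.5, 0.0, 0.3])

def phi(s):
    return -np.linalg.norm(s - target)

# Moving closer to the target yields a positive shaping bonus
# even though the base (sparse) reward is still zero.
s = np.array([0.0, 0.0, 0.0])
s_next = np.array([0.25, 0.0, 0.15])
r = shaped_reward(0.0, phi, s, s_next)
```

A dense distance term added naively (e.g., rewarding proximity at every step without the potential structure) can change the optimal policy and invite hacking, such as hovering near the target; the potential-based form avoids this by construction.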
Why It Matters for Robot Teams
Understanding reward shaping is essential for teams building real-world robot systems. Whether you are collecting demonstration data, training policies in simulation, or deploying in production, the reward terms you choose determine which behaviors the policy actually learns, and a poorly shaped reward can produce policies that score well in training yet fail at the true task.