Policy (robot)

In robot learning, a policy (denoted π) is a function that maps observations to actions: π(o) → a. The policy is the learned "brain" of the robot that determines what to do at every timestep given what it perceives. Policies can be represented as neural networks (neural policies), decision trees, Gaussian processes, or lookup tables. They can be deterministic (one action per observation) or stochastic (a distribution over actions). Policy quality is measured by task success rate across diverse conditions, not just on training demonstrations. The core challenge of robot learning is training policies that generalize reliably beyond their training distribution.
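The mapping π(o) → a can be made concrete with a small sketch. This is an illustrative stand-in, not a reference implementation: the observation/action dimensions, class names, and random (untrained) weights are all assumptions chosen for the example. It shows a deterministic neural policy and a stochastic Gaussian variant built on the same network.

```python
import numpy as np

rng = np.random.default_rng(0)

class MLPPolicy:
    """Minimal deterministic neural policy pi(o) -> a.

    A one-hidden-layer MLP; weights here are random stand-ins,
    whereas a real policy's weights are learned from data."""
    def __init__(self, obs_dim, act_dim, hidden=32):
        self.W1 = rng.normal(0.0, 0.1, (obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, act_dim))
        self.b2 = np.zeros(act_dim)

    def __call__(self, obs):
        h = np.tanh(obs @ self.W1 + self.b1)
        return h @ self.W2 + self.b2  # one action per observation

class GaussianPolicy(MLPPolicy):
    """Stochastic variant: the MLP output is the mean of a Gaussian
    distribution over actions, sampled at every call."""
    def __call__(self, obs, std=0.1):
        mean = super().__call__(obs)
        return rng.normal(mean, std)

# Hypothetical 4-D observation (e.g. joint angles and velocities)
# mapped to a 2-D action (e.g. wheel velocities).
policy = MLPPolicy(obs_dim=4, act_dim=2)
obs = np.array([0.1, -0.2, 0.05, 0.0])
action = policy(obs)
print(action.shape)  # (2,)
```

Calling a deterministic policy twice on the same observation returns the same action; the Gaussian variant returns a different sample each time, which is useful for exploration during training.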
Core Concept · Deep Learning
