Glossary

Stochastic Policy

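A policy that outputs a probability distribution over actions rather than a single deterministic action, which enables exploration and lets the policy represent aleatoric uncertainty. Gaussian policies, parameterized by a mean and a diagonal covariance, are the standard choice for continuous actions in policy gradient RL. Diffusion policies and normalizing flow policies extend this to more complex action distributions, such as multimodal ones. Stochastic policies are necessary for maximum entropy RL, for example Soft Actor-Critic (SAC), because the objective includes the entropy of the policy's action distribution.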
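As a rough illustration of the Gaussian case, the sketch below implements a diagonal Gaussian policy in NumPy. The class name `DiagonalGaussianPolicy`, the linear mean, and the state-independent log standard deviation are assumptions made for brevity; practical implementations typically use a neural network for the mean and learn the parameters by gradient ascent.

```python
import numpy as np


class DiagonalGaussianPolicy:
    """Hypothetical minimal policy: linear mean, state-independent diagonal std."""

    def __init__(self, obs_dim, act_dim, seed=0):
        self.rng = np.random.default_rng(seed)
        # Small random weights for the linear mean (assumption: no hidden layers).
        self.W = 0.1 * self.rng.standard_normal((act_dim, obs_dim))
        self.b = np.zeros(act_dim)
        # log_std = 0 gives a unit standard deviation in every action dimension.
        self.log_std = np.zeros(act_dim)

    def mean(self, obs):
        return self.W @ obs + self.b

    def sample(self, obs):
        # Draw an action: mean plus per-dimension scaled Gaussian noise.
        mu = self.mean(obs)
        return mu + np.exp(self.log_std) * self.rng.standard_normal(mu.shape)

    def log_prob(self, obs, action):
        # Log density of a diagonal Gaussian, summed over action dimensions.
        mu = self.mean(obs)
        z = (action - mu) / np.exp(self.log_std)
        k = mu.shape[0]
        return float(-0.5 * np.sum(z ** 2) - np.sum(self.log_std)
                     - 0.5 * k * np.log(2.0 * np.pi))

    def entropy(self):
        # Closed-form entropy; the quantity a maximum entropy objective rewards.
        k = self.log_std.shape[0]
        return float(np.sum(self.log_std) + 0.5 * k * (1.0 + np.log(2.0 * np.pi)))


policy = DiagonalGaussianPolicy(obs_dim=3, act_dim=2)
obs = np.ones(3)
action = policy.sample(obs)           # differs from call to call
logp = policy.log_prob(obs, action)   # used in policy gradient estimators
```

Calling `sample` twice on the same observation returns different actions, which is the source of exploration; `log_prob` is the term a policy gradient estimator differentiates, and `entropy` is the kind of bonus that maximum entropy methods add to the reward.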


Robot Learning, RL, Policy
