Markov Decision Process

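A mathematical framework for sequential decision making under uncertainty. A Markov Decision Process (MDP) is defined by a tuple (S, A, P, R, γ): a set of states S, a set of actions A, transition probabilities P(s' | s, a), a reward function R, and a discount factor γ between 0 and 1. Reinforcement learning (RL) algorithms learn a policy that maximizes the expected discounted return, the sum of rewards in which the reward received t steps ahead is weighted by γ^t. Robot manipulation and locomotion tasks are commonly modeled as MDPs in which the policy maps states to actions.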
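
As a rough illustration of the tuple above, the sketch below builds a tiny MDP and solves it with value iteration, a standard dynamic-programming method that assumes P and R are known (RL algorithms typically have to estimate them from experience instead). The two states, two actions, and all probabilities and rewards are made up for the example, and names such as `q_value` and `value_iteration` are hypothetical helpers, not part of any library.

```python
# Hypothetical 2-state, 2-action MDP; every number here is illustrative.
STATES = [0, 1]
ACTIONS = [0, 1]
GAMMA = 0.9  # discount factor

# P[s][a] lists the probability of landing in each next state.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.0, 1.0], 1: [0.5, 0.5]},
}
# R[s][a] is the expected immediate reward for taking action a in state s.
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 2.0, 1: 0.0},
}


def q_value(s, a, V):
    """Expected discounted return of taking a in s, then following values V."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in enumerate(P[s][a]))


def value_iteration(tol=1e-10, max_iters=10_000):
    """Repeat the Bellman optimality backup until the values stop changing."""
    V = [0.0 for _ in STATES]
    for _ in range(max_iters):
        V_new = [max(q_value(s, a, V) for a in ACTIONS) for s in STATES]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # The greedy policy maps each state to its highest-value action.
    policy = [max(ACTIONS, key=lambda a: q_value(s, a, V)) for s in STATES]
    return V, policy


V, policy = value_iteration()
print(policy, [round(v, 2) for v in V])  # [1, 0] [18.78, 20.0]
```

In this toy problem the resulting policy moves from state 0 toward state 1 and then stays there, because state 1 pays a reward of 2 on every step and the discounted sum 2 / (1 - 0.9) equals 20.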

