Glossary

DDPG

Deep Deterministic Policy Gradient — an off-policy RL algorithm for continuous action spaces that uses a deterministic policy and a Q-function, trained via the deterministic policy gradient theorem. DDPG uses experience replay and target networks for stability. It has been widely applied to robot control tasks but can be sensitive to hyperparameters. TD3 (Twin Delayed DDPG) addresses its overestimation bias.

See this in practice: our real-world evals →

Explore More Terms

Browse the full robotics glossary.

Back to Glossary