Behavioral Regularization

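A family of offline reinforcement learning (RL) methods that constrain the learned policy to stay close to the behavior policy that collected the dataset, which keeps it from exploiting out-of-distribution actions whose value estimates are unreliable. Common variants include explicit policy constraints that add a behavior cloning term to the policy objective (TD3+BC), KL divergence penalties between the learned and behavior policies, and support constraints that only require the learned policy's actions to lie within the support of the data (BEAR). Behavioral regularization is one of the main mechanisms for making offline RL stable.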
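As a rough illustration of the policy-constraint variant, the sketch below computes a TD3+BC-style actor loss for one batch: a value-maximization term plus a mean squared error penalty that pulls the policy's actions toward the dataset actions. The function name `td3_bc_actor_loss`, its arguments, and the default `alpha=2.5` are assumptions made for this example; a real implementation would use a deep learning framework and backpropagate through the critic and the policy.

```python
def td3_bc_actor_loss(q_values, policy_actions, dataset_actions, alpha=2.5):
    """TD3+BC-style actor loss for one batch (illustrative sketch only).

    q_values:        critic estimates Q(s, pi(s)), one float per sample
    policy_actions:  actions proposed by the learned policy, one list per sample
    dataset_actions: actions stored in the offline dataset, same shape
    alpha:           hypothetical default weighting the RL term against the BC term
    """
    n = len(q_values)

    # Normalize the RL term by the average critic magnitude so that alpha
    # keeps a similar meaning across tasks with different reward scales.
    mean_abs_q = sum(abs(q) for q in q_values) / n
    lam = alpha / (mean_abs_q + 1e-8)

    # Behavior cloning penalty: mean squared distance to the dataset actions,
    # averaged over every action dimension of every sample.
    squared_errors = [
        (p - b) ** 2
        for pa, ba in zip(policy_actions, dataset_actions)
        for p, b in zip(pa, ba)
    ]
    bc_penalty = sum(squared_errors) / len(squared_errors)

    # Minimizing this loss raises Q while staying near the data.
    return -lam * (sum(q_values) / n) + bc_penalty


# Same critic values, but the second policy strays further from the data.
q = [1.0, 1.0]
data = [[1.0], [1.0]]
loss_on_data = td3_bc_actor_loss(q, [[1.0], [1.0]], data)
loss_off_data = td3_bc_actor_loss(q, [[0.0], [0.0]], data)
print(loss_on_data < loss_off_data)
```

With the critic values held fixed, the loss grows as the policy's actions move away from the dataset actions, which is the regularizing effect described above.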
