Cosine Annealing
A learning rate schedule that decreases the learning rate along a half-period cosine curve, from an initial value to a minimum value (often near zero), over the training period. Cosine annealing with warm restarts (SGDR, stochastic gradient descent with warm restarts) periodically resets the learning rate to its initial value, which can help the optimizer escape poor local minima. It is a widely used schedule for training vision transformers and many robot learning models.
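The schedule described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: the function names and the parameters `lr_max`, `lr_min`, `total_steps`, and `cycle_len` are assumptions chosen for the example, and the warm-restart variant assumes fixed-length cycles, whereas SGDR also allows cycles that grow over time.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step`, decayed from lr_max to lr_min along a half cosine.

    Hypothetical helper for illustration only.
    """
    # Clamp so steps outside [0, total_steps] hold the boundary values.
    step = min(max(step, 0), total_steps)
    # Goes from 1.0 at step 0 to 0.0 at step == total_steps.
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine


def cosine_warm_restarts_lr(step, cycle_len, lr_max, lr_min=0.0):
    """Warm-restart variant, assuming every cycle has the same length.

    The learning rate jumps back to lr_max at the start of each cycle.
    """
    return cosine_annealing_lr(step % cycle_len, cycle_len, lr_max, lr_min)


if __name__ == "__main__":
    # Print a few points of a 100-step schedule starting at 0.1.
    for s in (0, 25, 50, 75, 100):
        print(s, cosine_annealing_lr(s, total_steps=100, lr_max=0.1))
```

The decay is slow at the start and end of the period and fastest in the middle, which is the characteristic shape of the cosine curve. In practice the returned value would be assigned to the optimizer's learning rate once per step or per epoch.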