The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

This paper was accepted to the “Has it Trained Yet?” (HITY) workshop at NeurIPS 2022.
The grokking phenomenon, as reported by Power et al., refers to a regime in which a long period of overfitting is followed by a seemingly sudden transition to perfect generalization. In this paper, we attempt to reveal the underpinnings of grokking via a series of empirical studies. Specifically, we uncover an optimization anomaly plaguing adaptive optimizers at extremely late stages of training, referred to as the Slingshot Mechanism. A prominent artifact of the Slingshot Mechanism can be measured by the cyclic…
