Categories: FAANG

The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

This paper was accepted to the “Has it Trained Yet?” (HITY) workshop at NeurIPS 2022.
The grokking phenomenon as reported by Power et al., refers to a regime where a long period of overfitting is followed by a seemingly sudden transition to perfect generalization. In this paper, we attempt to reveal the underpinnings of Grokking via a series of empirical studies. Specifically, we uncover an optimization anomaly plaguing adaptive optimizers at extremely late stages of training, referred to as the Slingshot Mechanism. A prominent artifact of the Slingshot Mechanism can be measured by the cyclic…
AI Generated Robotic Content

Recent Posts

Anima – Sharing Some Prompts and Results

Been experimenting with Anima lately and ended up spending way too much time refining prompts.…

22 hours ago

Keychron K2 HE Concrete Edition Review: Rock-Solid Typing

Keychron's K2 HE Concrete Edition sounds like a cute gimmick, but as I discovered, there's…

23 hours ago

AI generates full battery electrolyte recipes, matching top lithium metal battery performance

Battery electrolytes aren't just one chemical, but a complex mixture of salts, solvents, and additives…

23 hours ago

Nava – A 6.3B audio-video model .

Page: https://ernie-research.github.io/NAVA/ Model: https://huggingface.co/ernie-research/NAVA Github: https://github.com/ernie-research/NAVA NAVA is a 6.3 B-parameter joint audio-video generator that…

2 days ago

Enterprise Business Software and the Mixed-Up Chameleon Problem

Editor’s Note: This blog post was written by Greg Little, Senior Counselor at Palantir, with…

2 days ago

High-Throughput Graph Abstraction at Netflix: Part I

By Oleksii Tkachuk, Kartik Sathyanarayanan, Rajiv ShringiIntroductionNetflix has a diverse range of graph use cases, each…

2 days ago