The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

This paper was accepted to the “Has it Trained Yet?” (HITY) workshop at NeurIPS 2022.
The grokking phenomenon, as reported by Power et al., refers to a regime where a long period of overfitting is followed by a seemingly sudden transition to perfect generalization. In this paper, we attempt to reveal the underpinnings of grokking via a series of empirical studies. Specifically, we uncover an optimization anomaly that affects adaptive optimizers at extremely late stages of training, which we refer to as the Slingshot Mechanism. A prominent artifact of the Slingshot Mechanism can be measured by the cyclic…