The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon

This paper was accepted to the “Has it Trained Yet?” (HITY) workshop at NeurIPS 2022.
The grokking phenomenon, as reported by Power et al., refers to a regime in which a long period of overfitting is followed by a seemingly sudden transition to perfect generalization. In this paper, we attempt to reveal the underpinnings of grokking via a series of empirical studies. Specifically, we uncover an optimization anomaly plaguing adaptive optimizers at extremely late stages of training, referred to as the Slingshot Mechanism. A prominent artifact of the Slingshot Mechanism can be measured by the cyclic…