Categories: FAANG

How to Scale Your EMA

*=Equal Contributors
Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule; for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functional copy of a target model whose parameters move towards those of its target model according to an Exponential Moving Average (EMA) at a rate parameterized by a momentum…
AI Generated Robotic Content

Recent Posts

What model did they use here?

I’ve been seeing this TikTok account a lot where they make mini vlogs as if…

10 hours ago

AI benchmark helps robots plan and complete their chores in the real world

No matter how sophisticated they are, robots can often be indecisive and struggle with multi-step…

11 hours ago

[Update] ComfyUI VACE Video Joiner v2.5 – Seamless loops, reduced RAM usage on assembly

Github | CivitAI Point this workflow at a directory of clips and it will automatically…

1 day ago

Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

Existing feed-forward 3D Gaussian Splatting methods predict pixel-aligned primitives, leading to a quadratic growth in…

1 day ago

What Is the Best Garmin Watch Right Now? (2026)

We tested Garmin’s GPS-enabled fitness trackers and found the perfect picks for casual hikers, backcountry…

1 day ago

Human creativity still resists automation: Artists rank highest, with unguided AI coming in last

New research confirms it: the creativity of artificial intelligence (AI) is a myth. Although current…

1 day ago