Categories: FAANG

Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers

Video Joint Embedding Predictive Architectures (V-JEPA) learn generalizable off-the-shelf video representation by predicting masked regions in latent space with an exponential moving average (EMA)-updated teacher. While EMA prevents representation collapse, it complicates scalable model selection and couples teacher and student architectures. We revisit masked-latent prediction and show that a frozen teacher suffices. Concretely, we (i) train a target encoder with a simple pixel-reconstruction objective under V-JEPA masking, then (ii) freeze it and train a student to predict the teacher’s…

This AI Model Can Intuit How the Physical World Works

The V-JEPA system uses ordinary videos to understand the physics of the real world.

December 7, 2025

In "AI/ML News"

Meta’s new world model lets robots manipulate objects in environments they’ve never encountered before

June 13, 2025

In "AI/ML News"

MAEEG: Masked Auto-encoder for EEG Representation Learning

This paper was accepted at the Workshop on Learning from Time Series for Health at NeurIPS 2022. Decoding information from bio-signals such as EEG, using machine learning has been a challenge due to the small data-sets and difficulty to obtain labels. We propose a reconstruction-based self-supervised learning model, the masked…

November 10, 2022

In "FAANG"

AI Generated Robotic Content