Learning Long-Term Motion Embeddings for Efficient Kinematics Generation
Understanding and predicting motion is a fundamental component of visual intelligence. Although modern video models exhibit strong comprehension of scene dynamics, exploring multiple possible futures through full video synthesis remains prohibitively inefficient. We model scene dynamics orders of magnitude more efficiently by directly operating on a long-term motion embedding that is learned from large-scale trajectories …
Read more “Learning Long-Term Motion Embeddings for Efficient Kinematics Generation”