Entropy-Preserving Reinforcement Learning

Policy gradient algorithms have driven many recent advancements in language model reasoning. An appealing property is their ability to learn from exploration on their own trajectories, a process crucial for fostering diverse and creative solutions. As we show in this paper, many policy gradient algorithms naturally reduce the entropy—and thus the diversity of explored trajectories—as …

How Ring scales global customer support with Amazon Bedrock Knowledge Bases

This post is co-written with David Kim and Premjit Singh from Ring. Scaling self-service support globally presents challenges beyond translation. In this post, we show you how Ring, Amazon’s home security subsidiary, built a production-ready, multi-locale Retrieval-Augmented Generation (RAG)-based support chatbot using Amazon Bedrock Knowledge Bases. By eliminating per-Region infrastructure deployments, Ring reduced the cost …

Robots with different bodies can now share skills: What intention-based learning changes

Robots are increasingly being used in manufacturing, agriculture and health care. But programming a team of robots to carry out individual tasks raises a question: How can robots learn from other robots if they are built differently? A multi-institutional team including Chongjie Zhang, an associate professor of computer science and engineering at WashU McKelvey Engineering, …

AI benchmark helps robots plan and complete their chores in the real world

No matter how sophisticated they are, robots can often be indecisive and struggle with multi-step chores in the real world. For example, if you tell a robot to tidy a messy room, it might understand the goal but not know where to grab each object. It could even end up inventing steps. To address these …

[Update] ComfyUI VACE Video Joiner v2.5 – Seamless loops, reduced RAM usage on assembly

Point this workflow at a directory of clips and it will automatically stitch them together, fixing awkward motion and transition artifacts. At each seam, VACE generates new frames guided by context on both sides, replacing the seam with motion that flows naturally between the clips. How many context frames and generated frames …

Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

Existing feed-forward 3D Gaussian Splatting methods predict pixel-aligned primitives, leading to a quadratic growth in primitive count as resolution increases. This fundamentally limits their scalability, making high-resolution synthesis such as 4K intractable. We introduce LGTM (Less Gaussians, Texture More), a feed-forward framework that overcomes this resolution scaling barrier. By predicting compact Gaussian primitives coupled with …