MediaFM: The Multimodal AI Foundation for Media Understanding at Netflix

Avneesh Saluja, Santiago Castro, Bowei Yan, Ashish Rastogi Introduction Netflix’s core mission is to connect millions of members around the world with stories they’ll love. This requires not just an incredible catalog, but also a deep, machine-level understanding of every piece of content in that catalog, from the biggest blockbusters to the most niche documentaries. As …

ML 2046

Scaling data annotation using vision-language models to power physical AI systems

Critical labor shortages are constraining growth across manufacturing, logistics, construction, and agriculture. The problem is particularly acute in construction: nearly 500,000 positions remain unfilled in the United States, with 40% of the current workforce approaching retirement within the decade. These workforce limitations result in delayed projects, escalating costs, and deferred development plans. To address these …

Researchers pioneer next-generation AI semiconductors with ‘thermal constraining’ technique

A research team led by Professor Taesung Kim from the School of Mechanical Engineering at Sungkyunkwan University has developed a technology that precisely controls the internal structure of semiconductors using heat, much like stamping out “bungeoppang” (fish-shaped pastry) in a mold. The team report that this approach improves the performance of next-generation artificial intelligence (AI) …

3 Months later – Proof of concept for making comics with Krita AI and other AI tools

Some folks might remember this post I made a few short months ago where I explored the possibility of making comics with SDXL and Krita AI. I had no clue what I was doing when I started, so it was entirely an experiment to figure out could you make comics with these tools. The short …

Jailbreaking the matrix: How researchers are bypassing AI guardrails to make them safer

A paper written by University of Florida Computer & Information Science & Engineering, or CISE, Professor Sumit Kumar Jha, Ph.D., contains so many science fiction terms, you’d be forgiven for thinking it’s a Hollywood script: Nullspace steering. Red teaming. Jailbreaking the matrix. But Jha’s work is decidedly focused on real life, most notably strengthening the …

Turns out LTX-2 makes a very good video upscaler for WAN

I have had a lot of fun with LTX but for a lot of usecases it is useless for me. for example this usecase where I could not get anything proper with LTX no matter how much I tried (mild nudity): https://aurelm.com/portfolio/ode-to-the-female-form/ The video may be choppy on the site but you can download it …

AI model edits can leak sensitive data via update ‘fingerprints’

Artificial intelligence (AI) systems are now widely used by millions of people worldwide, as tools to source information or tackle specific tasks more rapidly and efficiently. Today, some of the most used are large language models (LLMs), computational models trained on large collections of texts that can process and generate written content in various languages.