Scaling Laws for Optimal Data Mixtures

Large foundation models are typically trained on data from multiple domains, with the data mixture—the proportion of each domain used—playing a critical role in model performance. The standard approach to selecting this mixture relies on trial and error, which becomes impractical for large-scale pretraining. We propose a systematic method to determine the optimal data mixture …


Building a Resilient Data Platform with Write-Ahead Log at Netflix

By Prudhviraj Karumanchi, Samuel Fu, Sriram Rangarajan, Vidhya Arvind, Yun Wang, and John Lu. Introduction: Netflix operates at a massive scale, serving hundreds of millions of users with diverse content and features. Behind the scenes, ensuring data consistency, reliability, and efficient operations across various services presents a continuous challenge. At the heart of many critical functions lies …


Building health care agents using Amazon Bedrock AgentCore

This blog was co-authored with Kuldeep Singh, Head of AI Platform at Innovaccer. The integration of agentic AI is ushering in a transformative era in health care, marking a significant departure from traditional AI systems. Agentic AI demonstrates autonomous decision-making capabilities and adaptive learning in complex medical environments, enabling it to monitor patient progress, coordinate …

Lightweight framework enables faster, more accurate object detection for UAV remote sensing

Remote sensing object detection is a rapidly growing field in artificial intelligence, playing a critical role in advancing the use of unmanned aerial vehicles (UAVs) for real-world applications such as disaster response, urban planning, and environmental monitoring. Yet, designing models that balance both high accuracy and fast, lightweight performance remains a challenge.

WAN2.5-Preview: They are collecting feedback to fine-tune this PREVIEW. The full release will have open training + inference code. The weights MAY be released, but not decided yet. WAN2.5 demands SIGNIFICANTLY more VRAM due to being 1080p and 10 seconds. Final system requirements unknown! (@50:57)

This post summarizes a very important livestream with a WAN engineer. It will at least be partially open (model architecture, training code and inference code). Maybe even fully open weights if the community treats them with respect and gratitude, which is also what one of their engineers basically spelled out on Twitter a few days …

Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models

Self-supervised learning (SSL) has made significant advances in speech representation learning. Models like wav2vec 2.0 and HuBERT have achieved state-of-the-art results in tasks such as speech recognition, particularly in monolingual settings. However, multilingual SSL models tend to underperform their monolingual counterparts on each individual language, especially in multilingual scenarios with few languages such as the …


How Palantir’s Strategic Privacy Investments Enable Future Customer Success

Building for Tomorrow. Introduction: Palantir’s customers use our software platforms for their most critical challenges, from delivering vaccines to enabling force readiness to building resilient supply chains, and these challenges often require bringing together data with unique sensitivities from a variety of source systems. This is why we have spent the past 20-plus years building tools for security, …