ML 20676 image 1

Accelerate Generative AI Inference on Amazon SageMaker AI with G7e Instances

As the demand for generative AI continues to grow, developers and enterprises seek more flexible, cost-effective, and powerful accelerators to meet their needs. Today, we are thrilled to announce the availability of G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs on Amazon SageMaker AI. You can provision nodes with 1, 2, …

1tibjLPNWGd0PP43Y8Idjag

The Human Infrastructure: How Netflix Built the Operations Layer Behind Live at Scale

By: Brett Axler, Casper Choffat, and Alo Lowry In the three years since our first Live show, Chris Rock: Selective Outrage, we have witnessed an incredible expansion of our live content slate and the live operations that support it. From modest beginnings of streaming just one show per month, we are now capable of streaming over …

ML 20614 1

Introducing granular cost attribution for Amazon Bedrock

As AI inference grows into a significant share of cloud spend, understanding who and what are driving costs is essential for chargebacks, cost optimization, and financial planning. Today, we’re announcing granular cost attribution for Amazon Bedrock inference. Amazon Bedrock now automatically attributes inference costs to the IAM principal that made the call. An IAM principal …

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

This paper was accepted at the Workshop on Navigating and Addressing Data Problems for Foundation Models (NADPFM) at ICLR 2026. Principled domain reweighting can substantially improve sample efficiency and downstream generalization; however, data-mixture optimization for multimodal pretraining remains underexplored. Current multimodal training recipes tune mixtures from only a single perspective such as data format or …

ML 19982 image 1 1

Cost-efficient custom text-to-SQL using Amazon Nova Micro and Amazon Bedrock on-demand inference

Text-to-SQL generation remains a persistent challenge in enterprise AI applications, particularly when working with custom SQL dialects or domain-specific database schemas. While foundation models (FMs) demonstrate strong performance on standard SQL, achieving production-grade accuracy for specialized dialects requires fine-tuning. However, fine-tuning introduces an operational trade-off: hosting custom models on persistent infrastructure incurs continuous costs, even during …

1 ucWDyWKmax 1000x1000 1

How WPP accelerates humanoid robot training 10x with G4 VMs

Editor’s note: Today we hear from Perry Nightingale, SVP of Creative AI at WPP about the workflow that cuts training time for humanoid robots from days to minutes — plus access to the open-source code to do it yourself. Robots are pushing the boundaries of what content creators and directors can capture. These technologies have …

15Tcs5i77Z cgIGucACKLZw

Frontend Engineering at Palantir: Polar Scaled Tiles in Zodiac

About this Series Frontend engineering at Palantir goes far beyond building standard web apps. Our engineers design interfaces for mission-critical decision-making, build operational applications that translate insight to action, and create systems that handle massive datasets — thinking not just about what the user needs, but what they need when the network is unreliable, the stakes are high, …

ml 20785 image 1

Create rich, custom tooltips in Amazon Quick Sight

Amazon Quick Sight, the business intelligence (BI) capability of Amazon Quick, is a unified BI service. It provides modern interactive dashboards, natural language querying, pixel-perfect reports, machine learning (ML) insights, and embedded analytics at scale. Amazon Quick brings together AI agents for business insights, research, and automation in one integrated experience, helping you work smarter …

ML 18780 1

Navigating the generative AI journey: The Path-to-Value framework from AWS

Generative AI is reshaping how organizations approach productivity, customer experiences, and operational capabilities. Across industries, teams are experimenting with generative AI to unlock new ways of working. Many of these efforts produce compelling proofs of concept (POC) that demonstrate technical feasibility. The real challenge begins after those early wins. Although POCs frequently demonstrate technical feasibility, …