NVIDIA Nemotron 3 Ultra now available on Amazon SageMaker JumpStart

Today, we are excited to announce the day-zero availability of NVIDIA Nemotron 3 Ultra on Amazon SageMaker JumpStart. With this launch, you can now deploy the Nemotron 3 Ultra model using a one-click deployment experience. Nemotron 3 Ultra is an open model built for frontier reasoning and orchestration in long-running autonomous agents, delivering 5x faster …

What’s new for Managed Service for Apache Spark clusters

At Google Cloud, our goal is to let you run large-scale analytical and data science workloads with maximum efficiency so you can process big data pipelines, machine learning, and ETL tasks. We recently announced that the Dataproc service is now Managed Service for Apache Spark, reflecting our deep integration with the Agentic Data Cloud. To …

How to build self-driving AI operations on Amazon Bedrock at scale

Amazon Bedrock powers generative AI for more than 100,000 organizations worldwide—from startups to global enterprises across every industry. It provides the proven infrastructure and comprehensive capabilities to confidently build applications and agents that work in production with the flexibility, enterprise security, and proven scalability you need to innovate boldly and deliver AI that drives real …

Dynamically Splitting Wide Partitions in Cassandra for Time Series Workloads

By Rajiv Shringi, Kaidan Fullerton, Oleksii Tkachuk and Kartik Sathyanarayanan. Introduction: Netflix’s TimeSeries Abstraction is a scalable system for ingesting and querying petabytes of temporal event data with millisecond latency. We use Apache Cassandra 4.x as the underlying storage for these main reasons: Throughput, latency, and cost: Cassandra can handle millions of low‑latency reads and writes …

The art and science of hyperparameter optimization on Amazon Nova Forge

Large language models (LLMs) deliver strong results on general tasks, but they often struggle with specialized work that requires understanding proprietary data, internal processes, or domain-specific terminology. Amazon Nova Forge addresses this by enabling you to build your own frontier models using Amazon Nova. You can start development from early model checkpoints, blend proprietary data …

Reference your own AWS Secrets Manager secrets in Amazon Bedrock AgentCore Identity

AI agents are only as powerful as the tools they can access. Whether retrieving customer data from a CRM, posting updates to Slack, or querying a GitHub repository, agents need to call external APIs, and that means securely passing credentials at runtime. Getting that right, without hardcoding secrets in code or exposing them in agent …

How Trustpilot built a real-time architecture for data enrichment using Gemma

Processing millions of user reviews in real time, under strict latency and cost constraints, is no easy task. Trustpilot has been doing exactly that with custom machine learning since long before large language models (LLMs) were cool. Now, as the company transitions its core stack to generative AI, here is a look at how we teamed …

Enterprise Business Software and the Mixed-Up Chameleon Problem

Editor’s Note: This blog post was written by Greg Little, Senior Counselor at Palantir, with Aaron Jaffe, Senior Vice President at Palantir. Over 10 years of implementing Enterprise Resource Planning (ERP) systems, I remember one project where the CFO stopped the room cold. It was 11:30 at night during a mock cutover. People were exhausted …

High-Throughput Graph Abstraction at Netflix: Part I

By Oleksii Tkachuk, Kartik Sathyanarayanan, Rajiv Shringi. Introduction: Netflix has a diverse range of graph use cases, each serving specific business needs with unique functionality and performance requirements. These use cases fall into two broad categories: OLAP: These use cases typically involve open-ended and algorithmic exploration of large graph datasets. They often utilize industry-standard models and …

Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality

Deploying large language models (LLMs) at scale on Amazon SageMaker AI Inference makes observability a critical pillar of any production machine learning (ML) strategy. Unlike conventional software that returns deterministic outputs, LLMs generate variable, free-form responses that are difficult to validate with standard metrics. LLM output quality can change over time as input distributions shift, …