Introducing Impressions at Netflix

Part 1: Creating the Source of Truth for Impressions
By: Tulika Bhatt

Imagine scrolling through Netflix, where each movie poster or promotional banner competes for your attention. Every image you hover over isn’t just a visual placeholder; it’s a critical data point that fuels our sophisticated personalization engine. At Netflix, we call these images ‘impressions,’ and …

With MultiKueue, grab GPUs for your GKE cluster, wherever they may be

Artificial Intelligence (AI) and large language models (LLMs) are experiencing explosive growth, powering applications from machine translation to artistic creation. These technologies rely on intensive computations that require specialized hardware resources, like GPUs. But access to GPUs can be challenging, both in terms of availability and cost. For Google Cloud users, the introduction of Dynamic …

ARMOR: Egocentric Perception for Humanoid Robot Collision Avoidance and Motion Planning

Humanoid robots have significant gaps in their sensing and perception, making it hard to perform motion planning in dense environments. To address this, we introduce ARMOR, a novel egocentric perception system that integrates both hardware and software, specifically incorporating wearable-like depth sensors for humanoid robots. Our distributed perception approach enhances the robot’s spatial awareness, and …

Build a dynamic, role-based AI agent using Amazon Bedrock inline agents

AI agents continue to gain momentum, as businesses use the power of generative AI to reinvent customer experiences and automate complex workflows. We are seeing Amazon Bedrock Agents applied in investment research, insurance claims processing, root cause analysis, advertising campaigns, and much more. Agents use the reasoning capability of foundation models (FMs) to break down …

Operationalizing generative AI apps with Apigee

Generative AI is now well beyond the hype and into the realm of practical application. But while organizations are eager to build enterprise-ready gen AI solutions on top of large language models (LLMs), they face challenges in managing, securing, and scaling these deployments, especially when it comes to APIs. As part of the platform team, …

Findings of the IWSLT 2024 Evaluation Campaign

Ibrahim Said Ahmad†, Antonios Anastasopoulos††††, Ondřej Bojar¶, Claudia Borg††, Marine Carpuat‡, Roldano Cattoni§, Mauro Cettolo§, William Chen‡‡, Qianqian Dong¶¶, Marcello Federico§§, Barry Haddow‡‡‡, Dávid Javorsky¶, Mateusz Krubiński¶, Tsz Kin Lam‡‡‡, Xutai Ma‡‡§, Prashant Mathur§§, Evgeny Matusov¶¶¶, Chandresh Kumar Maurya¶¶†, John P. McCrae†††, Kenton Murray†††, Satoshi Nakamura§§§, Matteo Negri§, Jan Niehues††¶, Xing Niu§§, Atul Kr. Ojha†††, …

Fine-tune LLMs with synthetic data for context-based Q&A using Amazon Bedrock

There’s a growing demand from customers to incorporate generative AI into their businesses. Many use cases involve using pre-trained large language models (LLMs) through approaches like Retrieval Augmented Generation (RAG). However, for advanced, domain-specific tasks or those requiring specific formats, model customization techniques such as fine-tuning are sometimes necessary. Amazon Bedrock provides you with the …

Meta SAM 2.1 is now available in Amazon SageMaker JumpStart

This blog post is co-written with George Orlin from Meta. Today, we are excited to announce that Meta’s Segment Anything Model (SAM) 2.1 vision segmentation model is publicly available through Amazon SageMaker JumpStart to deploy and run inference. Meta SAM 2.1 provides state-of-the-art video and image segmentation capabilities in a single model. This cutting-edge model …

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

*Primary Contributors

Attention is a key part of the transformer architecture. It is a sequence-to-sequence mapping that transforms each sequence element into a weighted sum of values. The weights are typically obtained as the softmax of dot products between keys and queries. Recent work has explored alternatives to softmax attention in transformers, such as ReLU …
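The standard attention mechanism the abstract describes can be sketched in a few lines of NumPy: each output row is a weighted sum of value vectors, with weights given by the softmax of scaled query-key dot products. This is a minimal single-head sketch for illustration, not the paper's implementation; the function names and the scaling by the square root of the key dimension are the conventional choices, not something stated in the teaser above.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def softmax_attention(Q, K, V):
    """Single-head attention: rows of the output are convex
    combinations of the rows of V, weighted by softmax(QK^T / sqrt(d))."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # dot products between queries and keys
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V
```

The paper's subject, sigmoid self-attention, replaces the row-wise softmax with an elementwise sigmoid of the scores, so the weights no longer form a probability distribution over the sequence; the exact normalization and bias choices are analyzed in the paper itself.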