
Building an AIOps chatbot with Amazon Q Business custom plugins

Many organizations rely on multiple third-party applications and services for different aspects of their operations, such as scheduling, HR management, financial data, customer relationship management (CRM) systems, and more. However, these systems often exist in silos, requiring users to manually navigate different interfaces, switch between environments, and perform repetitive tasks, which can be time-consuming and …


Next 25 developer keynote: From prompt, to agent, to work, to fun

Attending a tech conference like Google Cloud Next can feel like drinking from a firehose: all the news, all the sessions and breakouts, all the learning and networking… But after a busy couple of days, watching the developer keynote makes it seem like there’s a method to the madness. A coherent picture starts to …

A new robotic gripper made of measuring tape is sizing up fruit and veggie picking

It’s a game a lot of us played as children, and maybe even later in life: unspooling measuring tape to see how far it would extend before bending. But to engineers, this game was an inspiration, suggesting that measuring tape could become a great material for a robotic gripper. The grippers would be a …

MM-Ego: Towards Building Egocentric Multimodal LLMs

This research aims to comprehensively explore building a multimodal foundation model for egocentric video understanding. To achieve this goal, we work on three fronts. First, as there is a lack of QA data for egocentric video understanding, we automatically generate 7M high-quality QA samples for egocentric videos ranging from 30 seconds to one hour long …


Reduce ML training costs with Amazon SageMaker HyperPod

Training a frontier model is highly compute-intensive, requiring a distributed system of hundreds or thousands of accelerated instances running for several weeks or months to complete a single job. For example, pre-training the Llama 3 70B model with 15 trillion training tokens took 6.5 million H100 GPU hours. On 256 Amazon EC2 P5 instances (p5.48xlarge, …
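For a rough sense of the wall-clock time that figure implies, here is a minimal back-of-the-envelope sketch; the count of 8 H100 GPUs per p5.48xlarge instance is an assumption made here for illustration, not something stated in the excerpt above.

```python
# Back-of-the-envelope: wall-clock time for 6.5M H100 GPU hours on 256 P5 instances.
gpu_hours = 6_500_000       # reported H100 GPU hours for Llama 3 70B pre-training
instances = 256             # Amazon EC2 P5 instances
gpus_per_instance = 8       # assumed H100 count per p5.48xlarge instance

total_gpus = instances * gpus_per_instance      # 2,048 GPUs
wall_clock_hours = gpu_hours / total_gpus       # ~3,174 hours
wall_clock_days = wall_clock_hours / 24         # ~132 days

print(f"{total_gpus} GPUs -> ~{wall_clock_hours:,.0f} h (~{wall_clock_days:.0f} days)")
```

At roughly 132 days of continuous training, the arithmetic is consistent with the several-weeks-or-months framing above.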


New GKE inference capabilities reduce costs, tail latency and increase throughput

When it comes to AI, inference is where today’s generative AI models can solve real-world business problems. Google Kubernetes Engine (GKE) is seeing increasing adoption of gen AI inference. For example, customers like HubX run inference of image-based models to serve over 250k images/day to power gen AI experiences, and Snap runs AI inference on …