SPD: Sync-Point Drop for Efficient Tensor Parallelism of Large Language Models

With the rapid expansion in the scale of large language models (LLMs), enabling efficient distributed inference across multiple computing units has become increasingly critical. However, communication overheads from popular distributed inference techniques such as Tensor Parallelism pose a significant challenge to achieve scalability and low latency. Therefore, we introduce a novel optimization technique, Sync-Point Drop …

Picture1 3

Optimize query responses with user feedback using Amazon Bedrock embedding and few-shot prompting

Improving response quality for user queries is essential for AI-driven applications, especially those focusing on user satisfaction. For example, an HR chat-based assistant should strictly follow company policies and respond using a certain tone. A deviation from that can be corrected by feedback from users. This post demonstrates how Amazon Bedrock, combined with a user …

Announcing Anthropic’s Claude Opus 4 and Claude Sonnet 4 on Vertex AI

Today, we’re expanding the choice of third-party models available in Vertex AI Model Garden with the addition of Anthropic’s newest generation of the Claude model family: Claude Opus 4 and Claude Sonnet 4. Both Claude Opus 4 and Claude Sonnet 4 are hybrid reasoning models, meaning they offer modes for near-instant responses and extended thinking …

Humanoid Policy ~ Human Policy

Training manipulation policies for humanoid robots with diverse data enhances their robustness and generalization across tasks and platforms. However, learning solely from robot demonstrations is labor-intensive, requiring expensive tele-operated data collection which is difficult to scale. This paper investigates a more scalable data source, egocentric human demonstrations, to serve as cross-embodiment training data for robot …

FM-Intent: Predicting User Session Intent with Hierarchical Multi-Task Learning

Authors: Sejoon Oh, Moumita Bhattacharya, Yesu Feng, Sudarshan Lamkhede, Ko-Jen Hsiao, and Justin Basilico Motivation Recommender systems have become essential components of digital services across e-commerce, streaming media, and social networks [1, 2]. At Netflix, these systems drive significant product and business impact by connecting members with relevant content at the right time [3, 4]. While …

1 Solutions Overview Slack Integration with Amazon Bedrock Agents

Integrate Amazon Bedrock Agents with Slack

As companies increasingly adopt generative AI applications, AI agents capable of delivering tangible business value have emerged as a crucial component. In this context, integrating custom-built AI agents within chat services such as Slack can be transformative, providing businesses with seamless access to AI assistants powered by sophisticated foundation models (FMs). After an AI agent …

1 Weuc1kJmax 1000x1000 1

Introducing AI.GENERATE_TABLE: creating structured data from gen AI models in BigQuery

The explosion of digital content from social media, smartphones, and other sources has created a massive amount of unstructured data like images, videos, and documents. To help you analyze this data, BigQuery is connected with Vertex AI, Google Cloud’s powerful AI platform, so you can use advanced AI models, like Gemini 2.5 Pro/Flash, to understand …

ML 18173 end to end 1

Build a domain‐aware data preprocessing pipeline: A multi‐agent collaboration approach

Enterprises—especially in the insurance industry—face increasing challenges in processing vast amounts of unstructured data from diverse formats, including PDFs, spreadsheets, images, videos, and audio files. These might include claims document packages, crash event videos, chat transcripts, or policy documents. All contain critical information across the claims processing lifecycle. Traditional data preprocessing methods, though functional, might …

1 FnBCxdGmax 1000x1000 1

Expanding Vertex AI with the next wave of generative AI media models

Today, we are introducing the next wave of generative AI media models on Vertex AI: Imagen 4, Veo 3, and Lyria 2.  We’ve already seen customers generate stunning, photorealistic images with Imagen 3, Google’s image generation model. Customers have taken these images and transformed them into high quality videos and assets with Veo 2. We’ve …