Securing Amazon Bedrock cross-Region inference: Geographic and global

The adoption and implementation of generative AI inference has increased with organizations building more operational workloads that use AI capabilities in production at scale. To help customers achieve the scale of their generative AI applications, Amazon Bedrock offers cross-Region inference (CRIS) profiles, a powerful feature organizations can use to seamlessly distribute inference processing across multiple …

A gRPC transport for the Model Context Protocol

AI agents are moving from test environments to the core of enterprise operations, where they must interact reliably with external tools and systems to execute complex, multi-step goals. The Model Context Protocol (MCP) is the standard that makes this agent-to-tool communication possible. In fact, just last month we announced the release of fully managed, …

Over-Searching in Search-Augmented Large Language Models

Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval. However, they often over-search – unnecessarily invoking the search tool even when it does not improve response quality, which leads to computational inefficiency and to hallucinations caused by incorporating irrelevant context. In this work, we conduct a systematic evaluation of over-searching across multiple dimensions, including …

How Omada Health scaled patient care by fine-tuning Llama models on Amazon SageMaker AI

This post is co-written with Sunaina Kavi, AI/ML Product Manager at Omada Health. Omada Health, a longtime innovator in virtual healthcare delivery, launched a new nutrition experience in 2025, featuring OmadaSpark, an AI agent trained with robust clinical input that delivers real-time motivational interviewing and nutrition education. It was built on AWS. OmadaSpark was designed …

MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer

Unified multimodal Large Language Models (LLMs) that can both understand and generate visual content hold immense potential. However, existing open-source models often suffer from a performance trade-off between these capabilities. We present Manzano, a simple and scalable unified framework that substantially reduces this tension by coupling a hybrid image tokenizer with a well-curated training recipe. …

AdaBoN: Adaptive Best-of-N Alignment

Recent advances in test-time alignment methods, such as Best-of-N sampling, offer a simple and effective way to steer language models (LMs) toward preferred behaviors using reward models (RM). However, these approaches can be computationally expensive, especially when applied uniformly across prompts without accounting for differences in alignment difficulty. In this work, we propose a prompt-adaptive …

Crossmodal search with Amazon Nova Multimodal Embeddings

Amazon Nova Multimodal Embeddings processes text, documents, images, video, and audio through a single model architecture. Available through Amazon Bedrock, the model converts different input modalities into numerical embeddings within the same vector space, supporting direct similarity calculations regardless of content type. We developed this unified model to reduce the need for separate embedding models, …

Scaling medical content review at Flo Health using Amazon Bedrock (Part 1)

This blog post is based on work co-developed with Flo Health. Healthcare science is rapidly advancing. Maintaining accurate and up-to-date medical content directly impacts people’s lives, health decisions, and well-being. When someone searches for health information, they are often at their most vulnerable, making accuracy not just important, but potentially life-saving. Flo Health creates thousands …

Publication of false claims about Palantir and the Swiss government by the magazine Die Republik

Correction: Publication of false claims about Palantir and the Swiss government by the magazine ‘Die Republik’. Introduction: An article published in December 2025 in Republik draws attention to a 2024 report by the Swiss Armed Forces Staff (Armeestab) that assessed the possibility of deploying software developed by Palantir. The article presents the situation in a false and misleading manner, in accordance …