MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs

We introduce MIA-Bench, a new benchmark designed to evaluate multimodal large language models (MLLMs) on their ability to strictly adhere to complex instructions. Our benchmark comprises a diverse set of 400 image-prompt pairs, each crafted to challenge the models’ compliance with layered instructions in generating accurate responses that satisfy specific requested patterns. Evaluation results from …

Empowering the Warfighter: Palantir’s Partnership with Microsoft

Promoting Army readiness through seamless coordination between the Palantir-powered Army Vantage platform and Microsoft Power BI. Better Together: As the Department of Defense (DoD) increasingly relies on software and data to drive mission readiness and operations, the need for cutting-edge, interoperable technology solutions has never been more critical. Data interoperability should be the cornerstone for informed decision-making …

How IDIADA optimized its intelligent chatbot with Amazon Bedrock

This post is co-written with Xavier Vizcaino, Diego Martín Montoro, and Jordi Sánchez Ferrer from Applus+ Idiada. In 2021, Applus+ IDIADA, a global partner to the automotive industry with over 30 years of experience supporting customers in product development activities through design, engineering, testing, and homologation services, established the Digital Solutions department. This strategic move …

Enhancing AlloyDB vector search with inline filtering and enterprise observability

Many specialized vector databases today require you to create complex pipelines and applications in order to get the data you need. AlloyDB for PostgreSQL offers Google Research’s state-of-the-art vector search index, ScaNN, enabling you to optimize the end-to-end retrieval of the freshest, most relevant data with a single SQL statement. Today, we are introducing a …

Auto-Completion Style Text Generation with GPT-2 Model

This post is in four parts; they are:
• Traditional vs Neural Approaches
• Auto-Complete Architecture
• Basic Auto-Complete Implementation
• Caching and Batched Input

When you type a word into Google’s search bar, such as “machine”, you may find that additional words are suggested, such as “learning”, to make up “machine learning”.

Mistral-Small-24B-Instruct-2501 is now available on SageMaker JumpStart and Amazon Bedrock Marketplace

Today, we’re excited to announce that Mistral-Small-24B-Instruct-2501, a 24-billion-parameter large language model (LLM) from Mistral AI that’s optimized for low-latency text generation tasks, is available for customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. Amazon Bedrock Marketplace is a new capability in Amazon Bedrock that developers can use to discover, test, and use over 100 …

Announcing Claude 3.7 Sonnet, Anthropic’s first hybrid reasoning model, is available on Vertex AI

Today, we’re announcing that Claude 3.7 Sonnet, Anthropic’s most intelligent model to date and the first hybrid reasoning model on the market, is available in preview on Vertex AI Model Garden. Claude 3.7 Sonnet can produce quick responses or extended, step-by-step thinking that is made visible to the user. Claude 3.7 Sonnet includes improvements in coding, …