ai/ml

FastVLM: Efficient Vision Encoding for Vision Language Models

Scaling the input image resolution is essential for enhancing the performance of Vision Language Models (VLMs), particularly in text-rich image…

11 months ago

Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model

AI agents are revolutionizing how businesses enhance their operational capabilities and enterprise applications. By enabling natural language interactions, these agents…

11 months ago

Introducing Gemini 2.5 Flash

Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.

11 months ago

Disentangled Representation Learning with the Gromov-Monge Gap

Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning. Solving it may unlock other problems, such…

11 months ago

Palantir’s Blueprint for Early Career Success in Product Design

Editor’s Note: Product Designers are key members of Palantir product teams. This blog post features a banner by Product Designer…

11 months ago

Add Zoom as a data accessor to your Amazon Q index

For many organizations, vast amounts of enterprise knowledge are scattered across diverse data sources and applications. Organizations across industries seek…

11 months ago

Generate videos in Gemini and Whisk with Veo 2

Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated…

11 months ago

Scaling Laws for Native Multimodal Models

Building general-purpose models that can effectively perceive the world through multimodal signals has been a long-standing goal. Current approaches involve…

11 months ago

Automate Amazon EKS troubleshooting using an Amazon Bedrock agentic workflow

As organizations scale their Amazon Elastic Kubernetes Service (Amazon EKS) deployments, platform administrators face increasing challenges in efficiently managing multi-tenant…

11 months ago

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations

This paper was accepted at the Workshop on Foundation Models in the Wild at ICLR 2025. Visual understanding is inherently…

11 months ago