FAANG

FastVLM: Efficient Vision encoding for Vision Language Models

Scaling the input image resolution is essential for enhancing the performance of Vision Language Models (VLMs), particularly in text-rich image…

1 year ago

Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model

AI agents are revolutionizing how businesses enhance their operational capabilities and enterprise applications. By enabling natural language interactions, these agents…

1 year ago

Introducing Gemini 2.5 Flash

Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.

1 year ago

Disentangled Representational Learning with the Gromov-Monge Gap

Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning. Solving it may unlock other problems, such…

1 year ago

Palantir’s Blueprint for Early Career Success in Product Design

Editor’s Note: Product Designers are key members of Palantir product teams. This blog post features a banner by Product Designer…

1 year ago

Add Zoom as a data accessor to your Amazon Q index

For many organizations, vast amounts of enterprise knowledge are scattered across diverse data sources and applications. Organizations across industries seek…

1 year ago

Generate videos in Gemini and Whisk with Veo 2

Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated…

1 year ago

Scaling Laws for Native Multimodal Models

Building general-purpose models that can effectively perceive the world through multimodal signals has been a long-standing goal. Current approaches involve…

1 year ago

Automate Amazon EKS troubleshooting using an Amazon Bedrock agentic workflow

As organizations scale their Amazon Elastic Kubernetes Service (Amazon EKS) deployments, platform administrators face increasing challenges in efficiently managing multi-tenant…

1 year ago

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations

This paper was accepted at the Workshop on Foundation Models in the Wild at ICLR 2025. Visual understanding is inherently…

1 year ago