FAANG

What Matters in Practical Learned Image Compression

One of the major differentiators unlocked by learned codecs relative to their hard-coded traditional counterparts is their ability to be…

1 week ago

Secure short-term GPU capacity for ML workloads with EC2 Capacity Blocks for ML and SageMaker training plans

As companies of various sizes adopt graphic processing units (GPU)-based machine learning (ML) training, fine-tuning and inference workloads, the demand…

1 week ago

Gemini 3.1 Flash-Lite is now generally available on Gemini Enterprise Agent Platform

Today, we’re thrilled to announce that Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model yet, is…

1 week ago

SpecMD: A Comprehensive Study on Speculative Expert Prefetching

Mixture-of-Experts (MoE) models enable sparse expert activation, meaning that only a subset of the model’s parameters is used during each…

1 week ago

Cost effective deployment of vision-language models for pet behavior detection on AWS Inferentia2

Tomofun, the Taiwan-headquartered pet-tech startup behind the Furbo Pet Camera, is redefining how pet owners interact with their pets remotely.…

1 week ago

Pioneering AI-assisted code migration: How Google achieved 6x faster migration from TensorFlow to JAX

AI coding agents are rapidly becoming ubiquitous across the software industry, fundamentally changing how developers write, test, and debug daily…

1 week ago

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory…

1 week ago

How Hapag-Lloyd uses Amazon Bedrock to transform customer feedback into actionable insights

Hapag-Lloyd stands as one of the world’s leading liner shipping companies, operating a modern fleet of 313 container ships with…

1 week ago

Five must-have guides to move agents into production with Gemini Enterprise Agent Platform

Building AI agents that work well in a demo is one thing, but running them in production requires serious infrastructure. …

1 week ago

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with calls to external tools. However,…

2 weeks ago