How does AI work?

TL;DR: Artificial Intelligence learns patterns from data and uses them to make predictions, generate content, or solve problems. Generative AI, such as ChatGPT or image and video generators, takes this a step further by creating new things (text, art, music, and more) that have never existed before. People often ask: “How does AI actually work?” …

PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech

Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS) systems, converting written forms into their canonical spoken equivalents. Traditional TN systems can exhibit high accuracy, but involve substantial engineering effort, are difficult to scale, and pose challenges to language coverage, particularly in low-resource settings. We propose PolyNorm, a prompt-based approach to TN using …

Transform your MCP architecture: Unite MCP servers through AgentCore Gateway

As AI agents are adopted at scale, developer teams can create dozens to hundreds of specialized Model Context Protocol (MCP) servers, each tailored to a specific agent use case, domain, organizational function, or team. Organizations also need to integrate their own existing MCP servers or open source MCP servers into their AI workflows. There is a …

From silicon to softmax: Inside the Ironwood AI stack

As machine learning models continue to scale, a specialized, co-designed hardware and software stack is no longer optional; it’s critical. Ironwood, our latest generation Tensor Processing Unit (TPU), is the cutting-edge hardware behind advanced models like Gemini and Nano Banana, from massive-scale training to high-throughput, low-latency inference. This blog details the core components of Google’s …

How Amazon Search increased ML training twofold using AWS Batch for Amazon SageMaker Training jobs

In this post, we show you how Amazon Search optimized GPU instance utilization by leveraging AWS Batch for SageMaker Training jobs. This managed solution enabled us to orchestrate machine learning (ML) training workloads on GPU-accelerated instance families like P5, P4, and others. We will also provide a step-by-step walkthrough of the use case implementation. Machine …

Build software sustainably in the AI era

Artificial intelligence is reshaping our world – accelerating discovery, optimising systems, and unlocking new possibilities across every sector. But with its vast potential comes a shared responsibility. AI can be a powerful ally for transforming businesses and reducing cost. It can help organizations minimize carbon emissions, industries manage energy use, and scientists model complex climate …

Adapting Self-Supervised Representations as a Latent Space for Efficient Generation

We introduce Representation Tokenizer (RepTok), a generative modeling framework that represents an image using a single continuous latent token obtained from self-supervised vision transformers. Building on a pre-trained SSL encoder, we fine-tune only the semantic token embedding and pair it with a generative decoder trained jointly using a standard flow matching objective. This adaptation enriches …

Supercharging the ML and AI Development Experience at Netflix

Supercharging the ML and AI Development Experience at Netflix with Metaflow. By Shashank Srikanth and Romain Cledat. Metaflow, a framework we started and open-sourced in 2019, now powers a wide range of ML and AI systems across Netflix and at many other companies. It is well loved by users for helping them take their ML/AI workflows from prototype to production, allowing …

Iterate faster with Amazon Bedrock AgentCore Runtime direct code deployment

Amazon Bedrock AgentCore is an agentic platform for building, deploying, and operating effective agents securely at scale. Amazon Bedrock AgentCore Runtime is a fully managed service of Bedrock AgentCore, which provides low latency serverless environments to deploy agents and tools. It provides session isolation, supports multiple agent frameworks including popular open-source frameworks, and handles multimodal …

Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors

AI policy sets boundaries on acceptable behavior for AI models, but this is challenging in the context of large language models (LLMs): how do you ensure coverage over a vast behavior space? We introduce policy maps, an approach to AI policy design inspired by the practice of physical mapmaking. Instead of aiming for full coverage, …