ExpertLens: Activation Steering Features Are Highly Interpretable

This paper was accepted at the Workshop on Unifying Representations in Neural Models (UniReps) at NeurIPS 2025. Activation steering methods in large language models (LLMs) have emerged as an effective way to perform targeted updates to enhance generated language without requiring large amounts of adaptation data. We ask whether the features discovered by activation steering …

Connect Amazon Bedrock agents to cross-account knowledge bases

Organizations need seamless access to their structured data repositories to power intelligent AI agents. However, when these resources span multiple AWS accounts, integration challenges can arise. This post explores a practical solution for connecting Amazon Bedrock agents to knowledge bases in Amazon Redshift clusters residing in different AWS accounts. The challenge: Organizations that build AI …

Easy AI workflow automation: Deploy n8n on Cloud Run

n8n is a powerful yet easy-to-use workflow and automation tool for multi-step AI agents, and many teams want a simple, scalable, and cost-effective way to self-host it. With just a few commands, you can deploy n8n to Cloud Run and have it up and running, ready to supercharge your business with AI workflows that can …

How does AI work?

TL;DR: Artificial Intelligence learns patterns from data and uses them to make predictions, generate content, or solve problems. Generative AI, such as ChatGPT or image and video generators, takes this a step further by creating new things (text, art, music, and more) that have never existed before. People often ask: “How does AI actually work?” …

PolyNorm: Few-Shot LLM-Based Text Normalization for Text-to-Speech

Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS) systems, converting written forms into their canonical spoken equivalents. Traditional TN systems can exhibit high accuracy, but involve substantial engineering effort, are difficult to scale, and pose challenges to language coverage, particularly in low-resource settings. We propose PolyNorm, a prompt-based approach to TN using …

Transform your MCP architecture: Unite MCP servers through AgentCore Gateway

As AI agents are adopted at scale, developer teams can create dozens to hundreds of specialized Model Context Protocol (MCP) servers, each tailored to a specific agent use case, domain, organizational function, or team. Organizations also need to integrate their own existing MCP servers or open source MCP servers for their AI workflows. There is a …

From silicon to softmax: Inside the Ironwood AI stack

As machine learning models continue to scale, a specialized, co-designed hardware and software stack is no longer optional; it is critical. Ironwood, our latest generation Tensor Processing Unit (TPU), is the cutting-edge hardware behind advanced models like Gemini and Nano Banana, from massive-scale training to high-throughput, low-latency inference. This blog details the core components of Google’s …

How Amazon Search increased ML training twofold using AWS Batch for Amazon SageMaker Training jobs

In this post, we show you how Amazon Search optimized GPU instance utilization by leveraging AWS Batch for SageMaker Training jobs. This managed solution enabled us to orchestrate machine learning (ML) training workloads on GPU-accelerated instance families like P5, P4, and others. We will also provide a step-by-step walkthrough of the use case implementation. Machine …

Build software sustainably in the AI era

Artificial intelligence is reshaping our world – accelerating discovery, optimizing systems, and unlocking new possibilities across every sector. But with its vast potential comes a shared responsibility. AI can be a powerful ally for transforming businesses and reducing cost. It can help organizations minimize carbon emissions, industries manage energy use, and scientists model complex climate …

Adapting Self-Supervised Representations as a Latent Space for Efficient Generation

We introduce Representation Tokenizer (RepTok), a generative modeling framework that represents an image using a single continuous latent token obtained from self-supervised vision transformers. Building on a pre-trained SSL encoder, we fine-tune only the semantic token embedding and pair it with a generative decoder trained jointly using a standard flow matching objective. This adaptation enriches …