Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Large-scale commercial search systems optimize for relevance to drive successful sessions that help users find what they are looking for. To maximize relevance, we leverage two complementary objectives: behavioral relevance (results users tend to click or download) and textual relevance (a result’s semantic fit to the query). A persistent challenge is the scarcity of expert-provided …

asa kalavde aws 100x128 1

Learnings from COBOL modernization in the real world

There’s a lot of excitement right now about AI enabling mainframe application modernization. Boards are paying attention. CIOs are getting asked for a plan. AI is a genuine accelerator for COBOL modernization but to get results, AI needs additional context that source code alone can’t provide.Here’s what we’ve learned working with 400+ enterprise customers: mainframe …

PayPal’s historically large data migration is the foundation for its gen AI innovation

With the dawn of the gen AI era, businesses are facing unprecedented opportunities for transformative products, demanding a strategic shift in their technology infrastructure. A few years ago, PayPal, a digital-native company serving hundreds of millions of customers, faced a significant challenge. After 25 years of success in expanding services and capabilities, we’d created complexity …

Constructive Circuit Amplification: Improving Math Reasoning in LLMs via Targeted Sub-Network Updates

Prior studies investigating the internal workings of LLMs have uncovered sparse subnetworks, often referred to as circuits, that are responsible for performing specific tasks. Additionally, it has been shown that model performance improvement through fine-tuning often results from the strengthening of existing circuits in the model. Taken together, these findings suggest the possibility of intervening …

image 1 4

Efficiently serve dozens of fine-tuned models with vLLM on Amazon SageMaker AI and Amazon Bedrock

Organizations and individuals running multiple custom AI models, especially recent Mixture of Experts (MoE) model families, can face the challenge of paying for idle GPU capacity when the individual models don’t receive enough traffic to saturate a dedicated compute endpoint. To solve this problem, we have partnered with the vLLM community and developed an efficient …

1 ylYpswmmax 1000x1000 1

A developer’s guide to production-ready AI agents

Something has shifted in the developer community over the past year. AI agents have moved from “interesting research concept” to “thing my team is actually building.” The prototypes are working. The demos are impressive. And now comes the harder question: How do we ship this? That question turns out to be a multi-part one. Agents …

Closing the Gap Between Text and Speech Understanding in LLMs

Large Language Models (LLMs) can be adapted to extend their text capabilities to speech inputs. However, these speech-adapted LLMs consistently underperform their text-based counterparts—and even cascaded pipelines—on language understanding tasks. We term this shortfall the text-speech understanding gap: the performance drop observed when a speech-adapted LLM processes spoken inputs relative to when the original text-based …

p1

Build an intelligent photo search using Amazon Rekognition, Amazon Neptune, and Amazon Bedrock

Managing large photo collections presents significant challenges for organizations and individuals. Traditional approaches rely on manual tagging, basic metadata, and folder-based organization, which can become impractical when dealing with thousands of images containing multiple people and complex relationships. Intelligent photo search systems address these challenges by combining computer vision, graph databases, and natural language processing …

Apple Workshop on Reasoning and Planning 2025

Reasoning and planning are the bedrock of intelligent AI systems, enabling them to plan, interact, adapt, and ultimately, operate independently. At Apple, understanding and advancing reasoning capablilities in AI systems has long been an area of active research, and has resulted in numerous publications that both explore new techniques to advance the frontier of reasoning, …

MediaFM: The Multimodal AI Foundation for Media Understanding at Netflix

Avneesh Saluja, Santiago Castro, Bowei Yan, Ashish Rastogi Introduction Netflix’s core mission is to connect millions of members around the world with stories they’ll love. This requires not just an incredible catalog, but also a deep, machine-level understanding of every piece of content in that catalog, from the biggest blockbusters to the most niche documentaries. As …