A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

What research can be pursued with small models trained to complete true programs? Typically, researchers study program synthesis via large language models (LLMs) which introduce issues such as knowing what is in or out of distribution, understanding fine-tuning effects, understanding the effects of tokenization, and higher demand on compute and storage to carry out experiments. …

Scaling LLM Post-Training at Netflix

Baolin Li, Lingyi Liu, Binh Tang, Shaojing Li. Introduction: Pre-training gives Large Language Models (LLMs) broad linguistic ability and general world knowledge, but post-training is the phase that actually aligns them to concrete intents, domain constraints, and the reliability requirements of production environments. At Netflix, we are exploring how LLMs can enable new member experiences across …

Customize AI agent browsing with proxies, profiles, and extensions in Amazon Bedrock AgentCore Browser

AI agents that browse the web need more than basic page navigation. Our customers tell us they need agents that maintain session state across interactions, route traffic through corporate proxy infrastructure, and run with custom browser configurations. AgentCore Browser provides a secure, isolated browser environment for your agents to interact with web applications. Until now, …

From flattery to debate: Training AI to mirror human reasoning

Generative artificial intelligence systems often default to agreement, complimenting the user in their responses. But human interactions aren’t typically built on flattery. To help strengthen these conversations, researchers in the USF Bellini College of Artificial Intelligence, Cybersecurity and Computing are challenging the technology to think and debate in ways that resemble human reasoning.

Mapping the Design Space of User Experience for Computer Use Agents

Large language model (LLM)-based computer use agents execute user commands by interacting with available UI elements, but little is known about how users want to interact with these agents or what design factors matter for their user experience (UX). We conducted a two-phase study to map the UX design space for computer use agents. In …

Introducing PFCS Forward

Introducing PFCS Forward: Extending IL5/IL6 Authorization from Cloud to Edge. Integrated systems that solve meaningful problems for commanders and their warfighting requirements are essential, according to Lieutenant General Paul T. Stanton, Director of DISA and Commander of DoD Cyber Defense Command, at DISA’s Forecast to Industry 2025 (December 8, 2025). Hardware-Agnostic Accreditation Brings IL5 and IL6 Authorization …

Automating RDS Postgres to Aurora Postgres Migration

Ram Srivasta Kannan, Wale Akintayo, Jay Bharadwaj, John Crimmins, Shengwei Wang, Zhitao Zhu. Introduction: In 2024, the Online Data Stores team at Netflix conducted a comprehensive review of the relational database technologies used across the company. This evaluation examined functionality, performance, and total cost of ownership across our database ecosystem. Based on this analysis, we decided …