Categories: FAANG

To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models

State Space Models (SSMs) have become the leading alternative to Transformers for sequence modeling. Their primary advantage is efficiency in long-context and long-form generation, enabled by fixed-size memory and linear scaling of computational complexity. We begin this work by showing a simple theoretical result stating that SSMs cannot accurately solve any “truly long-form” generation problem (in a sense we formally define), undermining their main competitive advantage. However, we show that this limitation can be mitigated by allowing SSMs interactive access to external tools. In fact, we…

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows

Normalizing flows (NFs) are end-to-end likelihood-based generative models for continuous data, and have recently regained attention with encouraging progress on image generation. Yet in the video generation domain, where spatiotemporal complexity and computational cost are substantially higher, state-of-the-art systems almost exclusively rely on diffusion-based models. In this work, we revisit…

May 1, 2026

In "FAANG"

FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models

Autoregressive language models (ARMs) deliver strong likelihoods, but are inherently serial: they generate one token per forward pass, which limits throughput and inflates latency for long sequences. Diffusion Language Models (DLMs) parallelize across positions and thus appear promising for language generation, yet standard discrete diffusion typically needs hundreds to thousands…

October 14, 2025

In "FAANG"

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding

We introduce SlowFast-LLaVA-1.5 (abbreviated as SF-LLaVA-1.5), a family of video large language models (LLMs) offering a token-efficient solution for long-form video understanding. We incorporate the two-stream SlowFast mechanism into a streamlined training pipeline, and perform joint video-image training on a carefully curated data mixture of only publicly available datasets. Our…

August 23, 2025

In "FAANG"

AI Generated Robotic Content

Next LlamaAgents Builder: From Prompt to Deployed AI Agent in Minutes »

Previous « How to build production-ready AI agents with Google-managed MCP servers

Published by

AI Generated Robotic Content

Tags: ai/mlfaang

2 months ago

Using depth maps and weight noising to get better character LoRAs

A few weeks ago I introduced a new method for training style LoRAs which has…

14 hours ago

AI/ML Research

The Statistics of Token Selection: Logits, Temperature, and Top-P Walkthrough

When large language models, or LLMs for short, produce outputs, several criteria are at stake,…

14 hours ago

FAANG

Process financial documents using Amazon Bedrock Data Automation

Financial institutions process thousands of documents daily, including tax forms, loan statements, and purchase orders.…

14 hours ago

FAANG

Introducing Google AI Threat Defense to help you outpace the adversary

aside_block <ListValue: [StructValue([('title', 'Summary of today’s news'), ('body', <wagtail.rich_text.RichText object at 0x7f00683723a0>), ('btn_text', ''), ('href',…

15 hours ago