The biggest change: we integrated model layer streaming across all local inference pipelines, cutting peak VRAM usage enough to run on 16 GB VRAM machines. This has been one of the most requested changes since launch, and it’s live now.
What else is in 1.0.3:
The VRAM reduction is the one we’re most excited about. The higher VRAM requirement locked out a lot of capable desktop hardware. If your GPU kept you on the sidelines, try it now and let us know how it works for you on GitHub.
Already using Desktop? The update downloads automatically.
New here? Download
submitted by /u/ltx_model