The biggest change: we integrated model layer streaming across all local inference pipelines, cutting peak VRAM usage enough to run on machines with 16 GB of VRAM. This has been one of the most requested changes since launch, and it’s live now.
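The core idea behind layer streaming is simple: instead of keeping every layer's weights resident on the GPU for the whole forward pass, load one layer at a time, run it, and free it before loading the next. The sketch below is a minimal, hypothetical illustration of that pattern, not the actual Desktop implementation; the `Layer` class and `stream_forward` function are stand-ins, with residency simulated by a flag rather than real device transfers.

```python
# Hypothetical sketch of model layer streaming. Names (Layer,
# stream_forward) are illustrative, not the real codebase's API.

class Layer:
    def __init__(self, scale):
        self.scale = scale          # stand-in for real layer weights
        self.on_device = False      # simulated GPU residency

    def load(self):                 # stand-in for moving weights to VRAM
        self.on_device = True

    def unload(self):               # stand-in for freeing VRAM
        self.on_device = False

    def forward(self, x):
        assert self.on_device, "layer must be resident before compute"
        return x * self.scale


def stream_forward(layers, x):
    """Run layers sequentially, keeping at most one resident at a time."""
    peak = 0
    for layer in layers:
        layer.load()
        resident = sum(l.on_device for l in layers)
        peak = max(peak, resident)  # track worst-case simultaneous residency
        x = layer.forward(x)
        layer.unload()              # free before loading the next layer
    return x, peak


layers = [Layer(2.0), Layer(3.0), Layer(0.5)]
out, peak_resident = stream_forward(layers, 4.0)
# out == 12.0; peak_resident == 1, versus 3 if all layers stayed loaded
```

Peak memory scales with the largest single layer instead of the whole model, which is why streaming can bring a pipeline under a fixed VRAM budget at the cost of extra transfer overhead per layer.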
The VRAM reduction is the one we’re most excited about. The higher VRAM requirement locked out a lot of capable desktop hardware. If your GPU kept you on the sidelines, try it now and let us know how it works for you on GitHub.
What else is in 1.0.3:
Already using Desktop? The update downloads automatically.
New here? Download
submitted by /u/ltx_model