We just shipped a new LTX-2 drop focused on one thing: making video generation easier to iterate on without killing VRAM, consistency, or sync.
If you’ve been frustrated by LTX because prompt iteration was slow or outputs felt brittle, this update is aimed directly at that.
Here are the highlights; the full details are here.
Faster prompt iteration (Gemma text encoding nodes)
Why you should care: no more constant VRAM loading and unloading on consumer GPUs.
New ComfyUI nodes let you save and reuse text encodings, or run Gemma encoding through our free API when running LTX locally.
This makes Detailer and iterative flows much faster and less painful.
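To make the "save and reuse text encodings" idea concrete, here is a minimal sketch of prompt-keyed caching in plain Python. It is not the actual ComfyUI node code: `cached_encode`, `encode_fn`, and the on-disk layout are hypothetical names chosen for illustration, and `encode_fn` stands in for whatever runs the Gemma text encoder (locally or via the API).

```python
import hashlib
import pickle
from pathlib import Path


def cached_encode(prompt, encode_fn, cache_dir):
    """Return encode_fn(prompt), reusing a saved result when one exists.

    encode_fn is a hypothetical stand-in for the expensive text encoder.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the cache on the exact prompt text, so any edit triggers a re-encode.
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.pkl"

    if path.exists():
        # Cache hit: the encoder never has to be loaded.
        with path.open("rb") as f:
            return pickle.load(f)

    # Cache miss: encode once and save the result for later runs.
    embedding = encode_fn(prompt)
    with path.open("wb") as f:
        pickle.dump(embedding, f)
    return embedding
```

With a cache like this, re-running a workflow with an unchanged prompt skips the encoder entirely, which is where the VRAM load/unload churn comes from.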
Independent control over prompt accuracy, stability, and sync (Multimodal Guider)
Why you should care: you can now tune quality without breaking something else.
The new Multimodal Guider lets you control prompt accuracy, stability, and sync.
Each can be tuned independently, per modality. No more choosing between “follows the prompt” and “doesn’t fall apart.”
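One way to picture "tuned independently, per modality" is ordinary classifier-free guidance with a separate scale for each modality. The sketch below is only that: an illustration using the standard guidance formula, not the Multimodal Guider's actual implementation or parameters. The function names and the `"video"` / `"audio"` keys are assumptions.

```python
def guide(uncond, cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def guide_per_modality(uncond, cond, scales):
    """Apply guidance separately to each modality with its own scale.

    uncond and cond map a modality name (hypothetical: "video", "audio")
    to a list of predicted values; scales maps the same names to floats.
    """
    return {
        name: guide(uncond[name], cond[name], scales[name])
        for name in scales
    }


# Push video hard toward the prompt while leaving audio at scale 1.0,
# which simply returns the conditional prediction for audio.
uncond = {"video": [0.0, 1.0], "audio": [2.0]}
cond = {"video": [1.0, 3.0], "audio": [4.0]}
guided = guide_per_modality(uncond, cond, {"video": 3.0, "audio": 1.0})
```

Because each modality has its own scale, raising prompt adherence for video in this toy setup does not change the audio output at all.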
More practical fine-tuning + faster inference
Why you should care: better behavior on real hardware.
Trainer updates improve memory usage and make fine-tuning more predictable on constrained GPUs.
Inference is also faster for video-to-video: the reference video is downscaled before cross-attention, which reduces compute cost. (Speedups depend on resolution and clip length.)
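A rough way to see why downscaling the reference helps: cross-attention cost grows with the number of query tokens times the number of reference tokens, and halving the reference's height and width cuts its token count by about 4x. The numbers below are a back-of-the-envelope sketch only. The patch size, token layout, and query count are made-up assumptions, not LTX-2's real configuration.

```python
def token_count(frames, height, width, patch=32):
    """Tokens for a clip, assuming one token per patch x patch block per frame.

    The patch size is a hypothetical value chosen for easy arithmetic.
    """
    return frames * (height // patch) * (width // patch)


def cross_attention_cost(query_tokens, reference_tokens):
    """Relative cost of cross-attention: one score per (query, reference) pair."""
    return query_tokens * reference_tokens


query_tokens = token_count(frames=8, height=512, width=512)

full_ref = token_count(frames=8, height=512, width=512)
half_ref = token_count(frames=8, height=256, width=256)  # reference downscaled 2x

speedup = (
    cross_attention_cost(query_tokens, full_ref)
    / cross_attention_cost(query_tokens, half_ref)
)
```

In this toy setup the cross-attention step gets 4x cheaper. The end-to-end speedup is smaller, since cross-attention is only one part of inference.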
We’ve also shipped new ComfyUI nodes and a unified LoRA to support these changes.
This drop isn’t a one-off. The next LTX-2 version is already in progress, focused on:
More on what’s coming up here.
If you’re pushing LTX-2 in real workflows, your feedback directly shapes what we build next. Try the update, break it, and tell us what still feels off in our Discord.
submitted by /u/ltx_model