My setup: RTX 3060 12GB VRAM + 48GB system RAM.

I spent the last couple of days messing around with LTX-2 inside ComfyUI and had an absolute blast. I created short sample scenes for a loose spy story set in a neon-soaked, rainy Dhaka (cyberpunk/Bangla vibes with rainy streets, umbrellas, dramatic reflections, and a mysterious female lead).

Workflow: https://drive.google.com/file/d/1VYrKf7jq52BIi43mZpsP8QCypr9oHtCO/view

Each 8-second scene took about 12 minutes to generate (with synced audio). I queued up 70+ scenes total, often trying 3-4 prompt variations per scene to get the mood right. Some scenes were pure text-to-video; others were image-to-video, starting from Midjourney stills I generated for consistency.

Here's a compilation of some of my favorite clips (rainy window reflections, coffee steam morphing into faces, walking through crowded neon markets, intense close-ups in the downpour). I cleaned up the audio; it had some squeaky sounds.

Strengths that blew me away:
Weaknesses / Things to avoid:
Overall verdict: I literally couldn't believe how two full days disappeared – I was having way too much fun iterating prompts and watching the queue. LTX-2 feels like a huge step forward for local audio-video gen, especially if you lean into atmospheric/illustrative styles rather than high-action.

submitted by /u/tanzim31
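For anyone wondering how those two days add up, here is a back-of-envelope estimate based on the numbers in the post (~12 minutes per 8-second clip, 70+ scenes, 3-4 prompt variations per scene). The variable names are just for illustration; your own clip times on a 3060 may differ with resolution and step count.

```python
# Rough GPU-time estimate using the figures from the post (assumptions, not benchmarks):
# ~12 minutes per 8-second clip, 70 scenes, 3-4 prompt variations per scene.
MINUTES_PER_CLIP = 12
SCENES = 70
VARIATIONS_LOW, VARIATIONS_HIGH = 3, 4

low_hours = SCENES * VARIATIONS_LOW * MINUTES_PER_CLIP / 60
high_hours = SCENES * VARIATIONS_HIGH * MINUTES_PER_CLIP / 60
print(f"Estimated GPU time: {low_hours:.0f}-{high_hours:.0f} hours")  # 42-56 hours
```

At 42-56 hours of queue time, two full days of near-continuous generation is entirely plausible, which matches the "watching the queue" experience described above.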