My setup: RTX 3060 (12GB VRAM) + 48GB system RAM.

I spent the last couple of days messing around with LTX-2 inside ComfyUI and had an absolute blast. I created short sample scenes for a loose spy story set in a neon-soaked, rainy Dhaka (cyberpunk/Bangla vibes with rainy streets, umbrellas, dramatic reflections, and a mysterious female lead).

Workflow: https://drive.google.com/file/d/1VYrKf7jq52BIi43mZpsP8QCypr9oHtCO/view

Each 8-second scene took about 12 minutes to generate (with synced audio). I queued up 70+ scenes total, often trying 3-4 prompt variations per scene to get the mood right. Some scenes were pure text-to-video; others were image-to-video, starting from Midjourney stills I generated for consistency.

Here’s a compilation of some of my favorite clips (rainy window reflections, coffee steam morphing into faces, walking through crowded neon markets, intense close-ups in the downpour). I cleaned up the audio because it had some squeaky sounds.

Strengths that blew me away:

Weaknesses / Things to avoid:

Overall verdict: I literally couldn’t believe how two full days disappeared. I was having way too much fun iterating on prompts and watching the queue. LTX-2 feels like a huge step forward for local audio-video gen, especially if you lean into atmospheric/illustrative styles rather than high-action.

submitted by /u/tanzim31
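For anyone planning a similar batch, the timings quoted above (about 12 minutes per 8-second clip, 70+ clips queued) can be turned into a rough queue-time estimate. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it assumes every clip takes the same ~12 minutes, which real runs on a 12GB card probably won't match exactly.

```python
# Rough queue-time arithmetic based on the numbers quoted in the post.
# Assumption: every clip takes the same time to render (real runs will vary).

def estimate_queue_hours(num_clips: int, minutes_per_clip: float = 12.0) -> float:
    """Total wall-clock hours to render num_clips back to back."""
    return num_clips * minutes_per_clip / 60.0


def render_seconds_per_output_second(minutes_per_clip: float = 12.0,
                                     clip_seconds: float = 8.0) -> float:
    """How many seconds of rendering each second of finished video costs."""
    return minutes_per_clip * 60.0 / clip_seconds


if __name__ == "__main__":
    clips = 70  # lower bound; the post says "70+"
    print(f"{clips} clips: about {estimate_queue_hours(clips):.1f} hours of rendering")
    print(f"Cost: {render_seconds_per_output_second():.0f} s of rendering per second of video")
```

At 70 clips this comes to about 14 hours of pure render time, which fits with two days disappearing once prompt tweaking and re-queues are added on top.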