All in one WAN 2.2 model merges: 4-steps, 1 CFG, 1 model speeeeed (both T2V and I2V)

I put together some WAN 2.2 merges with the following goals:

  • WAN 2.2 features (including “high” and “low” models)
  • 1 model
  • Simplicity, by bundling the VAE and CLIP into the checkpoint
  • Accelerators to allow 4-step, 1 CFG sampling
  • WAN 2.1 LoRA compatibility

… and I think I got something working kinda nicely.

Basically, each merge uses the “high” and “low” WAN 2.2 models for the first and middle blocks, then WAN 2.1 for the output blocks. I layer in the Lightx2v and PUSA LoRAs for distillation/speed, which allows sampling at 1 CFG in 4 steps.
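
To make the block-wise idea concrete, here is a toy sketch in Python. It is not the actual merge recipe: plain floats stand in for weight tensors, and the key names, the split point (`output_start`), and the 50/50 high/low blend are all assumptions made up for illustration. The post does not say how the “high” and “low” weights are combined.

```python
# Toy block-wise merge. Plain floats stand in for weight tensors.
# Key names, split point, and blend ratio are hypothetical, not from the post.

def merge_blockwise(high, low, wan21, num_blocks, output_start, high_ratio=0.5):
    """Blend high/low for blocks before output_start; copy wan21 for the rest."""
    merged = {}
    for i in range(num_blocks):
        key = f"blocks.{i}.weight"
        if i < output_start:
            # First and middle blocks: weighted mix of the two WAN 2.2 experts.
            merged[key] = high_ratio * high[key] + (1.0 - high_ratio) * low[key]
        else:
            # Output blocks: taken unchanged from WAN 2.1.
            merged[key] = wan21[key]
    return merged


num_blocks = 4
high = {f"blocks.{i}.weight": 1.0 for i in range(num_blocks)}
low = {f"blocks.{i}.weight": 3.0 for i in range(num_blocks)}
wan21 = {f"blocks.{i}.weight": 10.0 for i in range(num_blocks)}

merged = merge_blockwise(high, low, wan21, num_blocks, output_start=3)
print(merged)
```

With real checkpoints the same loop would run over tensors loaded from the model files, and the accelerator LoRAs would be applied on top afterwards.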

I highly recommend the sa_solver sampler with the beta scheduler. You can load the model with the native “load checkpoint” node.
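
For reference, the settings above could be written down roughly like this, assuming ComfyUI and its API-style workflow format. This is a partial sketch, not a complete workflow: the checkpoint file name is a placeholder, the seed is arbitrary, and the prompt and latent inputs the sampler also needs are left out.

```python
import json

# Placeholder file name (an assumption); use whichever merge you downloaded.
CHECKPOINT_NAME = "wan2.2-aio-merge.safetensors"

workflow_fragment = {
    "1": {
        # The native "Load Checkpoint" node; it also provides CLIP and VAE.
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": CHECKPOINT_NAME},
    },
    "2": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["1", 0],        # MODEL output of node "1"
            "steps": 4,               # 4-step sampling
            "cfg": 1.0,               # 1 CFG
            "sampler_name": "sa_solver",
            "scheduler": "beta",
            "denoise": 1.0,
            "seed": 0,                # arbitrary
            # positive, negative, and latent_image links omitted in this sketch
        },
    },
}

print(json.dumps(workflow_fragment, indent=2))
```

If sa_solver is not in your sampler list, your install is probably too old to include it.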

If you’ve got the hardware, I’m sure you are better off running both big models, but for speed and simplicity… this is at least what I was looking for!

submitted by /u/phr00t_

Published by
AI Generated Robotic Content