Categories: Image

WAN2.5-Preview: They are collecting feedback to fine-tune this PREVIEW. The full release will have open training + inference code. The weights MAY be released, but not decided yet. WAN2.5 demands SIGNIFICANTLY more VRAM due to being 1080p and 10 seconds. Final system requirements unknown! (@50:57)

This post summarizes a very important livestream with a WAN engineer. It will at least be partially open (model architecture, training code and inference code). Maybe even fully open weights if the community treats them with respect and gratitude, which is also what one of their engineers basically spelled out on Twitter a few days ago, where he asked us to voice our interest in an open model but in a calm and respectful way, because any hostility makes it less likely that the company releases it openly.

The cost to train this kind of model is millions of dollars. Everyone be on your best behaviors. We’re all excited and hoping for the best! I’m already grateful that we’ve been blessed with WAN 2.2 which is already amazing.

PS: The new 1080p/10 seconds mode will probably be far outside consumer hardware reach, but the improvements in the architecture at 480/720p are exciting enough already. It creates such beautiful videos and really good audio tracks. It would be a dream to see a public release, even if we have to quantize it heavily to fit all that data into our consumer GPUs. 😅

submitted by /u/pilkyton
[link] [comments]

AI Generated Robotic Content

Share
Published by
AI Generated Robotic Content
Tags: ai images

Recent Posts

Brad Pitt casts Elliot for Achilles – an Ai acting performance experiment

I am putting most of my efforts to achieve more realistic Ai acting with natural…

2 hours ago

New light-based switch could cut chip energy use and speed future AI photonics

Photonic devices are hardware systems that can process information using light instead of electricity. These…

3 hours ago

Microsoft Lens First Tests: It’s Pretty Decent! – ComfyUI Native Support About to Be Merged

Model weights: https://huggingface.co/Comfy-Org/Lens PR: https://github.com/Comfy-Org/ComfyUI/pull/14077 You'll need to git the merge pull request if you're…

1 day ago

Tencent released Z-Image 6B with pixel space gen. No VAE & 1k Resolution.

Link: https://nju-pcalab.github.io/projects/L2P/ submitted by /u/switch2stock [link] [comments]

2 days ago

Building Context-Aware Search in Python with LLM Embeddings + Metadata

Keyword search breaks the moment a user types something a document doesn't literally say.

2 days ago

The Blueprint: How Movix fills a gap in dental skills with specialized agentic AI

Welcome to The Blueprint, a regular feature where we highlight how Google Cloud customers are…

2 days ago