PUSA fails go hard
submitted by /u/JackKerawock [link] [comments]
https://huggingface.co/bytedance-research/UMO https://arxiv.org/pdf/2509.06818 Three days ago, ByteDance released UMO, their image editing/creation model. From their Hugging Face description: Recent advancements in image customization exhibit a wide range of application prospects due to stronger customization capabilities. However, since we humans are more sensitive to faces, a significant challenge remains in preserving consistent identity while avoiding identity confusion …
https://arxiv.org/abs/2509.07295 “We introduce Reconstruction Alignment (RecA), a resource-efficient post-training method that leverages visual understanding encoder embeddings as dense “text prompts,” providing rich supervision without captions. Concretely, RecA conditions a UMM on its own visual understanding embeddings and optimizes it to reconstruct the input image with a self-supervised reconstruction loss, thereby realigning understanding and generation.” https://huggingface.co/sanaka87/BAGEL-RecA …
Read more “RecA: A new finetuning method that doesn’t use image captions.”
submitted by /u/-Ellary- [link] [comments]
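The RecA abstract quoted above describes the core loop compactly: condition the unified multimodal model (UMM) on its own visual-understanding embeddings instead of a caption, and train it to reconstruct the input image. A minimal sketch of that idea follows, under assumptions: `umm`, `encode_understanding`, `generate_latents`, and `encode_image_latents` are hypothetical names standing in for the model's understanding encoder, generation branch, and latent encoder; the real BAGEL-RecA code will differ.

```python
import torch.nn.functional as F

def reca_step(umm, images, optimizer):
    """One RecA-style post-training step (illustrative only, hypothetical API)."""
    # Dense "text prompt": the model's own visual-understanding embeddings,
    # used as conditioning instead of a caption (no captions required).
    cond = umm.encode_understanding(images)

    # The generation branch tries to reconstruct the very image it was shown.
    recon = umm.generate_latents(condition=cond)

    # Self-supervised reconstruction loss realigns understanding and generation.
    target = umm.encode_image_latents(images)  # e.g. VAE latents of the input
    loss = F.mse_loss(recon, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```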
I created this animation as part of my tests to find the balance between image quality and motion in low-step generation. By combining LightX LoRAs, I think I’ve found the right combination to achieve motion that isn’t slow, which is a common problem with LightX LoRAs. But I still need to work on the image …
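The post above was made in ComfyUI, but for readers who script their generations, here is a rough sketch of the same idea in diffusers: stack two speed-up LoRAs at reduced strengths rather than one at full strength, trading a little image quality for motion that isn't slowed down. The repo ID, LoRA paths, adapter weights, and step count below are placeholders, not the OP's settings, and assume a diffusers build with Wan and PEFT/LoRA support.

```python
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

# Placeholder model/LoRA identifiers; swap in the checkpoints you actually use.
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers", torch_dtype=torch.bfloat16
).to("cuda")

pipe.load_lora_weights("path/to/lightx2v_lora_a.safetensors", adapter_name="lightx_a")
pipe.load_lora_weights("path/to/lightx2v_lora_b.safetensors", adapter_name="lightx_b")

# Blend both adapters at reduced strength; lowering each below 1.0 is one way
# to keep motion lively while still generating in very few steps.
pipe.set_adapters(["lightx_a", "lightx_b"], adapter_weights=[0.6, 0.4])

frames = pipe(prompt="a dancer spinning in a sunlit studio", num_inference_steps=4).frames[0]
export_to_video(frames, "low_step_test.mp4", fps=16)
```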
So many posts with actual new model releases and technical progress, why can’t we go back to the good old times where people just posted random waifus? /s It just uses the standard Wan 2.2 I2V workflow with a wildcard prompt like the following, repeated 4 or 5 times: {hand pops|moving her body and shaking her …
Read more “This sub has had a distinct lack of dancing 1girls lately”
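For anyone unfamiliar with the {a|b|c} wildcard syntax used in the post above, here is a small illustration of how such groups are typically resolved: each group collapses to one randomly chosen option, so repeating the same group 4 or 5 times in the prompt yields a different motion phrase each time. This resolver is a generic sketch, not the behavior of any specific ComfyUI wildcard node, and the extra options are placeholders rather than the OP's full prompt.

```python
import random
import re

def resolve_wildcards(prompt, seed=None):
    """Replace each {a|b|c} group with one randomly picked option."""
    rng = random.Random(seed)
    pattern = re.compile(r"\{([^{}]*)\}")
    # Substitute one group at a time until no groups remain,
    # so repeated groups each get an independent random choice.
    while pattern.search(prompt):
        prompt = pattern.sub(lambda m: rng.choice(m.group(1).split("|")), prompt, count=1)
    return prompt

# Example with placeholder options:
template = "{hand pops|spins in place|sways side to side}, " * 4
print(resolve_wildcards(template, seed=42))
```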
Hey guys, I just tested out the new HunyuanImage 2.1 model on HF and… wow. It’s completely uncensored. It even seems to actually understand male/female anatomy, which is kinda wild compared to most other models out there. Do you think this could end up being a serious competitor to Chroma? From what I’ve seen, there …
Patreon Blog Post CivitAI Download Hey all, as promised here is that Outfit Try On Qwen Image Edit LoRA I posted about the other day. Thank you for all your feedback and help; I truly believe this version is much better for it. The goal for this version was to match the art styles best …
Read more “Clothes Try On (Clothing Transfer) – Qwen Edit Lora”
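For people who prefer scripting to ComfyUI, a rough sketch of running an edit LoRA like the one above through diffusers' Qwen-Image-Edit pipeline is below. The LoRA path, prompt wording, and step count are placeholders; the actual trigger phrase, recommended strength, and whether the garment needs to be stitched into the input image all come from the LoRA's own page.

```python
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")
# Placeholder filename for the clothing-transfer LoRA.
pipe.load_lora_weights("path/to/outfit_try_on_qwen_edit.safetensors")

person = load_image("person.png")  # the image whose outfit should be replaced
result = pipe(
    image=person,
    prompt="replace the outfit with the garment shown in the reference",
    num_inference_steps=30,
).images[0]
result.save("try_on.png")
```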
I know Wan can be used with pose estimators for TextV2V, but I’m unsure about reference images to videos. The only model I know of that can use a reference image to drive a video is UniAnimate. A workflow or resources for doing this in Wan VACE would be super helpful! submitted by /u/Fresh_Sun_1017 [link] [comments]
Made with Kijai’s InfiniteTalk workflow and Higgs Audio for the voice. https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo_I2V_InfiniteTalk_example_02.json https://huggingface.co/bosonai/higgs-audio-v2-generation-3B-base submitted by /u/Race88 [link] [comments]