NAVA – A 6.3B audio-video model

Page: https://ernie-research.github.io/NAVA/
Model: https://huggingface.co/ernie-research/NAVA
GitHub: https://github.com/ernie-research/NAVA
NAVA is a 6.3B-parameter joint audio-video generator that synthesizes synchronized video and audio from a single prompt, including multi-speaker speech with reference-timbre control and image-conditioned continuations. Instead of post-hoc-aligned dual towers or fully unified tri-modal stacks, NAVA uses an Align-then-Fuse MMDiT: a dedicated alignment space first establishes …

Using depth maps and weight noising to get better character LoRAs

A few weeks ago I introduced a new method for training style LoRAs which has been quite successful. A bunch of folks asked if this would also help with character training. The short answer is yes, but it needed a separate technique on top of the depth stuff. I’ve got something dialed in well enough …

Anima-Base is magic and I don’t think people realize how good it is.

I made a post about ZIT earlier this month, but I think it’s time ANIMA gets a post as well. Every image was made by me with ONLY anima-base-1, NO LoRAs. Below I’ve shared the CivitAI posts so you can find the prompts and, in some cases, the ComfyUI workflows as well. This model is insane …

Testing ZIT and Flux-1 with “NVIDIA PiD — Pixel Diffusion Decoder”

Just tested NVIDIA-PiD with 512px generated images and with 1024px generated images downscaled to 512px, because I think this makes the comparison more balanced, since 512px generations will always have less detail. (PiD was trained with 512px inputs.) I used https://github.com/tsolful/ComfyUI-PiD to test it. There is another one I just came across: https://github.com/Merserk/ComfyUI-PiD …

Microsoft Lens First Tests: It’s Pretty Decent! – ComfyUI Native Support About to Be Merged

Model weights: https://huggingface.co/Comfy-Org/Lens
PR: https://github.com/Comfy-Org/ComfyUI/pull/14077
If you’re in a hurry, you’ll need to fetch and check out the pull request yourself:
    git fetch origin pull/14077/head:pr-14077
    git checkout pr-14077
Supported resolutions (width × height), base resolution = 1024:
Aspect ratio    Resolution (width × height)
1:2             736 × 1472
9:16            768 × 1376
2:3             832 × 1248
3:4             864 × …
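For anyone scripting around the resolution list, here is a rough sketch of snapping an arbitrary size to the nearest listed bucket by aspect ratio. It only includes the three rows fully visible in the excerpt above (the rest of the list is cut off), and `closest_resolution` is a made-up helper name for illustration, not something from ComfyUI or the Lens release.

```python
# Only the rows fully visible in the excerpt; the original list is truncated,
# so extend this with the remaining resolutions from the model card.
SUPPORTED = [(736, 1472), (768, 1376), (832, 1248)]


def closest_resolution(width, height, supported=SUPPORTED):
    """Return the (width, height) pair whose aspect ratio is nearest to width/height.

    Hypothetical helper: compares plain width/height ratios and picks the
    smallest absolute difference.
    """
    target = width / height
    return min(supported, key=lambda wh: abs(wh[0] / wh[1] - target))


if __name__ == "__main__":
    # A 1080x1920 portrait frame is closest to the 9:16 bucket.
    print(closest_resolution(1080, 1920))
```

Comparing raw ratios is the simplest choice here; with the full list (including landscape sizes) the same function should work unchanged as long as every pair is given as width first, height second.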

Extreme realism with Klein 9B Distilled: 2 LoRAs together

After generating many prompts and combining many LoRAs, I tried everything you can imagine until I discovered that two LoRAs together bring an extraordinary level of realism to Klein 9B Distilled. I was already using the “Smartphone Snapshot Photo Reality” LoRA, which was the most realistic one I had used, and on its own it already …