Categories: Image

VibeVoice is crazy good (first try, no cherry-picking)

Installed VibeVoice using the wrapper this dude created.

https://www.reddit.com/r/comfyui/comments/1n20407/wip2_comfyui_wrapper_for_microsofts_new_vibevoice/

Workflow is the multi-voice example one can find in the module’s folder.

Asked GPT for a harmless conversation among 3 speakers, then used three 1-minute audio samples, one per voice (mono, 44 kHz .wav).
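For anyone prepping their own clips, here is a minimal sketch that checks whether a sample is mono and reads its sample rate and length, using only Python's standard `wave` module. The function names are made up for illustration and are not part of the wrapper. I'm assuming "44 kHz" means 44100 Hz, and nothing above says the wrapper actually requires this exact format.

```python
import sys
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return channels, rate, duration


def matches_post_format(path, expected_rate=44100, max_seconds=60.0):
    """Check for mono audio at the assumed 44100 Hz, no longer than about a minute."""
    channels, rate, duration = describe_wav(path)
    return channels == 1 and rate == expected_rate and duration <= max_seconds


if __name__ == "__main__":
    # Usage: python check_wav.py voice1.wav voice2.wav voice3.wav
    for sample_path in sys.argv[1:]:
        channels, rate, duration = describe_wav(sample_path)
        print(f"{sample_path}: {channels} channel(s), {rate} Hz, {duration:.1f} s")
```

Note that `wave` only reads uncompressed PCM files, so a clip saved in another encoding would need converting first.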

Picked the 7B model.

My 3060 almost died. It took 54 minutes, but she never threw an OOM error. Brave girl held out, and the results are amazing. This is the first output, no edits, no retries.

I’m impressed.

submitted by /u/nazihater3000

AI Generated Robotic Content


Published by

AI Generated Robotic Content

Tags: ai images

11 months ago
