| | Installed VibeVoice using the wrapper this dude created. https://www.reddit.com/r/comfyui/comments/1n20407/wip2_comfyui_wrapper_for_microsofts_new_vibevoice/ Workflow is the multi-voice example one can find in the module’s folder. Asked GPT for a harmless talk among those 3 people, used 3 1-minute audio samples, mono, 44KHz .wav Picked the 7B model. My 3060 almost died, took 54 minutes, but she didn’t croak an OOM error, brave girl resisted, and the results are amazing. This is the first one, no edits, no retries. I’m impressed. submitted by /u/nazihater3000 |
None of the video gen models do a real CRT terminal animation look. Weights +…
Zero-shot text classification is a way to label text without first training a classifier on…
GRASP is a new gradient-based planner for learned dynamics (a “world model”) that makes long-horizon…
Recent work has shown that probing model internals can reveal a wealth of information not…
As the demand for generative AI continues to grow, developers and enterprises seek more flexible,…
An autonomous robot from the company Honor ran a half marathon in 50:26, beating the…