Categories: Image

An experiment in “realism” with Wan2.2, using safe-for-work images

Got bored of seeing the usual women pics every time I opened this sub, so I decided to make something a little friendlier for the workplace. I was loosely working to a theme of “Scandinavian Fishing Town” and wanted to see how far I could get in making the images feel “realistic”. Yes, I am aware there’s all sorts of jank going on, especially in the backgrounds. So when I say “realistic” I don’t mean “flawless”, just that when your eyes first fall on the image it feels pretty real. Some are better than others.

Key points:

  • Used fp8 for the high-noise model and fp16 for the low-noise model on a 4090, which just about filled VRAM and RAM to the max. Wanted to go purely fp16 but memory was having none of it.
  • Had to separate out the SeedVR2 part of the workflow because Comfy wasn’t releasing the RAM, so it would just OOM on me on every run (64 GB RAM). I’m having to manually clear the RAM after generating the image and before running SeedVR2. Yes, I tried every “Clear Ram” node I could find and none of them worked. Comfy just hoards the RAM until it crashes.
  • I found that using res_2m/bong_tangent in the high-noise stage produced horribly contrasty images, which is why I went with Euler for the high-noise part.
  • The high-noise stage uses a lower step count. I didn’t really see much benefit from increasing the steps there.
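
For anyone wondering why pure fp16 would not fit, here is a rough back-of-the-envelope sketch of the weight footprint alone. It assumes roughly 14 billion parameters for each of the two Wan2.2 models (high noise and low noise); that figure is an assumption and is not stated in the post. The function names are made up for illustration, and the estimate ignores activations, the text encoder, the VAE, and anything else Comfy keeps resident.

```python
# Rough weight-only memory estimate for a two-stage (high-noise / low-noise) setup.
# Ignores activations, text encoder, VAE, and framework overhead.

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

# Assumption (not from the post): about 14 billion parameters per model.
PARAMS_PER_MODEL = 14e9


def weights_gib(params: float, precision: str) -> float:
    """Size of the weights in GiB for a given parameter count and precision."""
    return params * BYTES_PER_PARAM[precision] / 2**30


def total_gib(high_precision: str, low_precision: str,
              params: float = PARAMS_PER_MODEL) -> float:
    """Combined weight size when both models are held in memory at once."""
    return weights_gib(params, high_precision) + weights_gib(params, low_precision)


if __name__ == "__main__":
    for high, low in [("fp8", "fp16"), ("fp16", "fp16")]:
        print(f"high={high}, low={low}: {total_gib(high, low):.1f} GiB of weights")
```

Under that assumption the fp8/fp16 mix comes to roughly 39 GiB of weights and pure fp16 to roughly 52 GiB. Neither fits in the 4090's 24 GB of VRAM, so the remainder has to sit in system RAM, which would leave very little of the 64 GB free in the pure fp16 case.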

If you see any problems in this setup or have suggestions for how I could improve it, please fire away, especially for the low-noise stage. I feel like I’m missing something important there.

I’ve included an image of the workflow. The images themselves should have the workflow embedded, but I think uploading them here will strip it?

submitted by /u/kemb0

Published by
AI Generated Robotic Content
Tags: ai images