Using depth maps and weight noising to get better character LoRAs

A few weeks ago I introduced a new method for training style LoRAs which has been quite successful. A bunch of folks asked if this would also help with character training. The short answer is yes, but it needed a separate technique on top of the depth stuff. I’ve got something dialed in well enough to share, though it’s still experimental and I want feedback to help find the optimal settings.

The new mechanism is weight noising: a small Gaussian perturbation injected directly into the LoRA weights at each training step. A simple way to think of it is that it helps the model “forget” mistakes during training and keep only what is consistent across the data. More technically, it biases training toward flatter loss minima and spreads learning across more singular directions of the LoRA factorization (I measured about 20% higher stable rank than the same config without noising). The practical effect is that it resists the memorization that usually overcooks character runs, and likeness comes out substantially better at the same step count.
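To make the two ideas in that paragraph concrete, here is a minimal stdlib-only sketch, not the trainer's actual code. The function names are made up for illustration. It assumes the noise is plain i.i.d. Gaussian added in place to a weight matrix; the post does not say which LoRA/LoKr factors get noised or where in the step it happens. Stable rank here is the usual squared Frobenius norm divided by squared spectral norm.

```python
import math
import random


def add_weight_noise(weights, sigma, rng):
    """Return a copy of a 2-D weight matrix with N(0, sigma^2) noise on every entry."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


def stable_rank(matrix, iters=200, seed=0):
    """Stable rank ||M||_F^2 / ||M||_2^2, with the spectral norm from power iteration."""
    rows, cols = len(matrix), len(matrix[0])
    frob_sq = sum(x * x for row in matrix for x in row)
    if frob_sq == 0.0:
        return 0.0
    # Random start so the vector is (almost surely) not orthogonal
    # to the top singular direction.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    for _ in range(iters):
        mv = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]  # M v
        mtmv = [sum(matrix[i][j] * mv[i] for i in range(rows)) for j in range(cols)]  # M^T M v
        norm = math.sqrt(sum(x * x for x in mtmv))
        v = [x / norm for x in mtmv]
    # With v a unit vector along the top right-singular direction,
    # ||M v||^2 is the squared spectral norm.
    mv = [sum(row[j] * v[j] for j in range(cols)) for row in matrix]
    spec_sq = sum(x * x for x in mv)
    return frob_sq / spec_sq


# Toy usage: perturb an 8x16 factor with the sigma quoted in the post.
rng = random.Random(0)
toy_factor = [[0.0] * 16 for _ in range(8)]
noisy_factor = add_weight_noise(toy_factor, sigma=0.00125, rng=rng)

# Sanity check on the metric: diag(3, 4) has stable rank (9 + 16) / 16.
diag_stable_rank = stable_rank([[3.0, 0.0], [0.0, 4.0]])
```

In a real trainer the perturbation would be applied to the trainable tensors once per step, and stable rank would be computed on the merged LoRA update. Both of those choices are assumptions here.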

The post image shows an example training on actress Clare Bowen, who has highly recognizable features but is not known to Flux. Both runs use the same 8-image training set, the same training step count (750), and the same model. The standard run is in the middle; the new method is on the right.

The settings are identical for both runs, except that the new-method run adds weight noise and depth anchoring and uses a different number of repeats for each bucket size:

Batch 4, LR 5e-5
Image size buckets of 512, 768, 1024
LoKr factor 8
AdamW8bit, 1200 steps total (but best checkpoint at 750)

Using a different number of images per bucket is actually a good training trick on its own, and I updated my trainer to make this easier by letting you specify how many repeats of each image to use per bucket.
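As a rough illustration of what per-bucket repeats do to epoch size, here is a small sketch. The repeat counts and the function name are hypothetical; the post does not give the actual values or the trainer's config keys. Only the 8 images, the three bucket sizes, and batch size 4 come from the settings above.

```python
def expand_with_repeats(images, repeats_per_bucket):
    """Build one epoch as (image, bucket) pairs, repeating each image per bucket."""
    return [
        (image, bucket)
        for bucket, repeats in sorted(repeats_per_bucket.items())
        for image in images
        for _ in range(repeats)
    ]


images = [f"img_{i:02d}.jpg" for i in range(8)]  # 8 images, as in the post
repeats = {512: 3, 768: 2, 1024: 1}              # hypothetical repeat counts
epoch = expand_with_repeats(images, repeats)

batch_size = 4                                   # from the settings above
steps_per_epoch = len(epoch) // batch_size
```

With these made-up repeats, one epoch holds 8 x (3 + 2 + 1) = 48 samples, or 12 steps at batch size 4. Changing the repeats shifts how much of each epoch is spent at each resolution without touching the dataset itself.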

Things I’m still working out and would love feedback on:

Optimal sigma across dataset sizes: 0.00125 has given the best results so far, and I’m pretty sure the right value scales with dataset size and batch size, but I haven’t fully mapped it.
Whether weight noising compounds well with other character LoRA tricks people are using.

I’ve also added Docker support so you can more easily run this on Runpod.

Repo: https://github.com/BuffaloBuffaloBuffaloBuffalo/ai-toolkit-perceptual

Finally, the new-job page now has a “Quickstart Template” dropdown at the top that loads the best character config end-to-end. It defaults to the HuggingFace Flux 2 Klein 9B checkpoint but you can also use your own checkpoint. Still plenty of UI cleanup to do on my end, so pardon the mess!

Happy to answer questions and help troubleshoot here or in DMs.

EDIT: One important thing to know about captioning. You will likely get the best results if you use the built-in subject masking feature, which masks out the background. If you use this, it is important that your captions ONLY describe the character, NOT the setting. You may also use just a trigger phrase with subject masking, but your results will be less promptable. I have added quickstart configs for both masked and unmasked.

submitted by /u/QuantumBogoSort