-
You can't usefully dequantize it back to fp8 or fp16, and it would be pointless anyway: the best you could recover is an approximation of the original Flux dev model, which you could simply run instead. In the settings, the FP16 LoRA option in the top bar is probably what you want; it doesn't improve compatibility, but it makes the LoRA itself run at higher precision. It's the precision of the base model that causes the issues, so training specifically for the quantized model won't change a thing; less precision is less precision.

You don't really need to run NF4 at all. Forge's memory management can swap the full-sized model in and out in blocks: set the memory limit well below your maximum GPU memory and it will handle the swapping for you. Generation takes a bit longer, but you keep full LoRA compatibility.

That said, I don't find NF4 + LoRA unworkable either. You just need to prompt well and use appropriate samplers and CFG. Flux in general is extremely sensitive to sampler choice and step count. The default sampler is really only good if you want it to follow text easily; 20 steps is not a maximum, it's the bare minimum. Too many people run Flux at only 20 steps and get subpar results when 40-60 steps would be noticeably better. You aren't going to get everything you want out of low precision and low step counts, and that should be no surprise.
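To make the FP16 LoRA point concrete, here is a minimal, hypothetical sketch (not Forge's actual code path) of applying a LoRA delta in fp16 on top of an NF4 base weight. The shapes, `rank`, `scale`, `lora_A`, and `lora_B` are made up for illustration; it assumes `bitsandbytes` is installed and a CUDA device is available.

```python
import torch
import bitsandbytes.functional as bnbF

# Stand-in for one fp16 weight matrix from the Flux dev UNet (hypothetical shape).
w_fp16 = torch.randn(3072, 3072, dtype=torch.float16, device="cuda")

# What the bnb-nf4 checkpoint effectively stores: packed 4-bit NF4 weights plus quant state.
w_nf4, quant_state = bnbF.quantize_4bit(w_fp16, quant_type="nf4")

# A made-up rank-16 LoRA for the same layer, kept in fp16.
rank, scale = 16, 1.0
lora_A = torch.randn(rank, 3072, dtype=torch.float16, device="cuda") * 0.01
lora_B = torch.randn(3072, rank, dtype=torch.float16, device="cuda") * 0.01

# "FP16 LoRA": dequantize the base weight to fp16, then add the LoRA delta in fp16.
# The delta itself is full precision; the remaining error lives in the NF4 base.
w_base = bnbF.dequantize_4bit(w_nf4, quant_state).to(torch.float16)
w_effective = w_base + scale * (lora_B @ lora_A)

# The NF4 base only approximates the original fp16 weights, which is why the
# model's precision, not the LoRA's, is the bottleneck.
print("base error vs. fp16:", (w_base - w_fp16).abs().mean().item())
```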
-
The bnb-nf4 UNet has changed during quantization: pairs of dev16 and nf4 images generated with the same prompt and seed look different. Some are almost identical, but others are radically different.
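If you want to reproduce that comparison, a rough sketch with diffusers' `FluxPipeline` follows. It quantizes FLUX.1-dev to NF4 on the fly via diffusers' `BitsAndBytesConfig` rather than loading the exact flux1-dev-bnb-nf4-v2 file, so it is only an approximation of the Forge setup; the prompt, seeds, step count, and CFG are arbitrary, and access to the gated FLUX.1-dev repo is assumed.

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

MODEL = "black-forest-labs/FLUX.1-dev"

def render(pipe, prompt, seed, steps=40):
    # Same prompt, seed, steps, and CFG: any visible difference between the two
    # runs comes from the weights themselves.
    generator = torch.Generator("cpu").manual_seed(seed)
    return pipe(prompt, num_inference_steps=steps, guidance_scale=3.5,
                generator=generator).images[0]

# Reference pipeline in bf16.
pipe_bf16 = FluxPipeline.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
pipe_bf16.enable_model_cpu_offload()

# Same pipeline, but with the transformer loaded in 4-bit NF4.
nf4_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.bfloat16)
transformer_nf4 = FluxTransformer2DModel.from_pretrained(
    MODEL, subfolder="transformer", quantization_config=nf4_config,
    torch_dtype=torch.bfloat16)
pipe_nf4 = FluxPipeline.from_pretrained(
    MODEL, transformer=transformer_nf4, torch_dtype=torch.bfloat16)
pipe_nf4.enable_model_cpu_offload()

prompt = "a red fox standing in fresh snow, golden hour"
for seed in (0, 1, 2):
    render(pipe_bf16, prompt, seed).save(f"dev16_seed{seed}.png")
    render(pipe_nf4, prompt, seed).save(f"nf4_seed{seed}.png")
```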
-
Can flux1-dev-bnb-nf4-v2 be dequantized back to dev16 or dev8? There are two reasons for asking:
In theory, training on a "dequantized" bnb-nf4 version should improve compatibility.
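For what it's worth, bitsandbytes can mechanically dequantize NF4 tensors back to fp16, but the round trip is lossy, so the result is only an approximation of dev16 rather than a recovery of it. A minimal sketch of that round trip on a single stand-in tensor (random data, hypothetical shape; assumes `bitsandbytes` and a CUDA device):

```python
import torch
import bitsandbytes.functional as bnbF

# Stand-in for one dev16 weight tensor (random data, hypothetical shape).
w_dev16 = torch.randn(3072, 3072, dtype=torch.float16, device="cuda")

# NF4 quantization of the kind flux1-dev-bnb-nf4-v2 uses, then dequantization.
w_nf4, state = bnbF.quantize_4bit(w_dev16, quant_type="nf4")
w_dequant = bnbF.dequantize_4bit(w_nf4, state).to(torch.float16)

# The round trip is lossy: w_dequant approximates w_dev16 but does not recover it.
# A "dequantized" checkpoint is therefore just a noisier dev16, so training against
# it shouldn't buy any extra compatibility over training against dev16 itself.
print("mean abs error:", (w_dequant - w_dev16).abs().mean().item())
print("max  abs error:", (w_dequant - w_dev16).abs().max().item())
```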