Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support more Upscalers #14794

Draft
wants to merge 7 commits into
base: dev
Choose a base branch
from
Draft

Conversation

Cyberbeing
Copy link
Contributor

@Cyberbeing Cyberbeing commented Jan 29, 2024

Description

DRAFT PULL REQUEST (for testing and discussion only)
Now that webui supports spandrel, we should consider if we want to support any other model types with community models available on https://openmodeldb.info/. This draft pull request contains the commits which I've testing locally for the past week, and thought others may be interested in testing these other model formats with webui as well.

Overall my favorite models thus far are Nomos8kHAT-L_otf for realistic content (imho superior to FaceUpDAT/FaceUpDatSharper) and 4xHFA2kLUDVAESRFormer_light (sharp texture detail, strong blur removal) / 4xHFA2kLUDVAEGRL_small (higher saturation, smoother texture detail, less artifacts) or Real_HAT_GAN_Sharper for anime. The problem with the HAT-L models (originally added in c756133) is despite their quality, they are slow and use more VRAM than any other model type. Keep this in mind when setting your Tile Sizes.

  • Added support for COMPACT, GRL, OmniSR, SPAN, and SRFormer upscale models.
    Add the upscaler models into folders with these names within \stable-diffusion-webui\models
  • Refactored ESRGAN, HAT, and DAT so they are all unified using the same template as the new models.
    Template format is based vaguely off merging the best parts of hat_model.py with esrgan_model.py and eliminating url loading which I see as unneeded.
  • Fixed prefer_half and added a placeholder for bfloat16
    bfloat16 is non-functional as-is, since passing a bfloat16 model to spandrel results in float/bfloat16 mismatch errors
    if desired, it can be made to function by adding bfloat16 autocasts/casts within spandrel at the point of failure
    * Switched upscaler_utils.py to use Torchvision transforms instead of Numpy directly
    What prompted this was when I was comparing chaiNNer output to webui, the webui output had strange noise on output for unknown reasons. Upon switching to using Torchvision transforms in webui this noise went away. Likely requires more testing.
  • Increase the Maximum Tile Size to 4096x4096 similar to chaiNNer.
    Adapt the Tile/Padding Size Step in the UI for each model architecture's recommendations to maximize quality.
  • Allow upscaler models to be loaded in Safetensors format
    I have not yet tested this
    * Switched default latent scale mode to Latent (nearest-exact)
    I remember hearing a legacy argument that bugged "nearest" mode need to be continued to used since older models were trained off the incorrect behavior. Does this really matter? I also changed the default latent scale mode for I2I since there is currently no way to configure it there, and nearest-exact is my personal preference.
  • Added an Option to Unload the Stable Diffusion model from VRAM to RAM during Upscaling
    When not using --medvram or --lowvram, the Stable Diffusion model can take up a large amount of VRAM especially with SDXL and --no-half, which can frequently result in OOM with a constant tile size setting during Upscale. By enabling this option in settings, the SD model will be unloaded prior to upscaling to free up VRAM, and reloaded after upscaling completes. This allows the user to set the highest upscaling tile size possible with their VRAM, and not need to change it when using a SD model requiring more VRAM for itself.
  • Fixed an autocast NaN during upscaling I've noticed rarely on my system during Hires fix as mentioned in:
    [Bug]: DAT upscaler is not working in hires fix. #14710 (comment)
    Split off and merged in Fix potential autocast NaNs in image upscale #14809
  • Fixed a regression in 1.7.0 causing 1x scale models to be non-functional as mentioned in:
    [Bug]: Scale by 1x in 1.7.0 does nothing on "Extras tab" and "SD Upscale" script #14738 (comment)

Checklist:

@w-e-w
Copy link
Collaborator

w-e-w commented Jan 29, 2024

@akx

@akx akx self-requested a review January 29, 2024 18:15
@FurkanGozukara
Copy link

wow some nice improvements thank you

@gel-crabs
Copy link
Contributor

Great work on this! This fixes the black squares/all-black images with non-ESRGAN upscalers. There are a couple issues to iron out though, independent of model type:

  1. The tiled upscale (the part between first and second pass) is much slower, down from 14-16 it/s depending on scale to around 5 it/s.
  2. With non-standard upscale resolutions, say 1344x768 to 1920x1080, the image gets rather messed up; one half of the image is also significantly noisier than the other.

@gel-crabs
Copy link
Contributor

Another note about the second issue: it is caused by the VAE, as upscaling works as expected when VAE encode is set to TAESD.

@akx
Copy link
Collaborator

akx commented Jan 30, 2024

Cool – I have a WIP branch that'd refactor the entire upscaling infrastructure to something a little saner.

That won't block merging this, but just a heads-up that there's a bit of duplicate work in here and in there.

Copy link
Collaborator

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the other comment, I've been working on reworking the upscaler architecture to something saner, but that doesn't necessarily block this – I'll just take care to take all of this into account.

Beyond the inline comments, I think all of the new Spandrel upscalers would benefit from a common superclass or mixin, they're mostly copy-pasted now anyway?

modules/upscaler_utils.py Outdated Show resolved Hide resolved
Comment on lines 14 to 35
def pil_image_to_torch_bgr(img: Image.Image) -> torch.Tensor:
img = np.array(img.convert("RGB"))
img = img[:, :, ::-1] # flip RGB to BGR
img = np.transpose(img, (2, 0, 1)) # HWC to CHW
img = np.ascontiguousarray(img) / 255 # Rescale to [0, 1]
return torch.from_numpy(img)


def torch_bgr_to_pil_image(tensor: torch.Tensor) -> Image.Image:
if tensor.ndim == 4:
# If we're given a tensor with a batch dimension, squeeze it out
# (but only if it's a batch of size 1).
if tensor.shape[0] != 1:
raise ValueError(f"{tensor.shape} does not describe a BCHW tensor")
tensor = tensor.squeeze(0)
assert tensor.ndim == 3, f"{tensor.shape} does not describe a CHW tensor"
# TODO: is `tensor.float().cpu()...numpy()` the most efficient idiom?
arr = tensor.float().cpu().clamp_(0, 1).numpy() # clamp
arr = 255.0 * np.moveaxis(arr, 0, 2) # CHW to HWC, rescale
arr = arr.round().astype(np.uint8)
arr = arr[:, :, ::-1] # flip BGR to RGB
return Image.fromarray(arr, "RGB")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like it if these function names were retained, even if they did just call the Torchvision transforms. The assertions and comments are there for a reason too :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the time being I've just removed the Torchvision transforms. I noticed when I was testing a DAT model, there was a large color shift. I'm unclear if this is just because I had added FP32 conversions and clamps than your original code or something else entirely. I should probably check against ChaNNier later to see which was the closest match.

It would appear compared to your code, the Torchvision functions I used did the following internally, which I then clamped every time after converting to float32 Tensors before converting to PIL Image:

ToPILImage
...
pic = pic.permute((1, 2, 0))
pic = pic.numpy(force=True)
npimg = pic
if np.issubdtype(npimg.dtype, np.floating) and mode != "F":
    npimg = (npimg * 255).astype(np.uint8)
if mode is None and npimg.dtype == np.uint8:
    mode = "RGB"
return Image.fromarray(npimg, mode=mode)
...
PILToTensor
...
img = torch.as_tensor(np.array(pic, copy=True))
img = img.view(pic.size[1], pic.size[0], F_pil.get_image_num_channels(pic))
img = img.permute((2, 0, 1))
return img
...
ToDtype
...
if float_input:
    # float to float
    if float_output:
        return image.to(dtype)
    # float to int    
    eps = 1e-3
    max_value = float(_max_value(dtype))
    return image.mul(max_value + 1.0 - eps).to(dtype)
else:
    # int to float
    if float_output:
        return image.to(dtype).mul_(1.0 / _max_value(image.dtype))
    # int to int
    num_value_bits_input = _num_value_bits(image.dtype)
    num_value_bits_output = _num_value_bits(dtype)

    if num_value_bits_input > num_value_bits_output:
        return image.bitwise_right_shift(num_value_bits_input - num_value_bits_output).to(dtype)
    else:
        return image.to(dtype).bitwise_left_shift_(num_value_bits_output - num_value_bits_input)
...

modules/upscaler_utils.py Outdated Show resolved Hide resolved
Comment on lines 60 to 72
if img.width >= dest_w and img.height >= dest_h:
if img.width > dest_w and img.height > dest_h:
break
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this (and the other if at the end of the loop) are necessary? 🤔 As I see it, this implementation will do an extra upscale, just to downscale back down with lanczos outside the loop?

Copy link
Contributor Author

@Cyberbeing Cyberbeing Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line at bottom was part of 30b1bcc (1x scale models work with this) . While the new one 4a66638 just moved the line to the top which breaks the loop before upscale has a chance to running under the same conditions of the prior commit. This had the side effect of breaking 1x scale models, which would previously be run at which point the line at bottom would just return the output immediately?

I can only assume the lanczos downscale is intentional? Maybe it is what allows users to get custom output sizes out of 2x/3x/4x/8x upscaling models?

AUTOMATIC1111 would likely need to chime in here if you believe the entire loop is wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as far as I can tell the Lanczos (down)scale is intentional, exactly to get fractional scales out of models that are generally just 2x/3x/4x.

I think at least a code comment is needed to explain the logic to support 1x models – and in the case of an 1x model, it might be better to do the Lanczos upscaling first, then refine the scaled output with an 1x model, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get what you are saying about Lanczos, but do people actually willing scale the image with a 1x models selected rather then just use them in Extras, considering how they are implemented currently?

This really comes back to if you are going to be refactoring the scaling code, consider adding something to parse and record the model scale somewhere like cache.json after the model is loaded for the first time. Or maybe another option would be to dump them in a separate sub-folder where all models there are implicitly considered as 1x post-processing models and can be handled differently.

Overall I think trying to further adapt this section to 1x models without actually knowing a 1x model is being used is impractical. I'll add a comment there though, explaining what is happening.

@Panchovix
Copy link

Pretty nice PR! Finally being able to use DAT and HAT upscalers without a black image with hires fix.

I can confirm at least what @gel-crabs said, using different resolutions to upscale (like 1.3x) seems to have some noise or blurryness. (using base 896x1088, for example) Setting to 1.25x, 1.5x or 2x seems to work fine (with either ESRGAN, HAT, SwinIR or DAT upscalers I tried)

@akx
Copy link
Collaborator

akx commented Jan 30, 2024

Oh, and by the way @Cyberbeing, if you can, it would be cool if the fixes for black output etc. could be separated into a separate, easier-to-review-and-merge PR, and the new upscalers could be in here :)

@Cyberbeing
Copy link
Contributor Author

@gel-crabs I believe the first issue is as expected, since previously Float32 models which spandrel has marked as the arch not supporting Float16, were being autocast to Float16 anyway only during HiresFix (but not Extras) and occasionally causing NaNs because of it. Disabling autocast for upscaling now causes Float32 only models to upscale entirely in Float32, so a speed difference in HiresFix should be expected, which should now match speed in Extras.

Model arch which Spandrel allows for Float16 such as ESRGAN shouldn't see any significant speed difference before and after.

Spandrel Arch supporting Float32/BFloat16 only: CodeFormer, DAT, GFPGAN, GRL, HAT, SRFormer, SwinIR

Spandrel Arch supporting Float32/Float16/BFloat16: Compact, ESRGAN/RealESRGAN, OmniSR, SPAN.

As for that second issue, I've tried reproducing it but I'm not sure I see it exactly. My local setup is non-standard though, since I am running Torch 2.3.0 nightly and keep most dependencies up-to-date rather than using the pinned versions. Backing thorough the commits, I do see that the nearest-exact and torchvision commits each changed output somewhat, so I've force-pushed the branch without those two commits. @gel-crabs @Panchovix can you still reproduce the issue?
Branch without torchvision & nearest-exact: https://github.com/Cyberbeing/stable-diffusion-webui/tree/upscalers
Original Branch with torchvision & nearest-exact: https://github.com/Cyberbeing/stable-diffusion-webui/tree/upscalers_orig

@gel-crabs
Copy link
Contributor

gel-crabs commented Jan 30, 2024

@gel-crabs I believe the first issue is as expected, since previously Float32 models which spandrel has marked as the arch not supporting Float16, were being autocast to Float16 anyway only during HiresFix (but not Extras) and occasionally causing NaNs because of it. Disabling autocast for upscaling now causes Float32 only models to upscale entirely in Float32, so a speed difference in HiresFix should be expected, which should now match speed in Extras.

Model arch which Spandrel allows for Float16 such as ESRGAN shouldn't see any significant speed difference before and after.

Spandrel Arch supporting Float32/BFloat16 only: CodeFormer, DAT, GFPGAN, GRL, HAT, SRFormer, SwinIR

Spandrel Arch supporting Float32/Float16/BFloat16: Compact, ESRGAN/RealESRGAN, OmniSR, SPAN.

As for that second issue, I've tried reproducing it but I'm not sure I see it exactly. My local setup is non-standard though, since I am running Torch 2.3.0 nightly and keep most dependencies up-to-date rather than using the pinned versions. Backing thorough the commits, I do see that the nearest-exact and torchvision commits each changed output somewhat, so I've force-pushed the branch without those two commits. @gel-crabs @Panchovix can you still reproduce the issue? Branch without torchvision & nearest-exact: https://github.com/Cyberbeing/stable-diffusion-webui/tree/upscalers Original Branch with torchvision & nearest-exact: https://github.com/Cyberbeing/stable-diffusion-webui/tree/upscalers_orig

The new branch without torchvision/nearest-exact works with full VAE!

For whatever reason though, the tiled upscale is still only about half as fast with ESRGAN upscalers despite being in float16.

Another thing to consider for the future is support for ONNX upscalers, as all the HAT models are also distributed with an ONNX version in float16. That probably lies with with Spandrel though.

@Panchovix
Copy link

@Cyberbeing Just tested with odd resolutions and now works fine! It doesn't seem to give me artifacts or blurryness. I'm using torch nightly as well, but from the start of the month, so maybe it doesn't have some features you were using
imagen

@Cyberbeing
Copy link
Contributor Author

python: 3.10.13  •  torch: 2.3.0.dev20240128+cu118  •  xformers: 0.0.21+b885bd5.d20240128
That xformers is my personal pre-FlashV2/Cutlass3 + bugfixes local branch build but I wouldn't think that would matter.

I'm unsure, since for the most part I just looked like output was different in some spots, If anything, I suspect it must have been changing the following line to nearest-exact, since it is related to supported multiple-of-8 dimensions but I'm unsure why it would result in broken output. @Panchovix Can you double-check if changing only that line back to nearest-exact re-triggers your issue?

a = torch.nn.functional.interpolate(a, b.shape[-2:], mode="nearest")

@Panchovix
Copy link

python: 3.10.13  •  torch: 2.3.0.dev20240128+cu118  •  xformers: 0.0.21+b885bd5.d20240128 That xformers is my personal pre-FlashV2/Cutlass3 + bugfixes local branch build but I wouldn't think that would matter.

I'm unsure, since for the most part I just looked like output was different in some spots, If anything, I suspect it must have been changing the following line to nearest-exact, since it is related to supported multiple-of-8 dimensions but I'm unsure why it would result in broken output. @Panchovix Can you double-check if changing only that line back to nearest-exact re-triggers your issue?

a = torch.nn.functional.interpolate(a, b.shape[-2:], mode="nearest")

Testing it changing to nearest-exact and effectively, it triggers the issue again. My torchvision version is torchvision-0.18.0.dev20240102+cu121 in any case.

Using with just "nearest" works perfectly.

@Cyberbeing
Copy link
Contributor Author

@gel-crabs I pushed a new commit now with the autocast disabled in a more limited fashion, as close to the source of the NaN as possible. Does that improve performance at all for you? I unfortunately can't reproduce the performance impact you are seeing with my RTX A4000, when I test the numbers before and after with float16 models are 100% identical. Maybe my system is too slow so the overhead is hidden.

@gel-crabs
Copy link
Contributor

@gel-crabs I pushed a new commit now with the autocast disabled in a more limited fashion, as close to the source of the NaN as possible. Does that improve performance at all for you? I unfortunately can't reproduce the performance impact you are seeing with my RTX A4000, when I test the numbers before and after with float16 models are 100% identical. Maybe my system is too slow so the overhead is hidden.

Yes, ESRGAN's performance is back to normal now

@Cyberbeing
Copy link
Contributor Author

@akx Now that the performance impact is fixed, I've split off the NaNs fix into a separate pull request.

@Cyberbeing Cyberbeing force-pushed the upscalers branch 2 times, most recently from 26fe815 to 1e8d756 Compare February 1, 2024 11:24
@Cyberbeing
Copy link
Contributor Author

Cyberbeing commented Feb 1, 2024

I've now adjusted the default tile size and allowed step size in the UI based on the model architecture.

If you've applied this pull request before, you'll need to delete ui-config.json for the new UI behavior to apply.

What prompted this, was upon reading the HAT test examples, they contained a comment that Tile Size and Tile Padding must be a multiple of the Window Size, which is 16 in their case. This makes me wonder if input image size is also expected to be a multiple of 16 as well when not tiling, or otherwise quality degrades?

Because of that, the Step Size for Tile Size & Tile Padding has now been changed to either the Window Size, Split Size, or Block Size of each architecture, following the recommendation of HAT. Those last two I'm unsure if they matter or not, but thought I'd play it safe. ESRGAN I was unsure about, since I couldn't find that info, so I set Tile Padding to 8 Step, and reverted Tile Size Step back to the original value of 16. I feel like both should be set to either 8 or 16, but unsure which.

General Tile Default for most Architectures: Changed to 256px Tile, 32px Padding
This is arbitrary, chosen only since it was what HAT used.

SRFormer Tile Default: Changed to 176px Tile, 32px Padding
This is because it has a strange window size of 22 for SRFormer, yet 16 for SRFormerLight in their test config

COMPACT: Changed to Tiling Disabled
Since COMPACT models are small and fast and likely don't need tiles.

This is up for discussion if this make sense or not for the models other than HAT which explicitly states to do this.

@Cyberbeing
Copy link
Contributor Author

For some reason, from modules import sd_models inside upscaler.py causes a startup error:

Traceback (most recent call last):
  File "F:\AI\stable-diffusion-webui\launch.py", line 48, in <module>
    main()
  File "F:\AI\stable-diffusion-webui\launch.py", line 44, in main
    start()
  File "F:\AI\stable-diffusion-webui\modules\launch_utils.py", line 465, in start
    import webui
  File "F:\AI\stable-diffusion-webui\webui.py", line 13, in <module>
    initialize.imports()
  File "F:\AI\stable-diffusion-webui\modules\initialize.py", line 36, in imports
    shared_init.initialize()
  File "F:\AI\stable-diffusion-webui\modules\shared_init.py", line 17, in initialize
    from modules import options, shared_options
  File "F:\AI\stable-diffusion-webui\modules\shared_options.py", line 4, in <module>
    from modules import localization, ui_components, shared_items, shared, interrogate, shared_gradio_themes, util
  File "F:\AI\stable-diffusion-webui\modules\interrogate.py", line 13, in <module>
    from modules import devices, paths, shared, lowvram, modelloader, errors, torch_utils
  File "F:\AI\stable-diffusion-webui\modules\modelloader.py", line 12, in <module>
    from modules.upscaler import Upscaler, UpscalerLanczos, UpscalerNearest, UpscalerNone
  File "F:\AI\stable-diffusion-webui\modules\upscaler.py", line 9, in <module>
    from modules import modelloader, sd_models, shared
  File "F:\AI\stable-diffusion-webui\modules\sd_models.py", line 16, in <module>
    from modules import paths, shared, modelloader, devices, script_callbacks, sd_vae, sd_disable_initialization, errors, hashes, sd_models_config, sd
_unet, sd_models_xl, cache, extra_networks, processing, lowvram, sd_hijack, patches
  File "F:\AI\stable-diffusion-webui\modules\sd_vae.py", line 5, in <module>
    from modules import paths, shared, devices, script_callbacks, sd_models, extra_networks, lowvram, sd_hijack, hashes
  File "F:\AI\stable-diffusion-webui\modules\sd_hijack.py", line 5, in <module>
    from modules import devices, sd_hijack_optimizations, shared, script_callbacks, errors, sd_unet, patches
  File "F:\AI\stable-diffusion-webui\modules\sd_hijack_optimizations.py", line 13, in <module>
    from modules.hypernetworks import hypernetwork
  File "F:\AI\stable-diffusion-webui\modules\hypernetworks\hypernetwork.py", line 13, in <module>
    from modules import devices, sd_models, shared, sd_samplers, hashes, sd_hijack_checkpoint, errors
  File "F:\AI\stable-diffusion-webui\modules\sd_samplers.py", line 60, in <module>
    set_samplers()
  File "F:\AI\stable-diffusion-webui\modules\sd_samplers.py", line 45, in set_samplers
    samplers_hidden = set(shared.opts.hide_samplers)
AttributeError: 'NoneType' object has no attribute 'hide_samplers'

@akx @w-e-w Do either of you have a clue as to the cause?

I didn't have this error on my local branch where I had swapped changed interrogate.py to use the clip_interrogate package, so it must somehow be related to the imports there.

As a temporary measure to allow the new option to be tested in the meantime, I've pushed my local interrogate.py change which isn't really intended as part of this pull request, until the cause of this startup error is figured out.

@akx
Copy link
Collaborator

akx commented Feb 2, 2024

The webui is very finicky about import order, unfortunately. That means shared.opts hasn't been initialized yet - you'll probably need to turn that into a later import (or otherwise untangle the import graph).

@Cyberbeing
Copy link
Contributor Author

@akx Do you have a suggestion on the best location to move the sd_models import? Should I just place from modules import sd_models within the upscale if statement I added or is there a better location? I don't really want to start messing with the imports in other files to resolve this.

        if shared.opts.unload_sd_during_upscale:
            sd_models.unload_model_weights()
            logger.info("Stable Diffusion Model weights are being unloaded from VRAM to RAM prior to upscale")

Still a mystery to me as how importing sd_models at the top of upscaler.py was causing shared.opts to not be loaded for for sd_samplers.py, but I guess it doesn't matter.

@Cyberbeing
Copy link
Contributor Author

Cyberbeing commented Feb 2, 2024

After failing to find a simple find a way to untangle the import mess between what seemed to be sd_models, upscaler, interrogate, shared_options, and modelloader, I ultimately decided to just refrain from importing sd_models at all, and offload/reload the weights manually:

            shared.sd_model.to(devices.cpu)
            devices.torch_gc()
            shared.sd_model.to(shared.device)

Match allowed Tile/Padding size to architecture Window Size (if it exists)
add bfloat16 placeholder
Revert upscaler.py change from 4a66638

Reimplement 4a66638 to only prevent running upscaler
when source image is larger then destination.
@gel-crabs
Copy link
Contributor

gel-crabs commented Feb 6, 2024

A few more things:

I manually re-added the Torchvision replacement and it works fine either way.

Perhaps there should be an option in the UI, or maybe automatic detection for the nearest-exact changes, because when the hires fix is at a proper multiplier it definitely seems to give higher quality images.

SPAN models are about as large as COMPACT models, and run just fine without tiling as well.

@akx
Copy link
Collaborator

akx commented Feb 7, 2024

I ultimately decided to just refrain from importing sd_models at all

You could import it locally just before you call a function from it?

Copy link
Collaborator

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, this looks fine by me – I haven't had the time to work on my new-upscalers stuff for sd-webui in a while so I'd need to rebase & refactor it anyway :)

@Kiteretsu77
Copy link

I wonder if you are interested in deploying our new Anime model from CVPR2024 (https://github.com/Kiteretsu77/APISR).
This one provides 4xGRL (different from regular GRL, we add another conv before it) and 2x RRDB 6B weights (I prefer RRDB because they save much more VRAM than GRL). Thanks!

@dkspwndj
Copy link

I has downloaded 4xHFA2kLUDVAESRFormer_light upscaler. And want to use it T2I hires process.
But if I create a "SRFormer" folder and put it there, it doesn't appear in the menu, and I have to put it in "DAT" or "HAL" for it to be recognized, and a warning saying that it's not the correct folder will pop up and it will be upscaled. Is it okay to just use it?

@akx
Copy link
Collaborator

akx commented Jul 26, 2024

a warning saying that it's not the correct folder will pop up and it will be upscaled. Is it okay to just use it?

That's fine!

It's just a "well, that's strange!" error in the logs – one side effect of us using Spandrel is that you can actually use any Spandrel-supported model, they just might pick up wrong configurations (e.g. a model discovered in the ESRGAN folder would use ESRGAN tile size) but should work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants