Lora Tags and Comfy UI, and breaking random tags #130

Closed
LargoUsagi opened this issue Oct 19, 2023 · 9 comments

Comments

@LargoUsagi

I did find something with the random tags and the current tag processing. And with a bit of rest and setting up an IDE I have more details to start from this time :)

If you place a lora tag inside of a random tag, it gets stripped out:
(screenshot)

<random: young, adult, adult, old, mature, ancient> <random: male, female> <random: dwarf  <lora:RPGDwarfXL:0.5>, half-orc>  epic lighting, highly detailed, bloom effect, epic pose

Or in this case, breaks the tag processing entirely
(screenshot)

<random: young, adult, adult, old, mature, ancient> <random: male, female> <random: dwarf <lora:RPGDwarfXL:0.5>, half-orc> <random: fighter, wizard, sorcerer, alchemist, villager, ranger, hunter, druid, warrior, witch, artificer, rouge, paladin, barbarian, monk, cleric, warlock, bard> <random: smiling, grinning, frowning, staring, happy, stressed, surprise, contempt, > standing in <random: medieval market, castle, town square, alley, meadow, forest, jungle, tundra, battlefield, plains, mountains, desert, bluffs, cliffs, marsh, swamp, bog, dungeon, bar, tavern, inn> during the <random: morning, day, evening, night> holding <random: weapon, nothing, bag, nothing, dagger, drink>, epic lighting, highly detailed, bloom effect, epic pose

Stepping through the code, when it calls out to this method
https://github.com/FreneticLLC/FreneticUtilities/blob/master/FreneticUtilities/FreneticToolkit/StringConversionHelper.cs#L372
from here
https://github.com/Stability-AI/StableSwarmUI/blob/master/src/Text2Image/T2IParamInput.cs#L241C39-L241C39
it strips the lora tag entirely out of the random tag and replaces it with an empty string.
(screenshot)

The lora then shows up in the metadata as a model that was loaded, but the lora text is not passed to the ComfyUI prompt, so it doesn't actually trigger the use of the lora.

From the Generate page:
(screenshot)

From Comfy:
(screenshot)

@mcmonkey4eva
Contributor

The lora tag is intentionally stripped from the prompt (Comfy doesn't know how to parse that tag; Swarm parses it by removing it from the prompt and adding it to the actual lora list, which is passed to Comfy as a LoraLoader node).

If I'm tracking you correctly here, it looks like the issue is that the lora is pulled out from the prompt before the random is processed, ie you want it to only sometimes use the lora but instead it's always using it?
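
For anyone following along, here is a rough standalone sketch of that stripping (hypothetical helper names only; the real logic lives in T2IParamInput.cs and the StringConversionHelper linked above): each <lora:name:weight> tag is pulled out of the prompt text and collected so the backend can emit a LoraLoader node for it, while the rest of the prompt is left for the text encoder.

    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class LoraTagSketch
    {
        // Matches <lora:NAME> or <lora:NAME:WEIGHT>; weight defaults to 1 when omitted.
        private static readonly Regex LoraTag = new(
            @"<lora:(?<name>[^:>]+)(?::(?<weight>[\d.]+))?>", RegexOptions.IgnoreCase);

        // Returns the prompt with the lora tags removed, plus the (name, weight)
        // pairs that the backend would turn into LoraLoader nodes.
        public static (string CleanPrompt, List<(string Name, double Weight)> Loras) Extract(string prompt)
        {
            List<(string Name, double Weight)> loras = new();
            string cleaned = LoraTag.Replace(prompt, match =>
            {
                double weight = match.Groups["weight"].Success
                    ? double.Parse(match.Groups["weight"].Value)
                    : 1.0;
                loras.Add((match.Groups["name"].Value, weight));
                return ""; // strip the tag from the text Comfy will see
            });
            return (cleaned, loras);
        }
    }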

@mcmonkey4eva
Contributor

With the above commit (and an upstream rework), <random: x, y <lora:z>> will now first randomly pick either x or y <lora:z>, and then, if and only if the second option is picked, it will parse <lora:z> and apply the lora, allowing for randomized optional selection of loras as desired.

Let me know if there's any cases that still need to be handled better.
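
A rough illustration of that ordering (again with hypothetical helper names, not the commit's actual code): the <random:...> choice is resolved first, and lora extraction only sees whichever option survived, so a lora nested in one option is applied only when that option is picked. This builds on the extraction sketch above.

    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public static class RandomThenLoraSketch
    {
        // Allows one level of nested <...> tags (e.g. a <lora:...> inside an option)
        // so the closing '>' of the inner tag doesn't end the <random:...> early.
        private static readonly Regex RandomTag = new(
            @"<random:\s*(?<options>(?:[^<>]|<[^<>]*>)*)>", RegexOptions.IgnoreCase);

        // Step 1: collapse each <random: a, b, c> to one randomly chosen option.
        public static string ResolveRandoms(string prompt, Random rng)
        {
            return RandomTag.Replace(prompt, match =>
            {
                string[] options = match.Groups["options"].Value.Split(',');
                return options[rng.Next(options.Length)].Trim();
            });
        }

        // Step 2: only after the random choices are made does lora extraction run,
        // so "<random: dwarf <lora:RPGDwarfXL:0.5>, half-orc>" either keeps the lora
        // (dwarf picked) or never sees it (half-orc picked).
        public static (string CleanPrompt, List<(string Name, double Weight)> Loras)
            Process(string prompt, Random rng)
        {
            string resolved = ResolveRandoms(prompt, rng);
            return LoraTagSketch.Extract(resolved); // from the extraction sketch above
        }
    }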

@LargoUsagi
Author

I will update and test this later tonight, but in my experience the ComfyUI prompt does in fact handle the lora tag anywhere in the prompt, at least with SDXL and the ClipTextEncodeSDXL node. I had been testing loading a large chain of loras at a weight of 0 and then using that tag to trigger them when certain concepts needed some additional assistance.

The part about it being passed as a LoraLoader node in Comfy I am not following; I don't see anything to add to the node network that would pick up those loras from the prompt. That would be a fine solution, but I am missing something.

@mcmonkey4eva
Contributor

mcmonkey4eva commented Oct 19, 2023

Comfy applies loras via the LoraLoader (aka Load LoRA) node between the model and ksampler -- it cannot load just from prompt text.

If you're seeing anything that indicates otherwise in testing, my first recommendation is to pick a different lora, one that is more likely to produce an extremely clear visual distinction (SDXL knows what a dwarf is), say eg something trained on a specific character that the base model can't replicate well.
Maybe even do a direct same-seed side-by-side within the exact same prompt text with vs without the LoraLoader node added to see what difference that node in particular makes.

Swarm's usage of the lora list input happens within the workflow generator (i.e. it is incompatible with custom workflows at the moment), wherein it will simply emit LoraLoader nodes for every lora you have selected.
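
As an illustration of what emitting those nodes amounts to, here is a sketch (the method, node ids, and starting model/clip references are made up; only the input names mirror ComfyUI's LoraLoader node): each selected lora becomes its own LoraLoader, chained so each takes the previous node's MODEL and CLIP outputs.

    using System.Collections.Generic;

    public static class LoraChainSketch
    {
        // Builds ComfyUI API-format LoraLoader nodes, chained model->model and clip->clip.
        // startModel/startClip are ["nodeId", outputIndex] references to the checkpoint loader.
        public static Dictionary<string, object> BuildChain(
            IList<string> loraNames, IList<double> weights,
            object[] startModel, object[] startClip, int firstNodeId = 100)
        {
            Dictionary<string, object> nodes = new();
            object[] model = startModel, clip = startClip;
            for (int i = 0; i < loraNames.Count; i++)
            {
                string id = (firstNodeId + i).ToString();
                nodes[id] = new
                {
                    class_type = "LoraLoader",
                    inputs = new
                    {
                        lora_name = loraNames[i],
                        strength_model = weights[i],
                        strength_clip = weights[i],
                        model, // previous node's MODEL output
                        clip   // previous node's CLIP output
                    }
                };
                model = new object[] { id, 0 }; // LoraLoader output slot 0 = MODEL
                clip = new object[] { id, 1 };  // LoraLoader output slot 1 = CLIP
            }
            // Whatever consumes the final model/clip (KSampler, CLIPTextEncode, etc.)
            // should now point at `model` and `clip` instead of the checkpoint loader.
            return nodes;
        }
    }

Serializing that dictionary along with the rest of the graph gives roughly the API-format prompt sent to Comfy, with the final model/clip pair wired into the sampler and text encoders.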

@LargoUsagi
Author

I had a misunderstanding of how that tag works inside of Comfy; it does add the lora with the given weight to the model even if it isn't in the graph. Removing the lora prompt from the input on both sides, I do get the same output with the same prompt and seed. If I add it back in via the LoadLora node or the prompt box in ComfyUI, the lora is applied; if I add it back in from the Generate tab, it doesn't matter what loras are referenced, it has no effect on the output of the workflow.

The issue is that when I add it back in via the prompt in Generate, I am not seeing that lora take any effect; I can add extreme ones that would change the entire style of the output, and nothing happens.

Here is the workflow if that is of any help.
workflow.json

@mcmonkey4eva
Contributor

mcmonkey4eva commented Oct 19, 2023

That's... a rather complex workflow, with at least 3 loras statically pre-applied it seems? Try a simpler one?

Here's a bare bones minimum side-by-side one (apologies for spaghetti with the nodes, slapped together quickly to show the concept):
(screenshot)

I used an "arcane jinx" character lora as the base model entirely lacks the arcane style and only half-understands the character, making it very obvious in the top one that the simple usage of <lora:characters/sdxl/arcane_jinx_sdxl.safetensors:1> does not actually cause the lora to load at all.
(The top two have identical prompts: the top has no lora loaded, the middle has the lora loaded; the bottom has the lora loaded but the <lora:> text removed for comparison.)
(Note that changing the path format, e.g. using \ or removing the folders entirely, doesn't change the result; you only get slightly different images, because the text encoder is reading the lora text as part of the raw text prompt.)


And, to be clear, loras in the prompt aren't working for you because you're using a custom workflow, i.e. the lora parameter in Swarm is effectively disabled and doing nothing, as that's not a supported case yet.

@LargoUsagi
Author

Yeah, I replicated this. Although the base model understands what a dwarf is, a lot of the time it will miss when asked for a female dwarf, and that lora solved that. I think that in the workflow the lora tag may have just been enough random tokens to trip some of the seeds I was on away from producing a male dwarf, or a dwarf that looked more like an elf, and it was really just a poor sample set.

It would be nice to be able to dynamically load the loras from the prompt, but after digging into the ComfyUIAPIAbstractBackend it looks like you are going to need to create a new custom node that can chain a bunch of loras from T2IParamTypes.Loras/T2IParamTypes.LoraWeights and inject them in a similar way to the prompt. And it would get more complicated if you wanted to inject loras at prompt time into multiple different nodes in the workflow.

I wrote some code to modify the prompt behavior and inject it to generate a bunch of outputs all at once 😄

    // Pull the selected loras and weights out of the user input, rebuild them as
    // <lora:name:weight> tags, and append them to the end of the prompt text.
    var loras = user_input.Get(T2IParamTypes.Loras);
    var loraWeight = user_input.Get(T2IParamTypes.LoraWeights);

    string loraString = "";
    if (loras.Count > 0)
    {
        StringBuilder rebuildLoras = new StringBuilder();
        for (int i = 0; i < loras.Count; i++)
        {
            rebuildLoras.Append($"<lora:{loras[i]}:{loraWeight[i]}>");
        }
        loraString = rebuildLoras.ToString();
    }

    var prompt = $"{user_input.Get(T2IParamTypes.Prompt)}{loraString}";

So that was a fun little experiment to drive home how things actually worked and to get to know the code base a little bit better.

The work here is genuinely great and I can't wait to see how this evolves as it moves forward.

@mcmonkey4eva
Contributor

mcmonkey4eva commented Oct 20, 2023

Actually! After thinking about it, I can solve this by just giving you a cheat node for custom workflows to use:

SwarmLoraLoader, hook it up right after your model loader like so:

(screenshot)

It's basically a self-contained Python multi-lora-loader designed with inputs matched to the format Swarm can send them in, so if you have it present and load your workflow in Swarm, it will automatically detect & use it.

(If you want multiple usages, you can also have primitives with SwarmUI: Loras and SwarmUI: Lora Weights and hook them up to the names&weights inputs of those nodes)

Thus, any Swarm prompt that adds loras to the standard params will automatically send the loras to this node, and it'll work as expected.

@LargoUsagi
Author

LargoUsagi commented Oct 20, 2023

Man, you're fast. I had started digging into how Comfy is coded and looking at writing a module to effectively do that. After digging through your code and seeing how the tags are parsed, I was going to extend the tags: I was trying to use a third parameter for the name of the target node, and use the code you already had to inject data there.

I.e. <lora:name:weight:nodename>, where nodename would be optional, so if you only have one point to inject the relevant information it wouldn't matter. I was just starting to write the logic as: find the first one and use it; otherwise, work through all of the nodes and use the one with the matching name. A sketch of that parsing follows below.
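
A quick sketch of how that optional third parameter might be parsed (purely hypothetical syntax and type names, not something Swarm currently supports):

    public record LoraTarget(string Name, double Weight, string? NodeName);

    public static class TargetedLoraSketch
    {
        // Parses the body of a hypothetical <lora:name:weight:nodename> tag,
        // e.g. "RPGDwarfXL:0.5" or "RPGDwarfXL:0.5:MyLoraLoader".
        public static LoraTarget Parse(string tagBody)
        {
            string[] parts = tagBody.Split(':');
            string name = parts[0];
            double weight = parts.Length > 1 ? double.Parse(parts[1]) : 1.0;
            string? nodeName = parts.Length > 2 ? parts[2] : null;
            return new LoraTarget(name, weight, nodeName);
        }
    }

When NodeName comes back null, the injection logic would fall back to the first suitable node it finds, as described above.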
