fast stable diffusion wiki
Make sure you're using the latest notebook! https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast-DreamBooth.ipynb
First things first, use the latest notebook from the link above :) Once the notebook has loaded you will be greeted with the Colab screen.
At this point you can either stay on this always-updated Colab page or make a copy to your gdrive by going to "File" then "Save a copy in Drive".
Now that the notebook is ready for use, the first step is to mount your gdrive. To do so, click the play button on the first cell.
Once it connects to the runtime you will be asked to sign in with your Google account. After signing in you will be returned to the notebook, where you can start running the next cells. After mounting gdrive, run the Dependencies cell. Next is the model download: make sure you select the model you want to train with! There are many options to play with here, as you can also load your own models from gdrive or huggingface. If you're loading the 1.5 model for the first time you will have to accept the terms and conditions here: https://huggingface.co/runwayml/stable-diffusion-v1-5 You will also need a huggingface token. To get one, log in to huggingface, go to Settings/Access Tokens and create a new read token.
Time to create a session. The session name can be anything you want, as it is only used to load previously trained models to retrain or use them. I usually just name it the same as my instance token.
Now this part is very important for training: we need instance_images. These are the images of the subject/style you wish to train. As sd 1.5 was trained at 512x512, I highly recommend cropping your images down to this size to avoid any unexpected results. An easy way of doing this is to use a website like https://www.birme.net
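If you would rather crop locally than use a website, the arithmetic behind a centered square crop is simple. This is only a rough sketch: the function name center_crop_box and the constant TARGET_SIZE are made up for illustration and are not part of the notebook.

```python
TARGET_SIZE = 512  # training resolution mentioned above for sd 1.5


def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Hypothetical helper for illustration; it only computes coordinates
    and does not touch any image file.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    # A 1920x1080 landscape photo keeps its full height and loses the sides.
    print(center_crop_box(1920, 1080))  # (420, 0, 1500, 1080)
```

If Pillow happens to be installed, the returned box can be passed to an image's crop method and the result resized with resize((TARGET_SIZE, TARGET_SIZE)). Treat that as one possible approach, not as what the notebook does internally.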
Once your images are resized you will need to rename them all to the same unique identifier. This is how you will call upon your style/subject in stable diffusion after the training. An easy way to rename all your files on Windows is to select the ones you need to rename, hit F2 and type whatever you want the token to be. This should rename the files to token (1), token (2) and so on. I will be using shrk for shrek.
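If you are not on Windows, or have a lot of images, the same renaming can be scripted. The sketch below assumes the images sit in one folder and uses a made-up helper name, rename_to_token. The list of image extensions is also an assumption, so adjust it to whatever you actually have.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your files differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def rename_to_token(folder, token):
    """Rename every image in `folder` to 'token (1).ext', 'token (2).ext', ...

    Hypothetical helper mimicking the Windows F2 bulk rename.
    Non-image files are left alone. Returns the new paths in order.
    """
    folder = Path(folder)
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

    # First pass: move everything to temporary names so that a file
    # already called 'token (2).png' cannot clash with a new name.
    staged = []
    for index, path in enumerate(images, start=1):
        temp = path.with_name(f"__staging_{index}{path.suffix.lower()}")
        path.rename(temp)
        staged.append(temp)

    # Second pass: give each file its final 'token (n)' name.
    renamed = []
    for index, temp in enumerate(staged, start=1):
        final = temp.with_name(f"{token} ({index}){temp.suffix}")
        temp.rename(final)
        renamed.append(final)
    return renamed
```

For example, rename_to_token("my_images", "shrk") would turn the images in a folder called my_images (a placeholder name) into shrk (1), shrk (2) and so on. Run it on a copy of your images first, since the rename is done in place.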
Now we should be ready to upload our instance images. To do so, just hit play on the instance images cell; at the bottom you should see options to upload or cancel the upload.
Once that has completed, your next option is to upload concept images. These images are used as heavy regularisation and should be similar to what you are training on. They help the model create images of your trained subject in many different poses or backgrounds that aren't in the instance images, essentially helping to avoid overfitting your model.
LET THE TRAINING BEGIN
This is where the magic happens. First off, I recommend just reading through what Ben has written for the explanation of each section.
Starting at the Unet
I usually set this low to start with, maybe 1000-1500 steps. It's quite hard to overtrain the unet, so starting low is good; you can always come back and train it more to reach your desired result.
The text_encoder is really finicky and can be frustrating to mess around with. If you overtrain it on your first run you'll have to retrain the model altogether. Ben's default of 350 is a really good starting point (for a style, maybe try starting at 150 and test the model to see if it needs a little more training).
The concept_text_encoder is a recent addition to the notebook. Again, read Ben's brief explanation about it and just test from there. If you have any findings to share, start a discussion thread :)
Once all your settings are in, hit the play button and wait for the training to complete.
Once completed, the model will be converted into a ckpt that can be found in your gdrive under fast-dreambooth/Sessions/shrk, but we don't need to go there.
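If you do want to check that the ckpt landed where expected, a few lines of Python can list it. This sketch assumes the session folder layout described above. The helper name find_checkpoints is invented for illustration, and you have to supply the path to the Sessions folder yourself, because the Drive mount prefix depends on how the notebook mounted gdrive.

```python
from pathlib import Path


def find_checkpoints(sessions_root, session_name):
    """List the .ckpt files inside one session folder.

    Hypothetical helper: `sessions_root` is wherever your
    fast-dreambooth/Sessions folder lives once gdrive is mounted.
    Returns an empty list if the session folder does not exist.
    """
    session_dir = Path(sessions_root) / session_name
    if not session_dir.is_dir():
        return []
    return sorted(session_dir.glob("*.ckpt"))
```

Calling find_checkpoints(sessions_root, "shrk") should then return the path of the newly converted model, or an empty list if training has not finished yet.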
In the next cell, "Test the trained model", you have the option to load a previous session by naming it, or to load a custom ckpt from a gdrive directory. If you just finished a training session it will load the newly trained model by default, so there is no need to mess with those settings.
With the use_gradio_server option ticked you will see two URLs; I usually use the public share link through this server. When this setting is unticked it will use local tunnel to connect, which only gives one URL.
At this point you're pretty much there: you should have your trained model loaded into the automatic1111 webui. Now it's time to get testing. Try just asking it to generate your token at first.
This is a render just by typing shrk (note that my training images are trash and I only used 5 of them).
If the results look decent, try adding to the prompt and see if your style or person/object is still coming through. For example: ghost of the shrk by Anna Dittmann, digital art, horror, trending on artstation, anime arts, featured on Pixiv, HD, 8K, highly detailed
Congratulations, you've successfully trained a model 👏 From here there are many things you can do. If you feel like the details of your subject aren't coming through as much as you'd want, resume training of the unet for 1000 steps and test again. If you are using your token but don't see shrk anywhere, resume training of the text encoder for another 100 steps and test again.
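The two rules of thumb above can be written down as a tiny decision helper. This is only a sketch of the advice in this guide. The function name and the order of the checks (token first, details second) are my own assumptions, and the step counts are just the ones quoted above.

```python
def next_training_step(token_recognised, details_good):
    """Suggest what to resume training on, following the advice above.

    Hypothetical helper. Returns a (component, extra_steps) tuple,
    or None when the model already looks fine.
    """
    if not token_recognised:
        # The token does not produce the subject at all:
        # give the text encoder another 100 steps.
        return ("text_encoder", 100)
    if not details_good:
        # The subject shows up but lacks detail:
        # give the unet another 1000 steps.
        return ("unet", 1000)
    return None


if __name__ == "__main__":
    print(next_training_step(token_recognised=True, details_good=False))
    # ('unet', 1000)
```

Either way, test again after each resume before training any further.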
None of this is set in stone. This tech is quite new and everyone is still trying to find the best settings, so just keep testing and creating models, and have fun with it!
Module errors:
From what I've seen, this is usually an issue when you have an old sd folder left over from running an older colab. The best way to fix it is to delete the sd folder in colab and rerun auto1111.
^C errors:
This is an out-of-memory error. If you're uploading a big model (7gb or more), just tick the large_model option. If you get this error elsewhere, try changing to a high-RAM runtime.
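To know in advance whether a model counts as big, you can check its file size. The sketch below uses the 7gb figure from above. Reading it as 7 * 1024**3 bytes is my assumption, and the helper name should_tick_large_model is invented for illustration.

```python
from pathlib import Path

# "7gb or more" from the text above, interpreted here as GiB (an assumption).
LARGE_MODEL_BYTES = 7 * 1024**3


def should_tick_large_model(model_path, threshold=LARGE_MODEL_BYTES):
    """Return True if the model file is at least `threshold` bytes.

    Hypothetical helper; it only reads the file size on disk.
    """
    return Path(model_path).stat().st_size >= threshold
```

If it returns True for your ckpt, tick large_model before loading it.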
Connection errors:
It seems to come down to whichever server is working best at the time. If the gradio server isn't working as desired, untick the box and use the tunnel server link, and vice versa.