😵💫 Face Models Comparison and Suggestions #195
Comments
Which face descriptor did you use?
I tried a few... we could run an average maybe? Dlib, MTCNN and RetinaFace are decent and pretty fast. InsightFace seems to be biased since you trained with that.
The metric is 1 − cosine similarity?
Yes, FaceNet. Again, I've tried a few options but the results seem more or less the same. FaceID Plus v2 at weight=2 is always at the top. Interestingly, FaceIDPlus with a second pass of PlusFace or FullFace is also very effective. That makes me think there are more combinations we haven't explored. You seem very interested, I'm glad about that. Please feel free to share your experience/ideas if you want.
Yes, I am very interested, because a good metric is important for developing a good model. You are right; you can also try FaceID + FaceID Plus. thresholds = {
Is that the minimum threshold? You set it very high. Almost only FaceID alone performs that low, at least in my testing.
By the way, do you have any ideas or suggestions on improving the results? They might be helpful to me.
Yes, from the deepface repo in fact. I found the face ID embedding is very powerful; I think I should find better training tricks.
I also tried FaceID Plus v2 at weight=2.5; some checkpoints react well to it but in general it's not a big difference.
What do you think of this https://twitter.com/multimodalart/status/1742575121057841468 (multi image)?
I've seen people send multiple images trying to increase the likeness. I'm not convinced it actually works; there's a lot of bias in "face" recognition. I will run some tests; honestly I think it's laziness. I was able to reach 0.27 likeness with a good combination of IPAdapter models at low resolution. Combining 2 IPAdapter models seems more effective than sending multiple images to the same model. But I'll make some tests. PS: looking forward to the SDXL model!
@xiaohu2015 do you already have the code for SDXL? So I can update it and we are ready at launch 😄
It's the same as SD 1.5 FaceID: face embedding + LoRA. But I am not sure the SDXL version is really better than the SD 1.5 version, because evaluation metrics are often unreliable.
Okay, I ran more tests; any combination of Plus v2 with any other model is definitely a winner. These are all good:
The only other NOT-v2 combination that seems to be working well is FaceIDPlus+FaceID. I'll update the first post when I have more data. PS: I got a 0.26 today at low resolution! Looking forward to doing some high-resolution tests 😄
I will update the SDXL model now, you can also test it.
@cubiq update at https://huggingface.co/h94/IP-Adapter-FaceID#ip-adapter-faceid-sdxl but you should convert the lora part.
Great, thanks! I just updated the first post with new info. Data for round 2 is here: https://docs.google.com/spreadsheets/d/1Mi2Pu9T3Hqz3Liq9Fdgs953fOD1f0mieBWUI6AN-kok/edit?usp=sharing I'll check SDXL later 😄 and run dedicated tests on it too.
I just had a look at the key structure of the SDXL lora and it's a darn mess 😄 do you have a conversion mapping @xiaohu2015?
I think we can refer to this. You can find a normal SDXL lora weight, load it, and print its keys; then you can get the mapping. In a future version, the lora should not be needed.
The structure is pretty different and I couldn't find a relationship at first sight. But I'll check better later. I'm a bit busy this week; I might be able to work on it next Monday.
On SDXL it looks a little more complicated than that 😄
@laksjdjf can you help?
OK, I will also upload a lora weight next week.
It is really great work!
@xiaohu2015 that's very interesting. I tried with IPAdapter and it kinda works. Only SDXL though; any idea if it can be applied to SD1.5 too?
Hey @xiaohu2015, when will you release the updated IPAdapter Plus Face model you mentioned earlier?
It feels the same as model merging. I've experimented a lot with a modified version of IPAdapter that has the ability to change the weight for each layer. My finding is that layers around 1-6 affect composition, 6 to 8 have the most impact on the face or subject, 8-10 or so affect skin, and the remaining layers affect detail. Perhaps it's more of a gradient. The 7th layer has the highest impact. The same goes for the lora weights. SDXL contains a lot more layers, but it's kinda similar: if you reduce the range to be the same as SD1.5 (meaning 1 slider affects multiple layers) it's more or less the same.
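A minimal sketch of the per-layer idea described above. The function name, the number of layers, and the exact index ranges are my assumptions for illustration only, not the actual ComfyUI IPAdapter API:

```python
def build_layer_weights(composition=0.3, face=1.0, skin=0.6, detail=0.4,
                        n_layers=16):
    """Hypothetical per-layer weight list for an SD1.5 IPAdapter.

    Rough mapping from the observations above: early layers steer
    composition, the middle layers carry the face/subject (with the
    7th layer strongest), then skin, then fine detail.
    """
    weights = []
    for i in range(n_layers):
        if i < 6:          # layers ~1-6: composition
            weights.append(composition)
        elif i < 8:        # layers ~6-8: face/subject (highest impact)
            weights.append(face)
        elif i < 10:       # layers ~8-10: skin
            weights.append(skin)
        else:              # remaining layers: detail
            weights.append(detail)
    return weights

print(build_layer_weights())
```

Boosting the face band while lowering the composition band would, under this sketch, let the reference image drive identity without overriding the prompt's layout.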
SDXL portrait?
Yes, it's the same as SD 1.5: https://huggingface.co/h94/IP-Adapter-FaceID/blob/main/ip-adapter-faceid-portrait_sdxl.bin. It works better with 5+ face images. No lora. For text-driven styles, maybe lower the weight.

It works very well, I'll make a comparison with InstantID maybe. For the style you really need to lower the weight though. It depends a lot on the checkpoint and the kind of style you want.
It should not be as good as InstantID, but it's lighter.
Please don't use this thread for chit-chat; open another if you want.
@cubiq Sorry about that, I didn't mean to deviate from the conversation. But you also deleted my first post, with a question about FaceID: will it be possible to use embeds for FaceID? I can use a batch, but I'd like to set the weight for each image, like we do with embeds.
Yes, it's in my to-do.
But currently there is no way to add a bunch of images to process them through portrait SDXL in ComfyUI? Could it be implemented? Preferably up to 5 images.
You can send as many images as you want with an image batch node.
@cubiq very amazing work! https://www.youtube.com/watch?v=b6TbdBJBI4Q&t=2s
Is this the [ weight_type ] option on the new node?
I'm having great results with Juggernaut XL 9 Lightning at 6 to 8 steps. These are the same settings as the post above, but now at higher resolution (896x1344 and 768x1152 instead of 512x768).
Here is the same three-way comparison of FaceID + InstantID, with Juggernaut XL 9 Lightning, at high resolution.
@JorgeR81 hi, mind sharing your workflow .json, if possible?
I've made a simplified version of the workflow, without custom nodes from other suites. Let me know if it's working correctly. It still has ControlNet preprocessor nodes, but you can delete them if you don't want to install them. This is set up for SDXL Lightning. You can bypass FaceID, InstantID or both (see workflow image). If you want to use InstantID, you need to use an SDXL checkpoint. The workflow is also in the preview image.
I've also made a version of this workflow without InstantID, and with IPAdapter embeds after the FaceID nodes. The workflow is also in the preview image.
After this fix, I updated the workflows I just shared here.
I'm closing this one and moving the discussion here |
Face Models Comparison
I started collecting data about all the face models available for IPAdapter. I'm generating thousands of images and comparing them with a face descriptor model. Each generated image is scored against the original reference image: a value of 0 means 100% the same person, 1.0 completely different.
BIAS! Important: please read!
The comparison is meant just as an overall help in choosing the right models. They are just numbers, they do not represent the actual image quality let alone the artistic value.
The face descriptor can be skewed by many factors and a face that is actually very good could get a low score for a number of reasons (head position, a weird shadow, ...). Don't take the following data as gospel, you still need to experiment.
Additionally, the images are generated in a single pass of 30 steps. Better results could probably be achieved with a second pass and upscaling, but that would require a lot more time.
I think this data still has value to at least remove the worst offenders from your tests.
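As a sketch of the scoring described above (my illustration only — the actual descriptor embeddings would come from a library such as Dlib or FaceNet, which are not shown here), the distance boils down to 1 − cosine similarity between two embedding vectors:

```python
import math

def face_distance(ref_embed, gen_embed):
    """1 - cosine similarity between two face embeddings.

    Returns ~0.0 when the embeddings point in the same direction
    (same person) and 1.0 when they are orthogonal (unrelated).
    """
    dot = sum(a * b for a, b in zip(ref_embed, gen_embed))
    norm_ref = math.sqrt(sum(a * a for a in ref_embed))
    norm_gen = math.sqrt(sum(b * b for b in gen_embed))
    return 1.0 - dot / (norm_ref * norm_gen)

# toy 3-dimensional "embeddings" (real descriptors are 128-512 dims)
print(face_distance([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # ~0.0: same person
print(face_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 1.0: completely different
```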
Round 1: skim the data
First step is to find the best performing checkpoints and IPAdapter face models (and face models combination). With that established we can move to the second phase which is running even more data concentrated on the best performers.
These are all the IPAdapter models that I've tested in random order, best performers are bold and will go to the next round.
These are the Checkpoints in random order, best performers are 🏆 bold.
Dreamshaper will be excluded from photo-realistic models but I will run it again with other "illustration" style checkpoints.
The preliminary data is available in a google sheet: https://docs.google.com/spreadsheets/d/1NhOBZbSPmtBY9p52PRFsSYj76XDDc65QjcRIhb8vfIE/edit?usp=sharing
Round 2: Refining the data
In this phase I took the best performers from the previous round and ran more tests. Best results are in bold.
Basically more embeds, better results.
realisticVisionV51_v51VAE (NOT V6) is overall the best performer, but LifeLikeDiffusion often has the single best result; meaning its average is not as good as Realistic Vision's, but sometimes you get that one result that is really good.
I tested both euclidean and 1-cosine and the results are surprisingly the same.
Since it seems that more embeddings give better results I'll also try to send multiple images of the same person to each model. I don't think it will help, but happy to be proven wrong.
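One plausible mechanical reading of "more embeds" (my illustration, not the actual IPAdapter internals) is that several reference embeddings of the same person get pooled into a single, less noisy one:

```python
def average_embeddings(embeds):
    """Element-wise mean of several face embeddings of equal length."""
    n = len(embeds)
    return [sum(vals) / n for vals in zip(*embeds)]

# three noisy toy embeddings of the "same person"
refs = [[0.9, 0.1, 0.0],
        [1.1, -0.1, 0.0],
        [1.0, 0.0, 0.0]]
print(average_embeddings(refs))  # → [1.0, 0.0, 0.0]
```

Averaging cancels per-image noise (pose, lighting), which would explain why extra references help some models more than others.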
The data for round 2 can be found here: https://docs.google.com/spreadsheets/d/1Mi2Pu9T3Hqz3Liq9Fdgs953fOD1f0mieBWUI6AN-kok/edit?usp=sharing
Preliminary SDXL
Combinations tested:
At the moment the best models seem to be:
Predictably V2+PlusFace again are the best performers. The best average is still .36.
Interestingly TurboVision XL performs very well.
Data: https://docs.google.com/spreadsheets/d/1hjiGB-QnKRYXTS6zTAuacRUfYUodUAdL6vZWTG4HZyc/edit?usp=sharing
Round 3: Testing multiple reference images
Processing...
Round 4: Higher resolution
Upscaling SD1.5 512×512 images is not advisable if you want to keep the likeness as high as possible. Even using low denoise and high IPAdapter weight, the base checkpoints are simply not good enough to keep the resemblance.
In my tests I lose about .5 likeness after every upscale.
Fortunately you can still upscale SD1.5 results with SDXL FaceID + PlusFace (I used Juggernaut, which is the best performer in the SDXL round). The results are very good. LifeLikeDiffusion and RealisticVision5 are still the best performers.
The average is still around 0.35 (which is lower than I'd like) but sometimes you get very good results (0.27), so it's worth running a few seeds and trying different reference images.
Result data here: https://docs.google.com/spreadsheets/d/1uVWJOcDxaEjRks-Lz0DE9A3DCCFX2qsvdpKi3bCSE2c/edit?usp=sharing
Methodology
I tried many libraries for feature extraction/face detection. In the aggregated results I find that the differences are relatively small, so at the moment I'm using Dlib and euclidean similarity. I'm trying to keep the generated images as close as possible in color/position/contrast to the original to minimize skew.
I tried 1-cosine and the results don't differ much from what is presented here, so I take it that the data is fairly robust. I will keep testing and update if there are any noticeable differences.
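The agreement between euclidean and 1-cosine is actually expected if the descriptor embeddings are L2-normalized: for unit vectors the squared euclidean distance equals exactly 2·(1 − cos), so the two metrics rank results identically. A small self-contained check (my illustration, pure Python):

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def one_minus_cosine(a, b):
    # for unit vectors, the dot product IS the cosine similarity
    return 1.0 - sum(x * y for x, y in zip(a, b))

random.seed(42)
a = normalize([random.gauss(0, 1) for _ in range(128)])
b = normalize([random.gauss(0, 1) for _ in range(128)])

# identity for unit vectors: ||a - b||^2 == 2 * (1 - cos(a, b))
print(abs(squared_euclidean(a, b) - 2 * one_minus_cosine(a, b)) < 1e-9)  # True
```

Since the relationship is monotonic, any ranking of face models by one metric carries over unchanged to the other, which matches the observation above.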
All primary embedding weights are set at .8, all secondary weights are set at .4.