all the llamas in canary #1803

Closed
wants to merge 4 commits into from

@msaroufim (Member) commented Aug 2, 2023

All of these OOM on a single device, but I want to make them all available for the distributed tests @H-Huang is working on.

@msaroufim changed the title from "all the llamas" to "all the llamas in canary" on Aug 2, 2023

@xuzhao9 (Contributor) left a comment

More llamas!

@facebook-github-bot (Contributor)

@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@H-Huang (Member) left a comment

Awesome! Thanks for adding :)

@@ -0,0 +1,15 @@
from torchbenchmark.tasks import NLP
Member

Just curious: these models are in canary_models as opposed to the models dir. Does this affect how they are pulled in, either from torchbench or from the dynamo runner, and if so, in what way?

Member Author

This error message explains it well; there should be no difference:

  File "/var/lib/jenkins/workspace/benchmarks/dynamo/torchbench.py", line 302, in load_model
    raise ImportError(f"could not import any of {candidates}")
ImportError: could not import any of ['torchbenchmark.models.stable_diffusion', 'torchbenchmark.canary_models.stable_diffusion', 'torchbenchmark.models.fb.stable_diffusion']
ERROR
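
The candidate list in that message suggests the loader tries each package in turn and uses the first one that imports. Here is a minimal sketch of that behavior, assuming the packages are tried in the order shown in the error. `candidate_modules` and `load_first_importable` are hypothetical names used for illustration, not functions taken from `torchbench.py`:

```python
import importlib

# Package prefixes copied from the ImportError message above. The order is
# an assumption: it mirrors the order in which the message lists them.
CANDIDATE_PACKAGES = (
    "torchbenchmark.models",
    "torchbenchmark.canary_models",
    "torchbenchmark.models.fb",
)


def candidate_modules(model_name, packages=CANDIDATE_PACKAGES):
    """Build the dotted module paths to try for one model name."""
    return [f"{package}.{model_name}" for package in packages]


def load_first_importable(candidates):
    """Return the first module that imports; raise ImportError if none do."""
    for candidate in candidates:
        try:
            return importlib.import_module(candidate)
        except ImportError:
            continue  # not importable from this location, try the next one
    raise ImportError(f"could not import any of {candidates}")
```

Under this sketch, a model that lives in `canary_models` is simply found on the second attempt, which is consistent with the directory making no difference to callers.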

@facebook-github-bot (Contributor)

@msaroufim merged this pull request in e7ca300.

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Aug 3, 2023
Includes stable diffusion, whisper, llama7b and clip

To get this to work I had to pass the HF auth token in to all CI jobs, since GitHub does not pass secrets from parent to child automatically. There's a likelihood HF will rate limit us; in that case please revert this PR and I'll work on adding a cache next. cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @aakhundov @malfet

Something upstream changed in torchbench too: `hf_Bert` and `hf_Bert_large` are now both failing with what looks like a dynamic-shape error, which I'm not sure how to debug yet. It felt a bit gross, but for now I added a skip since others are building on top of this work. @ezyang

`llamav2_7b_16h` cannot pass the accuracy checks because it OOMs when deepcloning extra inputs. This seems to mean it does not need to show up in the expected numbers csv; I will figure this out when we update the pin with pytorch/benchmark#1803. cc @H-Huang @xuzhao9 @cpuhrsch

Pull Request resolved: #106009
Approved by: https://github.com/malfet