Automatically restart spaces if they're down #2405
All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-2405-all-demos
space_id = "gradio/" + demo["dir"]
if not url_ok(f"https://hf.space/embed/{space_id}/+"):
    print(f"{space_id} was down, restarting")
    upload_demo_to_space(demo_name=demo["dir"], space_id=space_id, hf_token=AUTH_TOKEN, gradio_version=gradio_version)
Will this trigger a restart if the files haven't changed?
Yes, which is intended. We don't restart demos only when files have changed; we also restart them when the version changes. This just restarts them if they're down.
Agreed. My question is, will this be sufficient to restart the Space? In other words, if you push to a Space and the files haven't changed, will it actually restart the Space?
Oh got it @abidlabs, yes it will.
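For readers following the thread, here is a minimal sketch of the check-and-restart loop being discussed. It is an illustration under assumptions, not the PR's actual code: the body of `url_ok` is not shown in the diff, so the version below is a guess built on the standard library, and `restart` is a hypothetical callback standing in for `upload_demo_to_space`.

```python
import urllib.request


def url_ok(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if a GET on `url` answers with HTTP 200, else False.

    Assumed implementation; the PR's own url_ok is not visible in the diff.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # URLError and HTTPError (4xx/5xx) are OSError subclasses, as are
        # socket timeouts; ValueError covers a malformed URL.
        return False


def restart_down_spaces(demos_by_category, restart, is_up=url_ok):
    """Call `restart(demo_dir, space_id)` for every Space that looks down.

    `restart` is a hypothetical stand-in for upload_demo_to_space.
    Returns the list of space ids that were restarted.
    """
    restarted = []
    for category in demos_by_category:
        for demo in category["demos"]:
            space_id = "gradio/" + demo["dir"]
            if not is_up(f"https://hf.space/embed/{space_id}/+"):
                print(f"{space_id} was down, restarting")
                restart(demo["dir"], space_id)
                restarted.append(space_id)
    return restarted
```

Passing the health check and the restart action in as parameters keeps the loop testable without network access or a token.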
upload_demo_to_space(demo_name=demo["dir"], space_id=space_id, hf_token=AUTH_TOKEN, gradio_version=gradio_version)
if __name__ == "__main__":
    if AUTH_TOKEN is not None:
        for category in demos_by_category:
Looks like this includes all of the demos in recipe_demos.json, but what about demos in other parts of the website like the docs and homepage?
Still working on that in the other PR.
Script looks good -- couple of clarification questions @aliabd
for category in demos_by_category:
    for demo in category["demos"]:
        space_id = "gradio/" + demo["dir"]
        if not url_ok(f"https://hf.space/embed/{space_id}/+"):
Note that those URLs will change soon-ish :)
(But once they change, they will be considered "public API", i.e. they won't change again after that.)
Thanks for clarifying @julien-c, will make sure to update this when the URLs change.
Do you know if there's a way in the API to programmatically check whether the Space has a build error, instead of what I'm doing now, which is checking whether the URL returns 200?
Yes, there's a way: Stage should be exposed in SpaceInfo in huggingface_hub, for instance.
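As a rough illustration of that suggestion, the sketch below decides whether a Space needs a restart from its stage rather than from an HTTP status. It rests on assumptions: it targets a recent `huggingface_hub` release in which `HfApi.get_space_runtime` returns an object with a `stage` attribute (the API available when this thread was written may differ), the stage names are taken from the library's `SpaceStage` values as I understand them, and `needs_restart`, `get_space_stage`, and `DOWN_STAGES` are hypothetical names for this example.

```python
# Stages treated as "down". Assumed to match huggingface_hub's SpaceStage
# values; PAUSED and STOPPED are left out on purpose, since those Spaces
# were presumably stopped deliberately.
DOWN_STAGES = frozenset(
    {"BUILD_ERROR", "RUNTIME_ERROR", "CONFIG_ERROR", "NO_APP_FILE"}
)


def needs_restart(stage):
    """Return True if `stage` (a string or an enum member) means the Space is down."""
    # Enum members carry the raw string in .value; plain strings pass through.
    name = getattr(stage, "value", stage)
    return name in DOWN_STAGES


def get_space_stage(space_id, token=None):
    """Fetch the current stage of a Space (hypothetical helper).

    Requires the third-party huggingface_hub package and network access,
    so the import is done lazily and only when this function is called.
    """
    from huggingface_hub import HfApi

    return HfApi(token=token).get_space_runtime(space_id).stage
```

With these helpers, the URL check in the script could in principle become `if needs_restart(get_space_stage(space_id, AUTH_TOKEN)):`, which would distinguish a build error from a Space that is merely still building.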
This PR adds a script that checks whether any of the Spaces in the /demos tab are down and, if they are, restarts them. This is important because Spaces have gone down recently, especially when we programmatically update many of them at the same time, and they can often be down for a while before we notice.
Once it's merged I will create a cron job on the server to run it every 15 minutes.
Also, I stopped the website build from failing when the AUTH_TOKEN env variable is missing, so that we can build it locally, as @freddy suggested.