Alternative silo limit #1025

Merged
corneliusroemer merged 9 commits into main from alternative-silo-limit
Feb 13, 2024


@corneliusroemer
Contributor
commented Feb 13, 2024

preview URL: https://alternative-silo-limit.loculus.org

Summary

  • Make the SILO cronjob pod fail after 3600s (configurable)
  • Only let a job start within 60s of its scheduled start time
  • Don't restart failed jobs: it's OK if they fail, since a new one will start in <1min

resolves #1020 (at least partially)
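
For readers unfamiliar with the relevant Kubernetes settings, the three points in the summary could map onto a CronJob manifest roughly as sketched below. This is an illustration under stated assumptions, not the manifest from this PR: the resource name, image, schedule, and `concurrencyPolicy` are placeholders, and the PR may set the deadline at a different level or template the values through Helm. The fields `startingDeadlineSeconds`, `activeDeadlineSeconds`, `backoffLimit`, and `restartPolicy` are standard Kubernetes fields.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: silo-import                # hypothetical name
spec:
  schedule: "* * * * *"            # assumed: every minute, so a new job arrives in <1min
  startingDeadlineSeconds: 60      # skip a run that could not start within 60s of its scheduled time
  concurrencyPolicy: Forbid        # assumption, not stated in the PR: never overlap runs
  jobTemplate:
    spec:
      activeDeadlineSeconds: 3600  # fail the job once it has been active for 3600s
      backoffLimit: 0              # do not retry a failed job
      template:
        spec:
          restartPolicy: Never     # do not restart a failed container in place
          containers:
            - name: silo-import    # hypothetical container name
              image: example.org/silo-import:latest  # placeholder image
```

With settings like these, a stuck import is terminated at the deadline instead of blocking later runs, and the next scheduled job takes over.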

@theosanderson
Member

Sorry, I was confused: I thought this was my PR and these were your comments, and I didn't know how to address them :)

@corneliusroemer
Contributor Author

In my nomenclature the deadline is conf

But same for all instances IIRC

@theosanderson left a comment
Member

Lots of nice ideas!

@corneliusroemer
Contributor Author

Alright, I've made it go back to the roots - but with correct syntax :D

@corneliusroemer changed the base branch from limit-silo-import to main on February 13, 2024 18:10

@corneliusroemer
Contributor Author

Changed the base to main after @theosanderson closed #1022.

@theosanderson left a comment
Member

Code looks good. Could be worth doing a bit of testing over repeated cycles, which I haven't. But code LGTM. Thanks for fixing up my syntax :)

@theosanderson
Member

And we can consider per-organism limits for post-MVP - it wouldn't be hard (but I also can't face doing it myself this sec :) )

@corneliusroemer
Contributor Author

> Code looks good. Could be worth doing a bit of testing over repeated cycles, which I haven't. But code LGTM. Thanks for fixing up my syntax :)

I haven't found a way to get YAML schema validation to work with Kubernetes manifests; it would be a great help in VS Code.

I'll notice bugs on main - I mean things were broken for 18h and this can't be worse.

@corneliusroemer merged commit fae0c63 into main on Feb 13, 2024
9 checks passed
@corneliusroemer deleted the alternative-silo-limit branch on February 13, 2024 18:18
Labels: preview (triggers a deployment to argocd)
Development

Successfully merging this pull request may close these issues.

SILO import cronjob got stuck for 18h at "initializing" stage
2 participants