Remove subset inputs #6
🎉 New recipe runs created for the following recipes at sha
|
/run recipe-rest recipe_run_id=353 |
Misspelled the last command! |
/run recipe-test recipe_run_id=353 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/353 |
Pangeo Forge Cloud told me that our test of your recipe To see what error caused the failure, please review the logs at https://pangeo-forge.org/dashboard/recipe-run/353 If you haven't yet tried pruning and running your recipe locally, I suggest trying that now. Please report back on the results of your local testing in a new comment below, and a Pangeo Forge maintainer will help you with next steps! |
Hmm, the linked logs are not helpful without a solution for pangeo-forge/pangeo-forge.org#63. Pulling logs from the backend directly, I'm seeing this rather opaque error:
@paigem, does the version of the recipe in this PR execute locally for you without error? |
Yes, @cisaacstern, this recipe runs without errors locally. Side note: the warning about 255 character length issues gets printed many, many times in the local run, so it's a bit difficult to sift through the output and see whether any other warnings show up. In this case, though, it looks like there are no errors and no other warnings either. |
🎉 New recipe runs created for the following recipes at sha
|
/run recipe-test recipe_run_id=396 |
When I tried to import your recipe module, I encountered this error
Please correct your recipe module so that it's importable. |
🎉 New recipe runs created for the following recipes at sha
|
/run recipe-test recipe_run_id=397 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/397 |
Pangeo Forge Cloud told me that our test of your recipe To see what error caused the failure, please review the logs at https://pangeo-forge.org/dashboard/recipe-run/397 If you haven't yet tried pruning and running your recipe locally, I suggest trying that now. Please report back on the results of your local testing in a new comment below, and a Pangeo Forge maintainer will help you with next steps! |
🎉 New recipe runs created for the following recipes at sha
|
Two reflections:
|
/run recipe-test recipe_run_id=399 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/399 |
Pangeo Forge Cloud told me that our test of your recipe To see what error caused the failure, please review the logs at https://pangeo-forge.org/dashboard/recipe-run/399 If you haven't yet tried pruning and running your recipe locally, I suggest trying that now. Please report back on the results of your local testing in a new comment below, and a Pangeo Forge maintainer will help you with next steps! |
I've just increased worker memory in hopes that this may allow us to move past the KilledWorker issues we've seen. I'm going to try to re-run this recipe from the existing recipe run now. |
/run recipe-test recipe_run_id=399 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/399 |
This hasn't officially failed yet, but it does appear to be stalled and likely to time out. Because we have a lot more memory available to us now, I'm going to remove the subset inputs again to make this more intuitive to debug. |
🎉 New recipe runs created for the following recipes at sha
|
I'm going to run |
/run recipe-test recipe_run_id=501 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/501 |
I just manually cancelled There's a lot of noise in the full traceback, but if you look closely, it appears to have been stalled on acquiring a lock. Note these lines present in the full trace:
Traceback
I'm going to try restoring the target_chunks to original This is the setting we had initially run with, and which caused a KilledWorker in #5, but we have 3x the worker memory now, so perhaps all will work smoothly. |
🎉 New recipe runs created for the following recipes at sha
|
/run recipe-test recipe_run_id=502 |
✨ A test of your recipe I'll notify you with a comment on this thread when this test is complete. (This could be a little while...) In the meantime, you can follow the logs for this recipe run at https://pangeo-forge.org/dashboard/recipe-run/502 |
OK, so here is what ultimately happened to make this work:
Going to rename this PR accordingly, for clarity, then merge, which will trigger the next production build. |
Charles thanks so much for your perseverance here! 🙏 |
Amazing!! We have a successful test! Thank you @cisaacstern!! |
KilledWorker on the production run 😵‍💫 (https://pangeo-forge.org/dashboard/recipe-run/503). I will look into this more closely and get back with some ideas. Thanks for your patience, Paige. |
Oh no!! So sorry about this @cisaacstern. Thanks for continuing to push this through! For now, having the test data might go pretty far in the short term, so that's at least a good step! |
Create 30-day time chunks (13 total: 12 x 30-day chunks and 1 x 5-day chunk) in an attempt to solve #5
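For anyone checking the chunk arithmetic above, here is a minimal sketch of how the split works out. It assumes the time axis has 365 daily steps, which is inferred from 12 x 30 + 5 and not stated in this thread. The helper name `chunk_lengths` is hypothetical and is not part of the recipe or of pangeo-forge-recipes.

```python
from __future__ import annotations


def chunk_lengths(n_steps: int, chunk_size: int) -> list[int]:
    """Split n_steps into consecutive chunks of chunk_size.

    The final chunk holds whatever remainder is left over.
    (Hypothetical helper, for illustration only.)
    """
    if n_steps < 0 or chunk_size <= 0:
        raise ValueError("n_steps must be >= 0 and chunk_size must be > 0")
    full, remainder = divmod(n_steps, chunk_size)
    return [chunk_size] * full + ([remainder] if remainder else [])


# Assumption: 365 daily time steps along the time dimension.
lengths = chunk_lengths(365, 30)
print(len(lengths), lengths)
```

Under that assumption the result is 13 chunks: twelve of length 30 followed by one of length 5, matching the counts in the description.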
@cisaacstern could you merge this PR? Thanks for your help with this!