The allocation stopped as expected #423
Comments
This error appears in two cases:
I do not necessarily see a reason to look into why one of these cases occurred. Rather, we should agree on how to handle such cases. I would argue that it is correct that Poseidon does not exit the execution silently but reports that it was aborted. Maybe CodeOcean should handle this case?
I am not sure whether I understand the second reason correctly. When is Nomad completing the allocation if not instructed to do so by CodeOcean? The only reason could be to sync an environment (and hence restart all allocations), but this should not happen that often. Consequently, the first reason seems to be more plausible. Are there other occurrences, such as when the event stream broke or when the allocation got rescheduled?
Mh, maybe. Nevertheless, I think we should check the overall system for this case. Sure, CodeOcean could request to delete a runner and one could consider this as an error there. But still, why is CodeOcean deleting the runner if it is still used? I think we should investigate overall to understand those occasions. One reason (I haven't verified) would be that a learner uses two tabs to request two executions with one failing (causing the runner to be deleted). Not sure about that, though.
I am currently lacking more information on the occurrences when this error happens to give a specific response to this question, see above.
Let's quickly check why and when the issue is occurring. Then, we can decide which solution to follow in order to resolve it.
It seems like the first case for this error is the relevant one: CodeOcean is requesting the runner deletion while an execution is running. Side note: In the second and third log the statement
Previously, the same runner could be used multiple times with different submissions simultaneously. This, however, yielded errors, for example when one submission timed out (causing the runner to be deleted) while another submission was still being executed. Admin actions, such as the shell, can still be executed regardless of any other code execution. Fixes CODEOCEAN-HG Fixes openHPI/poseidon#423
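To illustrate the idea behind this change, here is a minimal sketch of per-runner exclusivity. It is not the code from openHPI/codeocean#1982; the names `RunnerRegistry`, `RunnerBusyError`, and `exclusive` are hypothetical and only serve to show how a second submission could be rejected while admin actions are still let through.

```python
import threading
from contextlib import contextmanager


class RunnerBusyError(Exception):
    """Raised when a submission requests a runner that is already in use."""


class RunnerRegistry:
    """Hypothetical registry that grants exclusive access per runner ID."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the set below
        self._in_use = set()            # IDs of runners with a running submission

    @contextmanager
    def exclusive(self, runner_id, admin=False):
        # Assumption: admin actions (such as a shell) bypass the check,
        # mirroring the behavior described in the commit message.
        if admin:
            yield
            return
        with self._guard:
            if runner_id in self._in_use:
                raise RunnerBusyError(runner_id)
            self._in_use.add(runner_id)
        try:
            yield
        finally:
            # Release the runner even if the execution fails or times out.
            with self._guard:
                self._in_use.discard(runner_id)


registry = RunnerRegistry()
with registry.exclusive("runner-1"):
    try:
        # A second submission (e.g. from another browser tab) is rejected
        # instead of sharing a runner that might be deleted under it.
        with registry.exclusive("runner-1"):
            pass
    except RunnerBusyError:
        second_submission_rejected = True
```

With this kind of guard, a timeout in one submission can only delete a runner that no other submission is using, which is the situation that triggered the error reported here.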
Thanks for digging deeper into the issue. I created a PR to tackle the issue on CodeOcean: openHPI/codeocean#1982. Feel free to have a look there.
Do you think we should create a ticket for that, in order to prevent this non-critical race condition?
While I've merged (and deployed) the change in CodeOcean, the question about the non-critical race condition is still open. Therefore, I am temporarily re-opening this issue.
Yeah 👍 I've done so with #487
Sentry Issue: CODEOCEAN-HG