debugpy sometimes fails to start up #1064
Comments
It's interesting that the adapter doesn't report any error from sending the port information to the server. Nor did it get blocked - there wouldn't be the "all debug servers disconnected" message in the log if that were the case. So it did successfully connect to something on that port, and sent a TCP packet there. This seems to imply that something else hijacked that port. The server tries to prevent port clashes by using an ephemeral port and letting the OS pick the port number: debugpy/src/debugpy/server/api.py Line 158 in 01b1c7b
(On Windows, we also set SO_EXCLUSIVEADDRUSE, but there's no equivalent on other platforms that I know of.)
However, this would require the hijack to happen between the listener socket being opened and the adapter process being spawned and reaching the point in its startup where it tries to connect to said port. That is a pretty short interval, so if that's the case, I wouldn't expect it to happen often. I'm not well versed in the Linux networking stack, but perhaps there's some system-wide way to log which processes are opening ports for listening?
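For readers following along, here is a minimal sketch of the ephemeral-port approach described above: bind to port 0 so the OS picks a free port, then read the chosen port back from the socket. The helper name `open_ephemeral_listener` is hypothetical and this is not debugpy's actual code; it only illustrates the general technique, including the Windows-only SO_EXCLUSIVEADDRUSE option mentioned in the comment.

```python
import socket
import sys


def open_ephemeral_listener(host="127.0.0.1", backlog=1):
    """Open a TCP listener on an OS-chosen port.

    Hypothetical helper for illustration; not taken from debugpy.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if sys.platform == "win32":
            # Windows only: prevent any other socket from binding this address.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
        # Port 0 asks the OS to pick a currently unused ephemeral port.
        sock.bind((host, 0))
        sock.listen(backlog)
    except Exception:
        sock.close()
        raise
    return sock


listener = open_ephemeral_listener()
host, port = listener.getsockname()  # the port the OS actually assigned
print(f"listening on {host}:{port}")
listener.close()
```

The port number is only known after `bind`, which is why it has to be passed to the other process afterwards; that hand-off is the window discussed in this comment.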
I picked up this failure on a CI machine, which may be busy with random processes taking different ports. But I saw it more than once, and if it happened there, it can happen somewhere else too. Is there anything else I can do on my end to help?
I think part of the problem is that we use However, this logic doesn't apply to ephemeral ports that we use for internal communication - yet we use the same helper function to create listener sockets for them. I did some more digging on the exact effect of this combination, and it turns out that I'm not 100% sure that this is the root cause, but it seems the most likely candidate, especially if you run tests that involve debugpy concurrently. So let's fix that first and see if it helps.
Tests are run one at a time, but I'm not sure there's a guarantee that the previous test will be done shutting down before the next one starts. Thanks for looking into this!
The tentative fix is merged now; let me know if this helped.
I think that did it: a few hundred runs in CI and I haven't seen this again yet. Thanks!
Environment data
Continuing my investigation of flakiness in VS Code's run-by-line, here's an issue that I've seen a few times in CI (https://github.com/microsoft/vscode-jupyter/actions/runs/3102518942/jobs/5024891548)
From VS Code's side, we try to start debugging, and send an "initialize" request.
The debugpy logs show that the server spawns the adapter, then waits for it to call back on some port. The adapter starts and tries to call back, but it seems that the server never receives that request. At least, this is my interpretation of what I'm seeing. I'm interested in any ideas you have about what could be going on. Maybe it's a case of trying to use a port that turns out not to be free (I filed an issue related to that in the ipykernel repo).
Does_not_stop_in_other_cell_-_18-debugpy.adapter-10546.log
Does_not_stop_in_other_cell_-_18-debugpy.pydevd.10530.log
Does_not_stop_in_other_cell_-_18-debugpy.server-10530.log
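To make the failure mode described above concrete, here is a rough sketch of a "spawn, then wait for a callback" handshake with a timeout. It is not debugpy's protocol: the function names `wait_for_callback` and `call_back`, the newline-terminated ASCII message, and the use of a thread in place of a separate adapter process are all assumptions made for illustration.

```python
import socket
import threading


def wait_for_callback(listener, timeout=5.0):
    """Accept one connection and return the text it sent, or None on timeout."""
    listener.settimeout(timeout)
    try:
        conn, _ = listener.accept()
    except socket.timeout:
        # Nobody connected: from the waiting side, the callback was simply lost.
        return None
    with conn:
        conn.settimeout(timeout)
        chunks = []
        while True:
            chunk = conn.recv(64)
            if not chunk:  # peer closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("ascii").strip() or None


def call_back(port, message):
    """Stand-in for the spawned process reporting back to the waiting side."""
    with socket.create_connection(("127.0.0.1", port), timeout=5.0) as s:
        s.sendall(message.encode("ascii"))


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # OS-chosen port
listener.listen(1)
port = listener.getsockname()[1]

worker = threading.Thread(target=call_back, args=(port, "54321\n"))
worker.start()
reported = wait_for_callback(listener)
worker.join()
listener.close()
print(reported)
```

If the caller connects to a different process that happens to be listening on the same port, `call_back` still succeeds while `wait_for_callback` times out, which matches the symptom of the adapter reporting no error while the server never hears from it.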