-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
service/dap: support pause request #2466
Conversation
fc99f10
to
2036797
Compare
I am having all sorts of issues with the test. Everything works locally, but as always TeamCity brings some surprises. The test uses loopprog.go, which doesn't do anything fancy. Yet have seen the following failures. Could they be related to halting? E.g. I am not checking state and issue a halt command request even if things are not running. Should I be?
|
The test has no reason to be more complicated than starting loopprog, waiting for a random timeout and sending a single pause request. |
I figured that this would be the price of a non-flaky test. However, I wanted to keep it as-is for now so we can have a discussion as to why there have been all the issues that I have observed. I experimented with it quite a bit - adding (e.g. 2036797) and removing random time-outs and such and it did not make things better. I do think that the issue is the repeated continue/pause sequence and also calling pause when already at a stop. I had the loop because I wanted to both get the case when the main goroutine pauses and when there is no selected goroutine. I can take it out without reducing test coverage. But I would not want to take out the case of pausing while already in paused stated. Shouldn't that be working? |
If you want to test something about the proc package you should write a test for the proc package, there's no reason to make things more complicated by going through a rpc layer. |
My goal was not to test the proc package. I wanted to see how the dap layer handled different outcomes of halt. Does it handle halt errors? Does it handle cases where nothing is running and so nothing is interrupted and no stopped event is sent? What do things look like when there is no selected goroutine when we stop? In doing so I came across proc behavior that surprised me, so I wanted to learn more about it, to understand better the inner-workings of the codebase and the runtime I am working with. In particular, should it be valid to call command halt when the debugger state is not running? Or should the client or the dap server detect that nothing is running and not call it? This would not only impact pause, but also disconnect. |
Got it. But none of this is being tested here, this is a test about what happens if pause and resume are used in rapid succession.
It should be but I don't think anyone has looked into it closely and it isn't something that's being tested in the wild either. |
Some of the test iterations that failed on TeamCity for me were sleeping between the successions, so they were not so rapid, but it did not help. Maybe the value I choose was still too small. I will try something more comparable to what a human would do. I just found the rapidly printing numbers polluting the output log very annoying. Any objections to changing this fixture to printing less often?
It is used in the wild because both legacy vscode-go and dlv dap issue a halt as part of disconnect, without checking if the process is running. You can also run into double continue or double pause scenarios with multiple parallel clients or experimental and not fully fleshed out features like call, where the UI is not always correctly reflecting the running/stopped state, exposing the wrong controls. Obviously, that's not a big concern in reality. |
The way manual stops work is pretty simple, we just send a SIGTRAP signal to the process, simulating what a software breakpoint would do. I don't know what the kernel does with it if the thread happens to be already in a trace-stopped state, I've never checked. I've also never checked what happens if the thread happens to have just exited trace-stop but hasn't been scheduled yet. It might be that something strange happens, it might be that there is a bug in our code. The behavior is also very likely to be different between different OSes. I'd be interested in looking into this more but I don't think this should be part of the test for this PR, because it's a problem in a different component, because it's unlikely to be fixed quickly given that it seems pretty low impact and because it's going to be easier to diagnose with fewer layers in between.
If that's a concern I guess it's easier to fix now that you can call IsRunning, but if this was a common problem I think we would have heard something from users. |
Likely so. Although if things are disconnecting and fail in the process, one might not even notice because either way all is gone in a fraction of a second. But yes, it's likely not a big deal because clearly in the common case where it matters, it hasn't bothered users yet. |
I don't think this and the detach thing are related. I think the way we detach while killing the process has never been that great and sometimes it returns an error for stupid reasons. |
Fair enough. I will rework the test, but I think you can look at the implementation already. Looking forward to your review and thank you for the discussion. |
…go-delve#2435) The client can specify certain configurations in the initialize request. For example, pathFormat determines the pathFormat. We do not currently support configuring pathFormat, linesStartAt1, or columnsStartAt1, so we report an error if the client attempts to set these to an unsupported value.
Bintray is shutting down and the URL we used to install mingw is no longer available. Use chocolatey instead.
…ve#2301) Adds the low-level support for watchpoints (aka data breakpoints) to the native linux/amd64 backend. Does not add user interface or functioning support for watchpoints on stack variables. Updates go-delve#279
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pause test is now passing.
TestAttachStopOnEntry is failing. It has not been impacted by the pause change, but I have experienced it flake every once in a while in an unexpected way. I can adjust the test in a separate PR since it's not related.
2021-05-07T07:09:57Z debug layer=debugger continuing
2021-05-07T07:09:57Z debug layer=dap [<- from client]{"seq":13,"type":"request","command":"disconnect","arguments":{"terminateDebuggee":true}}
2021-05-07T07:09:57Z debug layer=debugger halting
2021-05-07T07:09:57Z debug layer=dap "continue" command stopped - reason "manual", location /usr/local/go/go1.16.4/src/runtime/sys_linux_amd64.s:159
2021-05-07T07:09:57Z debug layer=dap [-> to client]{"seq":0,"type":"event","event":"stopped","body":{"reason":"pause","threadId":1,"allThreadsStopped":true}}
2021-05-07T07:09:57Z debug layer=dap [-> to client]{"seq":0,"type":"event","event":"output","body":{"category":"console","output":"Detaching and terminating target process\n","source":{}}}
2021-05-07T07:09:57Z debug layer=debugger detaching
2021-05-07T07:09:57Z error layer=dap no such process
2021-05-07T07:09:57Z debug layer=dap Error while disconnecting: no such process
service/dap/server_test.go
Outdated
|
||
// Pause will be a no-op at a breakpoint: there will be no additional stopped events |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played with this some more using personal builds. This pause at breakpoint, not the stress testing, appears to be causing consistent fatal errors during the continue following right after. I removed it so the test doesn't fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which platform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think this is the problem with how we do the |
|
@derekparker Is there anything else that needs to be addressed, so this can be merged? Thank you. |
@polinasok this LGTM, please just rebase and then I will merge today. |
@derekparker Rebase done |
Updates #1515
Updates golang/vscode-go#1474