RFC/WIP: Support DAP disassemble request #627
CodeLLDB currently supports a custom disassembly view and provides disassembly as "source" when debugging into objects with no source line info. DAP now also has a `disassemble` request which, given a memory reference from the stack trace, produces a set number of instructions from that address. This API is awkward and annoying, but it's simple to implement based on the existing DisassembledRange.
WIP:
- we don't return the exact number of instructions
- we don't populate a lot of the optional fields
- my-first-rust(TM)
- no tests yet
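For readers unfamiliar with the request, here is a minimal sketch of the arguments it carries and of how a base address could be derived from them. The field names mirror the DAP `DisassembleArguments` (`memoryReference`, `offset`, `instructionOffset`, `instructionCount`). The functions `parse_memory_reference` and `base_address` are hypothetical, and treating the memory reference as a hex or decimal address is an assumption, since DAP defines it as an opaque string chosen by the adapter.

```rust
/// Subset of the DAP `DisassembleArguments` fields relevant to this discussion.
#[derive(Debug, Clone)]
pub struct DisassembleArguments {
    pub memory_reference: String,        // `memoryReference`
    pub offset: Option<i64>,             // `offset`, in bytes
    pub instruction_offset: Option<i64>, // `instructionOffset`, in instructions (may be negative)
    pub instruction_count: i64,          // `instructionCount`
}

/// Hypothetical helper: parse "0x1000" or "4096" into an address.
/// ASSUMPTION: the adapter encodes memory references as plain addresses.
pub fn parse_memory_reference(s: &str) -> Option<u64> {
    let s = s.trim();
    if let Some(hex) = s.strip_prefix("0x").or_else(|| s.strip_prefix("0X")) {
        u64::from_str_radix(hex, 16).ok()
    } else {
        s.parse::<u64>().ok()
    }
}

/// Hypothetical helper: memory reference plus the byte offset.
/// Returns None on a parse failure or if the result would leave the u64 range.
pub fn base_address(args: &DisassembleArguments) -> Option<u64> {
    let base = parse_memory_reference(&args.memory_reference)?;
    let offset = args.offset.unwrap_or(0);
    if offset >= 0 {
        base.checked_add(offset as u64)
    } else {
        base.checked_sub(offset.unsigned_abs())
    }
}
```

The byte offset is easy to apply. The instruction offset is the hard part when it is negative, because it counts instructions rather than bytes.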
Thanks! Yes, I'd like to implement native DAP disassembly support at some point, and gave it a try a while back, in fact. However, I was not able to satisfactorily resolve the question of how to handle disassembling backwards, and then I got busy, so that stuff is on hold for now. If you'd like to think about it, here's my branch. |
Thanks I’ll take a look |
I'm not sure I fully followed this. Are you referring to something like a negative One idea springs to mind:
This seems like it might be possible in theory. Need to look at the API for the practice though. I think it might be possible by using the SBCompileUnit directly. WDYT? That aside, for now, I took your branch and added source line info to the disassembled instructions and that seems to work with my (extremely limited) client implementation. I'll try and dig through the LLDB API to see if there's anything we can do about negative instruction offsets, but would just bailing out and not supporting that be an option? |
Yes, that.
Don't think so. First and foremost, this is a VSCode extension, and VSCode's implementation of disassembly view uses negative offsets extensively.
This will likely break in release builds: the optimizer may rearrange instructions such that they are no longer in line order. Also, disassembling must be able to function without any debug info whatsoever. I can think of two methods:
I expect that a robust implementation will require quite a bit of research and experimentation. ...I bet there is a blog post or a mailing list discussion somewhere on the internet which has all the tips and tricks, because the problem is definitely not new. However so far I've been unsuccessful in locating it 🤷♂️ |
I still have this on my TODO list by the way. I notice that vscode-cpptools seems to support a negative offset so it might be possible to reverse engineer what they do and pick it up again. Just need that "free" time people keep talking about :) |
OK, so this is what MIEngine does:
    private async Task<DisasmInstruction[]> VerifyDisassembly(DisasmInstruction[] instructions, ulong startAddress, ulong endAddress, ulong targetAddress)
    {
        if (startAddress > targetAddress || targetAddress > endAddress)
        {
            return instructions;
        }
        var originalInstructions = instructions;
        int count = 0;
        while (instructions != null && (instructions.Length == 0 || Array.Find(instructions, (i)=>i.Addr == targetAddress) == null) && count < _process.MaxInstructionSize)
        {
            count++;
            startAddress--; // back up one byte
            instructions = await Disassemble(_process, startAddress, endAddress); // try again
        }
        return instructions == null ? originalInstructions : instructions;
    }
So basically:
I don't love it, but I also don't hate it. What do you think? FWIW this is what they do to calculate the "MaxSizeOfOneInstruction", which was my next question :)
|
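To make the back-up-and-retry idea concrete, here is a minimal Rust sketch of the same loop run against a toy decoder. Everything in it is hypothetical: the `Instruction` type, the toy encoding (an opcode byte N in 1..=3 starts an N-byte instruction, any other byte is a 1-byte "invalid" instruction) and `MAX_INSTRUCTION_SIZE` stand in for a real disassembler. It is not the MIEngine or CodeLLDB code.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Instruction {
    pub addr: u64,
    pub len: u64,
}

/// Longest instruction in the toy encoding (an assumption of this sketch).
pub const MAX_INSTRUCTION_SIZE: u64 = 3;

/// Toy decoder: linear sweep from `start` up to (not including) `end`.
/// `mem` holds the bytes that live at addresses `mem_base..`.
pub fn disassemble(mem: &[u8], mem_base: u64, start: u64, end: u64) -> Vec<Instruction> {
    let mut out = Vec::new();
    let mut addr = start;
    while addr < end {
        let idx = match addr.checked_sub(mem_base) {
            Some(i) if (i as usize) < mem.len() => i as usize,
            _ => break, // outside the memory we were given
        };
        let len = match mem[idx] {
            n @ 1..=3 => n as u64,
            _ => 1, // treat anything else as a 1-byte invalid instruction
        };
        out.push(Instruction { addr, len });
        addr += len;
    }
    out
}

/// Back `start` up one byte at a time, at most MAX_INSTRUCTION_SIZE times,
/// until the decoded stream contains an instruction that begins at `target`.
/// If no attempt succeeds, the last attempt is returned.
pub fn verify_disassembly(
    mem: &[u8],
    mem_base: u64,
    mut start: u64,
    end: u64,
    target: u64,
) -> Vec<Instruction> {
    let mut instructions = disassemble(mem, mem_base, start, end);
    if start > target || target > end {
        return instructions;
    }
    let mut count = 0;
    while !instructions.iter().any(|i| i.addr == target)
        && count < MAX_INSTRUCTION_SIZE
        && start > 0
    {
        count += 1;
        start -= 1; // back up one byte
        instructions = disassemble(mem, mem_base, start, end); // try again
    }
    instructions
}
```

With the bytes `[3, 0, 2, 2, 0, 1, 1]` at base `0x100`, starting at `0x102` lands inside the first instruction and the decoded stream skips over `0x103`. Two retries later the start reaches `0x100` and the stream lines up with the target.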
well, believe it or not, it works. I'll tidy it up a bit and push a new PR. |
What happens if startAddress lands in the middle of an instruction, such that the trailing bytes just happen to encode a valid instruction? |
If the start address happens to be mid-instruction and that resolves to a valid instruction then one of a few things might happen:
I need to craft some careful test cases around this. Sorry if the above explanation is not very clear. My WIP commit message is below and the change is here - it's still WIP and the code is terrible, but hopefully you get the idea:
|
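The concern about landing mid-instruction can be illustrated with a small sketch. It uses a hypothetical toy encoding (a byte N in 1..=3 starts an N-byte instruction, any other byte is a 1-byte "invalid" instruction), not a real ISA. The point it shows: a stream decoded from a mid-instruction start can still reach the target address, so "the target appears in the output" does not prove that the instructions before it are real.

```rust
/// Toy linear sweep: offsets at which instructions begin when decoding from `start`.
/// The encoding is a made-up stand-in for a variable-length ISA.
pub fn instruction_starts(code: &[u8], start: usize) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut pos = start;
    while pos < code.len() {
        starts.push(pos);
        pos += match code[pos] {
            n @ 1..=3 => n as usize,
            _ => 1,
        };
    }
    starts
}

/// Offsets decoded from `start` that are not instruction boundaries
/// in the "true" stream (the one decoded from offset 0).
pub fn bogus_starts(code: &[u8], start: usize) -> Vec<usize> {
    let real = instruction_starts(code, 0);
    instruction_starts(code, start)
        .into_iter()
        .filter(|p| !real.contains(p))
        .collect()
}
```

For the bytes `[3, 1, 1, 2, 0]` the true boundaries are offsets 0 and 3. Decoding from offset 1 yields 1, 2, 3: the target at 3 is found, yet the two entries before it are operand bytes misread as instructions.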
Hello, I was taking a look at these changes to enable disassemble requests. |
The LLDB API doesn't provide a way to do a negative offset read. This is also more complex due to the variable length of instructions in x86 (hence the read memory gymnastics); see the explanation here: #627 (comment) |
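One way to picture the "gymnastics" is as two steps: read a window that is certainly large enough, then pick the requested instructions out of the decoded result. The sketch below is an assumption about how this could be done, not a description of the CodeLLDB code, and both function names are hypothetical. On x86 the maximum instruction length is 15 bytes, which is a natural value for `max_instruction_size`.

```rust
/// Conservative start address for a window that covers `instruction_offset`
/// instructions before `base` (only relevant when the offset is negative).
pub fn window_start(base: u64, instruction_offset: i64, max_instruction_size: u64) -> u64 {
    if instruction_offset >= 0 {
        return base;
    }
    let back = instruction_offset
        .unsigned_abs()
        .saturating_mul(max_instruction_size);
    base.saturating_sub(back) // never wrap below address 0
}

/// From the decoded instruction addresses (in ascending order), select up to
/// `count` entries beginning `instruction_offset` instructions away from the
/// instruction at `base`. Returns None if `base` is not a decoded boundary,
/// which means the window was misaligned and needs another attempt.
pub fn select_instructions(
    addrs: &[u64],
    base: u64,
    instruction_offset: i64,
    count: usize,
) -> Option<&[u64]> {
    let base_idx = addrs.iter().position(|&a| a == base)? as i64;
    let first = base_idx.saturating_add(instruction_offset).max(0) as usize;
    let first = first.min(addrs.len());
    let last = first.saturating_add(count).min(addrs.len());
    Some(&addrs[first..last])
}
```

If fewer instructions are available than requested, this sketch returns a shorter slice. A client that expects exactly `instructionCount` entries would need the gap padded, which is left out here.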
I'm writing a custom extension and your implementation has been good guidance! Still, I'm having problems when VS Code asks for a large offset (e.g. Is there anything I'm missing? |
Could be a bug. Please can you raise an issue with steps to repro using codelldb and I can take a look. |
Actually, in codelldb it works fine. That's why I was wondering where that case is handled in the codelldb code. I'm trying to do something similar but using the VS Code embedded Open Disassembly View. I implemented logic similar to what you've done, but I'm running into problems since, after applying the negative offset requested by VS Code, I end up outside the current stack frame (i.e. I need to add a check on the start address but I didn't find any easy way to retrieve the stack frame start address from lldb. |
The code for handling the disassemble request is here https://github.com/vadimcn/vscode-lldb/blob/master/adapter/src/debug_session.rs#L1134. I don’t know anything about vscode. |
Yes, that's what I was already looking at and using as a reference. |
I'm really struggling to understand what you're asking for. If you think the above code doesn't work in some scenario, I'm happy to look into that. I can assure you that I tested negative offsets that go outside the definition of the current "function". Even outside the binary image. I'm not sure what "stack frame" per se has to do with it. Disassembly is just taking a chunk of memory and trying to interpret that byte stream as instructions. Often the memory isn't actually instructions and you get various forms of invalid (or NOP) instruction instead. The idea of the above code is that it tries to determine a valid start address by heuristically disassembling various bytes (up to one instruction width back from the calculated start address) and looking to see if it "looks" valid. The stack is really not involved unless the code location happens to be very close to where the stack is in memory. |
Your implementation works perfectly, I was having a problem on my side. Thanks for the reply! |
Opening this up as a 'request for comments'. I had a quick go at implementing the DAP disassemble request in CodeLLDB as I wanted something reliable to test Vimspector's disassembly view with.
Clearly this is a prototype. Would you be interested in a proper patch to support the DAP disassemble request?
CodeLLDB currently supports a custom disassembly view and provides
disassembly as "source" when debugging into objects with no source line
info.
DAP now also has a `disassemble` request which, given a memory reference
from the stack trace, produces a set number of instructions from that
address.
This is simple to implement based on the existing DisassembledRange.
WIP.