-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix assembling instructions with unknown/don't care context bits #7195
base: master
Are you sure you want to change the base?
Conversation
Without this change, if unspecified context bits are provided to the assembler they are defaulted to 0 and the resulting context is used to filter for valid assembly instructions. After this change unspecified bits are kept as unspecified through the assembly process possibly providing more valid assembly results.
We've just pushed some updated test cases which illustrate (at least some of) the current limitations of this implementation. The proposed solution in this pull request currently may produce unexpected results when the assembler resolves However, when the context provided to the assembler only has We have started to experiment with possible fixes, but we'd greatly appreciate any tips about the right way to proceed. |
It's been a while since I've read through this code, but yes, the troubles you're having around I think part of the core issue is that context bits get (ab)used for all sorts of interesting purposes, and believe me, I'm no stranger to abusing context bits.... That said, I suspect what you intend to get by leaving the context unspecified is "all possible encodings that might appear in a binary." Though that context may change wildly during disassembly of a single instruction, there's actually only so many possible input contexts when starting an instruction. We have a notion of "default context" and "global sets". The latter is also sometimes called "context commits." When the disassembler starts, it uses the default context for the processor as input. Some instructions in the Slaspec may use the "globalset" function in its context transitions. Those will place context changes that persist at a given address. When the disassembler encounters that address it applies the change to the input context for that instruction. So, what I might suggest:
Hope this helps. |
…ecified context bits
We gave your suggestions a try; here’s an overview of our scraping attempts and the issues we encountered: Initially, we attempted to use We improved this approach by scraping for context variables used in To mitigate these issues, we began scraping the bit pattern section of each constructor to check the constraints of context variables we know get While our scraping approach appears to work for trivial cases (e.g., ARM/Thumb), an approach that doesn't scrape or call There's a remaining TODO regarding how to best ensure that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm good with everything except the setContext
method being added. My recommendations below are pretty thorough, but perhaps a bit prescriptive. I'll probably take a hard line on withContext
and withContextBuilder
, but I'll try to be flexible with the other stuff.
public AssemblyResolutionResults resolveTree( | ||
AssemblyParseResult parse, Address at, AssemblyPatternBlock ctx) { | ||
|
||
if (inputContexts == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the scraping should be done by WildSleighAssemblerBuilder
. Adding a mutable field could break thread safety, not that it matters, but conventionally, this is definitely a builder step. I can appreciate the lazy computation, but building already takes some time, so I don't think that's necessary.
|
||
assertTrue("Expect to have one valid encoding", allValidEncodings.size() == 1); | ||
// In this case, wildcard assembler now returns identical encodings with different contexts | ||
assertTrue("Expect to have at least one valid encoding", allValidEncodings.size() > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic, but if the English says "at least one", the code should say >= 1
.
* @param that the other block | ||
* @return the new combined block | ||
*/ | ||
public AssemblyPatternBlock combinePrecedence(AssemblyPatternBlock that) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
combinePrecedence
works. I might suggest assign
as a shorter name, though.
* Write mask bits from context commit to mask array of block | ||
* | ||
* <p> | ||
* This is used when scraping for valid input contexts to determine which context variables are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tag this with @implNote
instead of <p>
.
@@ -57,7 +57,7 @@ public class DefaultAssemblyResolvedPatterns extends AbstractAssemblyResolution | |||
|
|||
protected final Constructor cons; | |||
protected final AssemblyPatternBlock ins; | |||
protected final AssemblyPatternBlock ctx; | |||
protected AssemblyPatternBlock ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to retain the immutability of this object. Please put final
back. See comment below re/ setContext
.
AssemblyResolutionResults results = super.resolveTree(parse, at, combinedCtx); | ||
|
||
setContexts(results, combinedCtx); | ||
allResults.absorb(results); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call to absorb
becomes redundant if you take the suggestion below re/ absorbWithContext
.
} | ||
|
||
/** | ||
* Get set of all all valid input contexts that affect constructor selection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate all
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't own this package, so this'll need additional review by others. I don't anticipate issues, but it's possible after I Approve this, I'll come back with additional comments from internal review.
/** | ||
* Get set of all all valid input contexts that affect constructor selection. | ||
* | ||
* <p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if <p>
is necessary before <ol>
....
AssemblyPatternBlock defaultCtx = new AssemblyDefaultContext(language).getDefault(); | ||
|
||
// Erase the values for posterity; we don't care about them at this point | ||
System.arraycopy(new byte[defaultCtx.getVals().length], 0, defaultCtx.getVals(), 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider Arrays.fill
instead, to avoid allocating a temporary array of 0s.
Without this change, if unspecified context bits are provided to the assembler they are defaulted to 0 and the resulting context is used to filter for valid assembly instructions. After this change unspecified bits are kept as unspecified through the assembly process possibly providing additional valid assembly results.
This change is needed to help support Pickled Canary's support for context.
Submitting on behalf of a colleague: Will R