Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improve handling of u32 operations #1480

Merged
merged 8 commits into from
Sep 7, 2024
Merged

Conversation

bitwalker
Copy link
Contributor

This PR solves two problems:

  1. It ensures that we raise an ExecutionError when inputs to a u32 operation are invalid, rather than attempting the operation with those inputs and potentially panicking, crashing the VM in the process. This has the effect of making the u32 operations stricter in practice, but this has two primary benefits:
    • These errors are handled gracefully, which means you get a useful error (and source locations in the debugger), rather than a crash with no useful information at all.
    • It makes relying on undefined behavior impractical, forcing you to instead ensure that your program is semantically correct. It is one thing to avoid checked operations because you know that your program will never produce invalid values, and quite another to rely on the fact that the current behavior of the VM will allow you to use some invalid values without punishing you. This is no longer the case, and the VM now punishes you for violating the semantics of the instruction set.
  2. It fixes the places where we ourselves were relying on undefined/undocumented behavior of the VM, which was identified as a result of the stricter handling of invalid u32 operands.

See the first commit in the changeset for a more in-depth description of the changes/rationale, but the above is a pretty good summary.

Rationale

Aside from needing to fix the initial issue of certain u32 operations triggering a panic in the VM, it is not apparent to me why we would want to allow invalid operands to these instructions. The fact that we happened to allow it at all is basically arbitrary, and has the effect of allowing bugs to creep into programs very subtly. The only valid reason to make room for undefined behavior at all, is to allow for optimizations to be made based on the invariants of the instruction - optimizations which tend to be wildly unsafe if the invariants are not upheld. We have no such optimizations implemented, nor planned, so I believe we should ensure that any reliance on the currently documented undefined behavior, results in an execution error, to better aid in debugging and validating the correctness of programs.

Alternatives

An alternative approach is to try and promote the behavior we were relying on to "defined behavior". For example, if we think it is imperative to allow $$2^{32}$$ as a valid value for u32 operations, we could explicitly document the behavior of the various operations for that specific value. I should note, as it can be easy to forget, that $$2^{32}$$ is not a valid u32 value, as it requires 33 bits to represent, but it seems to be the dominant reason why we were relying on undefined behavior in our own ops. As you can see from the commits in this PR though, it is not actually necessary for us to do so, though there may be some small overhead of a couple cycles to avoid it.

We could also attempt to make the degree of "strictness" configurable, so that we only produce an ExecutionError if the operation would cause the VM to panic otherwise. I'm not fond of this idea, as it adds additional complexity, for the sole benefit of allowing one to rely on undefined behavior - that doesn't seem wise.


NOTE: I removed test assertions from the u32 operation tests that specifically asserted that those ops did not fail on out-of-bounds operands. Not only does it not make sense for us to be testing undefined behavior in the first place, it suggests that the undocumented behavior under test is relied upon, which we definitely do not want to encourage.

TODO

  • Update documented cycle counts for operations affected by changes (is there a way we can automate this somehow? Sure would be nice if we didn't have to compile that information by hand). I'd use the playground or the VM, but the cycle counts obtained from those include a bunch of irrelevant stuff, so obtaining a precise count is a bit challenging.
  • Update documentation to clarify the behavior of the VM with regard to undefined behavior, namely that the VM will now trap on invalid operands, but that this can change at any time, without any warning.

@bitwalker bitwalker added enhancement New feature or request assembly Related to Miden assembly processor Related to Miden VM processor labels Sep 5, 2024
@bitwalker bitwalker self-assigned this Sep 5, 2024
Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! And thank you for fixing all the bugs. I left some small comments inline. Other than that, once the documentation is updated, we can merge.

In terms of why this was happening in the first place:

  • Trying to use $2^{32}$ in u32 operations is clearly a bug and should not have been happening.
  • Not checking inputs for u32 operations was done intentionally. The main reasoning for this is that AIR constraints don't perform these checks and we wanted to keep execution semantics as close to the verification semantics as possible. But thinking more about this, I'm actually not sure that's a good strategy - maybe it is better to make the executor "stricter" than the verifier so that we can catch more bugs upfront.

One important note on verification: if a program has any kind of undefined behavior, we cannot make any guarantees about validity of generated proofs. So, if a program is using a non-u32 value as an input into a u32 operation, proofs generated for such a program should not be considered valid. Even if we catch errors for such programs with our processor, someone else could build a malicious processor which would generate a proof against the program's MAST root, and such proof may pass verification when it shouldn't.

So, it is actually super important to make sure programs don't have undefined behavior. The main reason why we allowed un-checked inputs for u32 operations was performance - but I'm now wondering if we should make these checks explicitly.

Basically, catching undefined behavior for u32 operations in the processor is a step in the right direction, but it doesn't make u32 operations "safe" from the verifier's standpoint. For this, we should either add explicit checks via U32Assert2 operation or add constraints to the constraint system to check this. Both have non-negligible costs, but the latter approach may be a bit less costly once we have LogUp-GKR integrated into the VM.

assembly/src/assembler/instruction/u32_ops.rs Outdated Show resolved Hide resolved
stdlib/asm/math/u64.masm Show resolved Hide resolved
assembly/src/assembler/instruction/u32_ops.rs Outdated Show resolved Hide resolved
assembly/src/assembler/instruction/u32_ops.rs Outdated Show resolved Hide resolved
assembly/src/assembler/instruction/u32_ops.rs Outdated Show resolved Hide resolved
@bitwalker
Copy link
Contributor Author

Not checking inputs for u32 operations was done intentionally. The main reasoning for this is that AIR constraints don't perform these checks and we wanted to keep execution semantics as close to the verification semantics as possible. But thinking more about this, I'm actually not sure that's a good strategy - maybe it is better to make the executor "stricter" than the verifier so that we can catch more bugs upfront.

Yeah that was more or less my impression on why the operations were unchecked. Likewise, I 100% agree it is better for the executor to be stricter about inputs than the verifier, even if the additional checks are not part of the proof. It at a minimum ensures that a proof generated via our executor, never exhibits undefined behavior.

One important note on verification: if a program has any kind of undefined behavior, we cannot make any guarantees about validity of generated proofs. So, if a program is using a non-u32 value as an input into a u32 operation, proofs generated for such a program should not be considered valid.

Ah, that's interesting - I was aware that being extra strict in the executor would not be reflected in the constraints, and thus not part of the proof; but I don't think I realized that the niches of undefined behavior in Miden's instruction set would corrupt the proof. Intuitively, it makes sense that if you give an invalid u32 value to a u32 operation, that the guarantees of that operation are no longer valid, but I think I presumed that the constraints themselves were operating in the prime field, and thus might permit values outside of the u32 range similar to how the executor was. Obviously it depends on the constraints though.

My main question here then is: is there anything, other than the executor itself, that can tell us if a program generates an invalid proof (where invalid here means "doesn't prove what it says it proves")? If not, then I'd be interested to know if that's a gap we can close. I think one option would be to remove undefined behavior from the instruction set in two parts:

  1. Update the executor to validate that inputs are always valid, and change the documentation to state that these operations will cause a trap if the inputs are not valid. This PR does both of those, so that's easy.
  2. Formalize undefined behavior semantics in Miden, defining it such that if it occurs at runtime, the program immediately traps/aborts. I suspect this means that an extra flag is needed in the trace, which we can call poison or ub, which is set if > u32::MAX value is given to a u32 operation (or more generally, if an input would cause undefined behavior). The output of the operation is considered poison or undef (to use LLVM concepts here), and either:
    a. Program execution terminates on the next cycle where the poison/ub flag is set to 1 (this seems like the most straightforward)
    b. Program execution continues until an attempt to use the poison value is made, at which point it terminates.
    c. Program execution continues normally, but the final proof is flagged as poisoned, so that any attempt to use that proof to prove something, is flagged as invalid. (this also seems like a reasonable option, since it fixes the main issue with undefined behavior you've outlined, which is that the proof is invalidated).

Obviously one can use u32assert or u32assert2 to validate operands for the u32 operations, but that's entirely optional, and people will be inclined to skip them for optimization purposes, if they think they can get away with it. I think we should probably consider how to make undefined behavior illegal (to the extent that it invalidates the generated proof), so that you can't construct a program that appears to execute successfully, but comes with a proof that isn't valid. Adding the extra checks as done in this PR has that effect when using our executor, but as you mentioned, someone could use a different executor without those protections, and nobody would be the wiser.

Anyways, for now we at least will be notified that code we've written contains undefined behavior, so this may be lower priority, but definitely seems like something we should try to address "properly".

@bitwalker bitwalker force-pushed the bitwalker/checked-arithmetic branch 2 times, most recently from e8d99a6 to e8a9951 Compare September 7, 2024 19:01
This commit modifies the way we handle invalid inputs to u32 operations.
The behavior being modified here is documented as undefined, so is not a
breaking change, and as you will see, should be an improvement in
developer experience.

Previously, virtually all of the u32 operations whose implementations
rely on the operands being valid u32 values, would simply cause the VM
to panic if that assumption was violated in an egregious enough fashion,
namely if the operands happened to be large enough to cause the
equivalent operation on u64 operands to fail.

While somewhat limited in impact in practice, this is quite painful if
you have a bug in your program that causes a very large value to be used
as an operand to a u32 operation. Not only would this bring the entire
VM down, it would also prevent you from identifying _why_ it failed,
even if stepping cycle-by-cycle. This primarily impacts the arithmetic
operators, but really just depends on how a given instruction is
implemented.

More problematic, in my opinion, is that while the behavior of the u32
operations with invalid u32 operands is documented as undefined, we were
relying on the undocumented behavior of the VM in various places,
notably in the implementations of some of the more complex bitwise
operations, e.g. clz/clo/rotr. As a result, if we were to ever change
that undocumented behavior, subtle bugs could be introduced. In this
case, we're making the behavior more strict, so we're immediately made
aware that these operations are broken (and thus, fix them), but we
could have easily done something different that would not have been so
easy to detect.

This commit specifically makes most of the u32 ops raise an execution
error if an invalid u32 operand is given to them. This does _not_ make
these checked operations in terms of the VM constraints, but it does
make them behave like checked operations, in so far as they will cause
the VM to trap if an invalid operand is used. This stricter behavior
makes you aware immediately when your program relies on undefined
behavior, and forces you to handle it properly.
@bitwalker
Copy link
Contributor Author

I've added a section to the documentation for u32 operations that describes how we handle undefined behavior currently, and mentions the issue of proof invalidation in the presence of undefined behavior. Let me know if you want me to reword any of it.

@bitwalker
Copy link
Contributor Author

I think I've updated all the cycle counts and relevant docs. Let me know if there is anywhere I missed though

Copy link
Contributor

@bobbinth bobbinth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good! Thank you! (the section on undefined behavior is awesome).

I've pushed a small commit which fixes cycle counts in a few places. As soon as CI passes, I'll merge.

@bobbinth
Copy link
Contributor

bobbinth commented Sep 7, 2024

My main question here then is: is there anything, other than the executor itself, that can tell us if a program generates an invalid proof (where invalid here means "doesn't prove what it says it proves")?

As far as I know, this is currently an unsolved problem. One source of vulnerabilities is the type of behavior we discussed in this issue - i.e., some undefined behavior in the VM which is not covered by constraints. I think we should try to eliminate this even if comes at some performance penalty.

Another source of issues is general use of non-determinism. If there some bug in the program itself where values read from the advice provider are not verified for some reason, it may be possible to make such a program return any value. A potential solution here is severely restrict how non-determinism is used - but this may also impose pretty significant limitations.

But by far the biggest issue is a "missing constraint" issue - i.e., when the description of the VM is under-constrained and so a malicious executor can execute some instructions incorrectly (or update memory state incorrectly or w/e). These types of issues would need to be caught at the constraint level (e.g., in AirScript). One approach is to formally verify all the constraints (e.g., Cairo constraints are formally verified in Lean, I believe) - but even this doesn't provide a 100% guarantee.

One things that I've been thinking about is whether we can add some things to AirScript which would make missing constraint issues much easier to detect.

@bobbinth bobbinth merged commit 2c46076 into next Sep 7, 2024
9 checks passed
@bobbinth bobbinth deleted the bitwalker/checked-arithmetic branch September 7, 2024 22:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
assembly Related to Miden assembly enhancement New feature or request processor Related to Miden VM processor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants