Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] An attempt to remove assert from the core specs #2045

Closed
wants to merge 2 commits into from

Conversation

hwwhww
Copy link
Contributor

@hwwhww hwwhww commented Sep 9, 2020

Issue

An attempt to address #1945 and #1797.

Proposed solutions

Thank @ericsson49 for pointing out Nagini. Although I have some concerns about using Nagini directly, we can implement similar declarations without overhauling the spec.

Note that this PR is not meant to formally verify the spec. Dafny team is working on FV spec separately.

Proposed solution

Replace assert with Requires and Ensures helpers. See declarations. The helpers will raise ValidationError if the conditions are not satisfied.

TBD

Pros:

Cons:

  • Compare to throwing the exceptions directly, it's more difficult to debug Pyspec.
  • Need to follow the new pattern carefully.

/cc @franck44

@hwwhww hwwhww added the general:presentation Presentation (as opposed to content) label Sep 9, 2020
@hwwhww hwwhww marked this pull request as ready for review September 9, 2020 19:19
@protolambda
Copy link
Contributor

This is great, and agree assertions are some weird hack, but for testing they do what they do really well, tooling understands the evaluation of them, gives more meaningful exceptions, etc.

If we're going for something like Nagini, I'd rather not redefine the functions in the beacon-spec phase0 doc, and move it to some separate doc, focused at just these types of helper functions. Especially if we want to expand it with invariants, and maybe go into detail in spec assumptions that apply globally (index lookups, no wrap-arounds, no integer under/overflows, etc.).

@djrtwo
Copy link
Contributor

djrtwo commented Mar 10, 2021

What's the status here @hwwhww. Is this something you'd like to still consider?

@ericsson49
Copy link
Contributor

ericsson49 commented Apr 6, 2021

I think it's important to clarify semantics of such assertions/declarations.
They are used extensively in somewhat varying forms, so just writing assert or require may confuse a reader (#1797 is an example).
There can be several confusing aspects. I can identify at least two:

  • "retention policy", where an assertion/declaration should be checked/verified/monitored: runtime, testing/verification, specification (e.g. to communicate developer intentions to reader).
  • "behavior kind" that an assertion/declaration (implicitly) refers to. E.g. normal behavior (typical behavior of a correct software), exceptional behavior (can happen in a correct software, e.g. to handle erroneous input or state) or abnormal behavior (shouldn't happen in a correct software).

Let's consider some examples:

  • assert in traditional S/W engineering: typically intended as a guard against abnormal behavior. E.g. a developer believes that some conditions shouldn't happen but cannot prove it. So, an assert is inserted to notify developers that a counter-example is found. It thus enabled during debug/testing, but may be omitted in production code for performance reasons.
  • assert in formal methods. Quite similar to traditional S/W engineering, assert is used to post condition that should always hold. However, it can also be used to guide a prover or a model checker, however, it's still a condition that shouldn't happen, i.e. refers to abnormal behavior.
  • validating input parameters or state. In order to handle erroneous inputs or state one typically throws an exception or signals a bad status in another way (e.g. return a bad code). It's a part of a correctly behavior software. So, if such a checks is switched off that can make software behave incorrectly. That's especially true for highly dependable software, including BFT protocols.
  • requires in formal methods. It typically means a pre-condition. It's a form of specifying behavior. Though a verification condition generator typically translates it to an assert (at call site) or assume (beginning of a function/method) statement.
  • ensures/exsures in formal methods. It typically means a post-condition. Some F/M tools (e.g. JML, Nagini) explicitly distinguish normal and exceptional behavior - ensures vs exsures. In Dafny, there are no exceptions, however, there is some support for handling failures.
  • require in Solidity. Combines pre-condition, post-condition and input/state validation semantics. I.e. it can be called at the beginning of a method (check pre-condition/input argument/state) and after calling a method - basically, ensuring a post-condition is met.
  • invariants. There can be specific kinds like loop invariants, class invartiants, etc. Typically, an invariant is both a pre- and post-condition. E.g. invariant should hold before invoking a piece of code (loop body or a method) and should be preserved by the piece of code. It may not hold inside the piece of code.

Based on the above and that intention of the proposal differs from verification, I think that it's perhaps best to follow the Solidity semantics of require/assert/revert.
First, it's likely that a reader a familiar with them.
Second, they look like a good match to pyspecs: e.g. methods in pyspecs need both to validate arguments/states and communicate pre-conditions.
Third, they are flexible: require to check/validate pre-conditions and post-conditions. assert to check for internal errors and invariants (it's consistent with how assert is typically used), revert to throw exceptions in general.

They however, can be adapted to pyspecs, based on Nagini declarations:

  • add Ensures/Exsures to communicate (and perhaps enforce pre-conditions, including normal and exceptional behavior)
  • explicit syntax for invariants (Invariant)
  • python assert perhaps in addition to more expressible Assert

I'd summarize proposed semantics:

  • Requires is enforced in runtime and also serves as a way to communicate pre-conditions to formal tools (they can prove that a pre-condition is never triggered). However, Requires may fail during correct software behavior, e.g. if received an erroneous external message. It should be placed at the beginning of a method/function.
  • Optionally, Requires may be used to check post-conditions after invoking some other method/function.
  • Ensures/Exsures is used to communicate post-conditions. Can be verified by formal method tools, e.g. translated to asserts at the end of a method/function and assumes at a call site after invocation of the method. Can be omitted at runtime
  • assert and/or Assert can be used to check for internal errors (which shouldn't happen in a correct software). Translated to assert by formal method tools (can be followed by assume of the same condition). Can be omitted at runtime.
  • Invariant - can be used to communicate conditions that should be true before and after execution of relevant pieces of code. Translated to assert and can be omitted at runtime.
  • raise can be used as Python native way instead of revert. However, Revert may be helpful to hide explicit raise and attach additional connotations.

One concern here is that Requires is proposed to be enforced at runtime (i.e. following the Solidity approach), while assert, 'Ensures, Exsures, Invariantcan be omitted at runtime. That may be a bit confusing. At the same time, duplicating checks makes code less readable. For me, Solidityrequire` looks like a good tradeoff.

There are also some tricky semantic issues, when combining input/state validation semantics with pre-condition semantics. Basically, pre-conditions should hold before a method is invoked. While input/state validating can be interleaved with other computations (including state updates). So, such combined semantics cannot be typically used to communicate pre-conditions to formal method tools, at least directly, without some processing. For example

def on_attestation(store: Store, attestation: Attestation) -> None:
    validate_on_attestation(store, attestation)
    store_target_checkpoint_state(store, attestation.data.target)
    target_state = store.checkpoint_states[attestation.data.target]
    indexed_attestation = get_indexed_attestation(target_state, attestation)
    assert is_valid_indexed_attestation(target_state, indexed_attestation)
    update_latest_messages(store, indexed_attestation.attesting_indices, attestation)

assert is_valid_indexed_attestation(target_state, indexed_attestation) is invoked after modifying state - store_target_checkpoint_state(store, attestation.data.target) (I mean state in general here, not BeaconState).

It would be great if such annotations can be used as an input to formal method tools too. Basically, code annotating is perhaps the biggest problem when applying formal methods. So, reducing efforts and simplifying synchronization of pyspecs and pyspecs verification code is an important problem. However, full annotation can easily render pyspecs unreadable.

@djrtwo
Copy link
Contributor

djrtwo commented Apr 20, 2021

A few questions:

  1. Regarding Ensures -- you state "Can be omitted at runtime". Aren't these statements in our context important for runtime? e.g. Ensuring that an attestation slashing does in fact slash at least one validator, otherwise it is an invalid operation. Or do you see this as an invalid use of Ensure because it is essentially validating the input in some way?
  2. If Ensure could be used for things such as checking the post state_root and not disabled at runtime, what is the advantage of require/assert/revert over Requires/Ensure? Is it just that the solidity semantics might be easy for our readers to understand? If that is the sole driver, I think Requires/Ensures might be clearer and pretty simple to grok.
  3. I see you make the distinction between "checking" post conditions and "communicating" post conditions. What is the practical difference here? I'm a bit confused why you would use one over the other.
  4. If we go with Requires across the board, in essence, we are using a similar structure to assert but we just can't turn it off at runtime (which is good). Right?

@ericsson49
Copy link
Contributor

  1. I see you make the distinction between "checking" post conditions and "communicating" post conditions. What is the practical difference here? I'm a bit confused why you would use one over the other.

Yes, this is the key distinction. Basically, it is about distinguishing run-time and verification-time contexts.
Let's consider a simple example

x: int = ...
assert x*x >= 0

The assert should never fail during runtime. So, turning off the check doesn't change behavior.
While in the example, it's evident that it always holds, there can be more tricky cases.
So, "communicating" a post condition (or some other statement) means that ones wants a prover to try to prove or invalidate it (i.e. find a counter-example). But if one believes the property always holds, the corresponding assert can be turned off.

Another example,

def func(x: int) -> unit64:
  ...

In Python, such annotations are not enforced during run-time. But they can be enforced via a static checker. One can also imagine a Python interpreter which does check them during run-time.
So, they can be seen as a way to communicate pre- and post-conditions to a prover (i.e. the static checker).
x: int means something like Requires(isinstance(x, int) and -> uint64 means something like Ensures(isinstance(result, uint64).

One can imagine more complex example, i.e.

def max(a: int, b: int) -> int:
  Ensures(result >= a and result >= b and (result == a or result == b))
  return a if a >= b else b

So, the post-condition cab be a goal for a prover to verify. One can also use post-conditions (together with pre-conditions) to abstract a method away during symbolic execution. E.g. if there is a call c = max(0,b) it can be replaced with something like

havoc(c)
assume(c >= 0 and c >= b and (c == 0 or c == b))

which basically means "for all c satisfying c >= 0 and c >= b and (c == 0 or c == b)".

It's also true that one can perform verification during runtime - i.e. leave assert's turned on or instrument code with appropriate checks. This also can be considered as "communicating" conditions in the sense that they are used in an instrumented/debug version (which acts as a "prover" here), but can be in principle omitted in a production or an optimized version. Which is typically how asserts are used.

In general, however, not every condition can be checked during run-time, but some of them can be checked symbolically. For example. forall x: int, ... means one should be able to enumerate all ints during runtime, which is not possible.

So, the idea basically is to be able to communicate some stuff to provers (of various kinds, including human inspectors, runtime verifiers, static analysis tools or full blown formal methods), while not being obliged to keep it during runtime.

@ericsson49
Copy link
Contributor

  1. If Ensure could be used for things such as checking the post state_root and not disabled at runtime, what is the advantage of require/assert/revert over Requires/Ensure? Is it just that the solidity semantics might be easy for our readers to understand? If that is the sole driver, I think Requires/Ensures might be clearer and pretty simple to grok.

There are advantages (like more expressive conditions, e.g. the forall stuff I've mentioned before and faster code, if one can legally turn off some checks).
In the case of the issue, I'd rather say that the only driver is " the solidity semantics might be easy for our readers to understand".

  1. If we go with Requires across the board, in essence, we are using a similar structure to assert but we just can't turn it off at runtime (which is good). Right?

Yes. However, there are semantic differences too:

  • require in solidity validate input parameters
  • Requires (e.g. in the context of formal methods) refers to a pre-condition statement, which should be true before the method begins
  • assert is typically expected to check that bad things do not happen in a correctly behaving software.

@ericsson49
Copy link
Contributor

  1. Regarding Ensures -- you state "Can be omitted at runtime". Aren't these statements in our context important for runtime? e.g. Ensuring that an attestation slashing does in fact slash at least one validator, otherwise it is an invalid operation. Or do you see this as an invalid use of Ensure because it is essentially validating the input in some way?

The latter one. Ensures in Nagini specifies a post-condition. So, using it to validate the input will confuse formal method guys as the actual post-condition is different (should state something like "bad validators are slashed and the rest stuff stays the same" in the case of process_attester_slashing).

@ericsson49
Copy link
Contributor

ericsson49 commented Apr 21, 2021

  1. If we go with Requires across the board, in essence, we are using a similar structure to assert but we just can't turn it off at runtime (which is good). Right?

I'd also add that the simplest change would be to replace asserts with Requires and specify Requires semantics.
As far as I remember, asserts are used in the pyspecs only to validate input.

In this case, I'd also explicitly clarify that:

  • Requires is used to validate input
  • shouldn't be turned off during runtime
  • if its condition is not satisfied, prior state modifications should be reverted (e.g. see discussion in Unclear atomicity semantics of the beacon chain spec #1409)
  • since it's used to validate input, one should be careful when treating it as a pre-condition.

@ericsson49
Copy link
Contributor

Re-thinking the discussion, I have a suggestion to split the PR in two parts:

The rationale is that the issues are rather orthogonal (though related) and can be implemented separately. Additionally, there is a confusion between expressing pre-conditions and input validation logic, discussed above. They are indeed quire similar, but distinct things.

I assume that most of asserts in pyspecs are used to validate inputs. Then the solution to #1797 is to introduce some "word" and replace such asserts with the invocation of the "word" (e.g. Requires). And attach a clarification to the "word" (e.g. something like I sketched above).

A solution to #1945 is more involved, since one need to write actual pre-conditions for methods in addition to inventing appropriate words for expressing pre/post-conditions.

There is some interference between two aspects: Requires and Ensures are perhaps more closely associated with pre-/post-conditions (e.g. Nagini and Stainless are using them to express pre- and post-conditions), which means that using Requires as an assert replacement (with intention of expressing input validation logic) can become another source of confusion.
That's why, in my opinion, it's better to invent another "word" for resolving #1797. it's of course a matter of taste. However, if Requires is used to refer to input validation logic, then one has to invent a new "word" to refer to pre-conditions :).

@hwwhww
Copy link
Contributor Author

hwwhww commented Apr 27, 2021

I have a suggestion to split the PR in two parts

Agreed. 👍 I think it’s good to start with addressing #1797 first.

We can add a "message" field, e.g., Requires(assertion_condition: bool, message: str) -> None, for improving both pyspec debugging experience & readability.

In that case, naming it require as the Solidity style require sounds good to me since it's more understandable for Ethereum devs.

That's why, in my opinion, it's better to invent another "word" for resolving #1797.

Right, understand. I'm open to other new ideas. :)
What do you think about I start to rework the PR with require, and in the meantime, let's keep figuring out the name?

@ericsson49
Copy link
Contributor

ericsson49 commented Apr 27, 2021

In that case, naming it require as the Solidity style require sounds good to me since it's more understandable for Ethereum devs.

I personally like require too.

What do you think about I start to rework the PR with require, and in the meantime, let's keep figuring out the name?

👍 Choosing wording is more like a matter of taste. I don't think it should stop us from going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:presentation Presentation (as opposed to content)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants