-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Subroutines and Static Jumps for the EVM #3404
Comments
Could you please specify the exact requirements for efficient conversion from stack to register code? I'm asking because I think that this specification does not satisfy such requirements. You can easily convert existing code in the following way: |
Poor man's
|
@chriseth I don't have exact requirements - just a sketch of an algorithm we discussed a while back. But the essentials were that the worst case DFS be linear-time, and that stack depths be the same when control returns to a jumpdest (or call). I think these are the requirements for any analysis wanting to validate or optimize use of the stack. I'm not quite sure what to make of your example - it's a limiting case. But it's not an exact conversion either, as the flow of control will be from the endsub back to the caller. @pirapira I'm not sure I understand your example so early in the morning, but it looks like you are building a ladder of conditionals to replace a single dynamic jump, which will give you are longer program that can still be analysed in linear time. I also see that condition 7 is too strong - it should refer to stack depth relative to the nearest enclosing beginsub. We may also need to prohibit entering a subroutine any way but calling it. |
@gcolvin If you can analyze programs with this gadget in linear time, actually you can also analyze dynamic jumps in linear time. Only one instance of this ladder is necessary in the bytecode, so this gadget consumes (number of JUMPDESTs * 4) instructions. And, jumping into this gadget for free. |
I think you will find that if you apply your gadget to a program with quadratically bad control flow it will become quadratically bigger ;) |
I need to try then. |
Worst case program alternates jumpdest, load from memory, and dynamic jumps. So O(N*N/3). Unpack the code into conditional jumps and you would need N of them for every jump. |
I can reuse the same package for every jump. I posted two files here https://gist.github.com/pirapira/cbd7686860c3ba79aa411d73eac7af73
The translation grows just proportionally. |
You are a very clever man. Is the state of the VM otherwise maintained? Not the PC, obviously, but the stack contents? |
@gcolvin Right, I'm leaving one garbage stack element after every jump. I'll update the example. |
Done, now the code size almost triples. |
@gcolvin if you say that you want the analysis to be done in linear time, then I think it is important to say what kind of analysis you want done. Let's assume the analysis is not possible in linear time, then we are still well of by just forcing the bytecode to already contain the analysis result. We can generate optimized bytecode from that result and if some condition is not satisfied during a jump, we can throw an exception. |
@gcolvin if we say "JUMPDEST must specify the source of the jump", programs are linearly as long as the number of jump edges. |
The practical question being whether Solidtiy can emit code that allows for fast analysis and optimization without these changes? It uses uses dynamic jumps. but in a regular, possible detectable way. |
@chriseth Lots of analyses depends on traversing the control graph, so I'm not trying to specify a particular analysis, just the conditions for traversing the graph in linear time. I could probably express it more clearly in terms of symbolic execution. |
Wouldn't it make more sense separating the discussion of static jumps and subroutines?
Wouldn't |
@axic JUMPDEST is still used for JUMPTO and JUMPTOIF. And dynamic jumps are used by Solidity for subroutines, so we need an alternative. |
Why do you need
How would it be dynamic if there is no dynamic JUMP opcode? |
You don't need it for validation, like you do for dynamic jumps, but it makes analysis easier to know where the destinations are without having to examine every jump. |
@pirapira "JUMPDEST must specify the source of the jump" won't work for subroutines, which can be called from many places. |
@chriseth Went back to my notes to find this sketch of something that could be done with code structured as above. I've edited a little, but it's not a fully worked out idea. Main points are that the code can be symbolically executed (which is a traversal of the control-flow graph) in linear time, and that unless the stack is constrained to have the same depth at basic block boundaries the assignment of stack slots to local variables breaks down. (If stack slots could have more than one type then matching stack depths would be a prerequisite to matching types.) The basic idea is to symbolically execute the EVM code, using DFS to stop cycles.
|
@chriseth One way to phrase the questions would be - |
|
Note that this algorithm does not handle the rules about not nesting subroutines and not jumping in and out of subroutines - just stack alignment. Code for assuring nesting would be easy to add, or it can be handled in the first pass we need to build the JUMPDEST and BEGINSUB tables anyway. Detecting jumps in and out of routines might be harder. I have to wonder, since the stack depth must match, if these jumps are in fact OK for a machine language. When the code jumped into returns it will actually cause the routine that jumped into it to return, and the stacks will remain in a consistent state throughout. So a form of multiple entry points to a routine, or a place for common exit sequences that different routines can branch to, then return. Possibly useful, two less rules to specify, not worth the complications of forbidding? |
@gcolvin: Some suggestions:
Validity:
I would like to add one further "feature":
|
I don't know what you mean by "gas costs of opcodes would be non-local". It sounds like it means that optimizing bytecode is a bad thing. So far as making do with I wasn't trying to prevent Yoichi's construction, just make it unnecessary. The problem isn't so much that we have dynamic jumps but that they are so unconstrained. Yoichi's construction - like JUMPV and JUMPSUBV, and unlike JUMP - doesn't branch to every possible destination unless you want it to. JUMPV and JUMPSUBV are just more clear, (especially for a human coder) with less code and less gas. Are the validity rules you give more correct , or just written with the more complex sub code path concept so as avoid the need for a frame pointer on the return stack? I went with the frame pointer to simplify the rules, and because the implementation would need to maintain it to enforce the rules at runtime anyway. The frame pointer and return stack are not directly accessible, so I think the only visible difference to the programmer is whether RETURNSUB clears off the locals or they have to do so themselves. To get the latter behavior I think all that is needed is to describe RETURNSUB accordingly and add one rule, "The absolute depth of the stack (SP) on execution of a RETURNSUB must be the same as for the corresponding ENTERSUB." |
It's safe to run unvalidated code (if we choose to allow it) only in the sense that it would deterministically fail at runtime the checks it would have failed at validation time. |
I think the feature of "data in code" is independent of this proposal. |
An ENTERSUB executed directly would have to either flow through or be invalid. If it flows through it at least needs to enforce its minimum stack depth - but then where to return to when you hit RETURNSUB? So just being invalid might be cleanest. An empty ENTERSUB at the end would get treated like anything else at the end: as if it had trailing zeros. So it acts like But for |
If we really don't like introducing opcodes it can be made unnecessary as I described above. Estimate the computational complexity of traversing the code as the number of opcodes plus the number of addresses reachable by each JUMP and JUMPI. The latter we can overestimate as the number of JUMPDESTs when the JUMP or JUMPI is not preceded by a PUSH. If the cost of putting code on the blockchain then includes this computational complexity the problem of the unbounded cost of static analysis is limited by the funds needed to incur that cost. |
I want to make a suggestion to replace proposed Also The |
This makes sense as an addition. As a replacement it would hurt performance searching the switch table in the case where a simple vector access would suffice. |
I can't see any useful case for a "simple vector access". |
For switches where the cases are fairly dense you can implement the For other switches the EVM code isn't too bad? And for JUMPSUBV the vector is like a C++ vtable. Gotta run now. |
@chriseth I thought briefly that we had solved the "data in code problem" - neither symbolic nor real execution will traverse unreachable code, so that code could be anything. But that falls afoul of the problem that if an invalid jump lands in the middle of immediate data that just happens to look like a valid jump destination then the execution will go wrong. That's why before we can run code we have to scan it sequentially, skipping over immediate data and building a table of jump destinations. That scan will fail if it runs into data that is not EVM code. However, the sequential scans depend on the assumption that the right amount of immediate data follows each PUSH. So it might not be a worse assumption that static jumps do not land in immediate data. But either assumption is disturbing - it will be possible to have invalid code that passes validation. That is a showstopper for code in an interpreter that assumes it is getting valid code. |
@chrisheth @chfast Pulling this over from Gitter. For JUMPSUBV, the problem is to provide a mechanism that efficiently branches out to a specific set of functions at runtime, like a vtable in c++, or even a callback function. EVM being a closed world every subroutine that a program might actually want to call at any given point can be known in advance, so the idea is that instead of a dynamic jump we can use an index to pick out what routine to call from a table of addresses that can be validated up front. Right now the only place to put a table into EVM code is as immediate data, so for JUMPSUBV that's where it goes. It would be better if - as in wasm and C++ - the tables could go elsewhere. So that would be a another use for the DATA opcode you want Christian, and if we decide to add that I would want a virtual call mechanism that uses it. But failing that, JUMPSUBV is just shorthand for a JUMPV to a set of JUMPSUBs, and I'd gladly drop it from the proposal to keep the proposal minimal. |
As for non-negative frame sizes, they may or may not be absolutely necessary, but the very notion of a negative frame size is strange. It means you have borrowed your callers' stack space. And I'm very nervous about allowing a routine to gobble up as much of its callers' stack frames as it wants (and even call other routines along the way) so long as it refills them with something - anything - before it returns. Especially when it's easy to check for negative sizes at validation time, and when I can't think of what real use they would be. It seems far more likely that negative frame sizes would indicate pretty bad programming errors. |
@chriseth @chfast @pirapira A new draft. |
I think we should be more explicit about the goals of this proposals, e.g.:
|
Thanks, @chriseth. This is much clearer than my statement. |
We are trying to continue editing the document here: https://docs.google.com/document/d/1m3SnoFDXBeXtzhIYJkNUsZDEDsOtn-Dj3jP7jJ1t1EU/edit |
@chriseth @chfast @axic @wanderer @pirapira @Arachnid |
@chfast The validation code is now as you suggested - including the boolean returns, since moving the test made the int returns unnecessary. It does look rather pretty if I may say so myself. |
Wonderful! I reviewed it and commented on some details - I'm already really happy with the document. |
I'm also very happy with the document. We should send it for a review to more peers. I plan to prepare sam example codes that are short enough to be validated mentally using the algorithm. How do you plan to publish it? Maybe we should do a PR this time (not a GitHub issue). |
I'm advised an issue first is still the way to go, PR to follow soon after. Hudson is overhauling the EIP process, so we might be the first to use it. |
Moved to ethereum/EIPs#184 |
Abstract
This EIP introduces new EVM opcodes and conditions on EVM code to support subroutines and static jumps, disallow dynamic jumps, and thereby make EVM code amenable to linear-time static analysis. This will provide for better compilation, interpretation, transpilation, and formal analysis of EVM code.
MOTIVATION
All current implementations of the Ethereum Virtual Machine, including the just-in-time compilers, are much too slow. This proposal identifies a major reason for this and proposes changes to the EVM specification to address the problem.
In particular, it imposes mild restrictions on EVM code and proposes new instructions to help provide for
including eWASM and Solidity
These goals are achieved by
We also propose to validate - in linear time - that EVM contracts correctly use subroutines, avoid misuse of the stack, and meet other safety conditions before placing them on the blockchain. Validated code precludes most runtime exceptions and the need to test for them. And well-behaved control flow and use of the stack makes life easier for interpreters, compilers, formal analysis, and other tools.
THE PROBLEM
Currently the EVM supports dynamic jumps, where the address to jump to is an argument on the stack. These dynamic jumps obscure the structure of the code and thus mostly inhibit control- and data-flow analysis. This puts the quality and speed of optimized compilation fundamentally at odds. Further, since every jump can potentially be to any jump destination in the code, the number of possible paths through the code goes up as the product of the jumps by the number of destinations, as does the time complexity of static analysis. But absent dynamic jumps code can be statically analyzed in linear time.
Static analysis includes validation, and much of optimization, compilation, transpilation, and formal analysis; every part of the tool chain benefits when linear-time analysis is available. In particular, faster control-flow analysis means faster compilation of EVM code to native code, and better data-flow analysis can help the compiler and the interpreter better track the size of the values on the stack and use native 64- and 32-bit operations when possible. Also, conditions which are checked at validation time don’t have to be checked repeatedly at runtime.
Note that analyses of a contract’s bytecode before execution - such as optimizations performed before interpretation, JIT compilation, and on-the-fly machine code generation - must be efficient and linear-time. Otherwise, specially crafted contracts can be used as attack vectors against clients that use static analysis of EVM code before the creation or execution of contracts.
PROPOSAL
We propose to deprecate two existing instructions -
JUMP
andJUMPI
. They take their argument on the stack, which means that unless the value is constant they can jump to anyJUMPDEST
. (In simple cases likePUSH 0 JUMP
the value on the stack can be known to be constant, but in general it's difficult.) We must nonetheless continue to support them in old code.Having deprecated
JUMP
andJUMPI
, we propose new instructions to support their legitimate uses.Preliminaries
These forms
INSTRUCTION x,
INSTRUCTION x, y
INSTRUCTION n, x ...
name instructions with one, two, and two or more arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. The arguments are laid out as immediate data bytes following the opcode. The size and encoding of immediate data fields is open to change. In particular, easily-parsed variable-length encodings might prove useful for bytecode offsets - which are in practice small but in principle can be large.
Primitives
The two most important uses of
JUMP
andJUMPI
are static jumps and return jumps. Conditional and unconditional static jumps are the mainstay of control flow. Return jumps are implemented as a dynamic jump to a return address pushed on the stack. With the combination of a static jump and a dynamic return jump you can - and Solidity does - implement subroutines. The problem is that static analysis cannot tell the one place the return jump is going, so it must analyze every possibility.Static jumps are provided by
JUMPTO jump_target
JUMPTOIF jump_target
which are the same as
JUMP
andJUMPI
except that they jump to an immediatejump_target
, given as four immediate bytes, rather than an address on the stack.To support subroutines,
BEGINSUB
,JUMPSUB
, andRETURNSUB
are provided. Brief descriptions follow, and full semantics are given below.BEGINSUB n_args, n_results
marks the single entry to a subroutine. The minimum frame size at jump to and the required frame size at return from the subroutine are given as one immediate byte each. The bytecode for a subroutine ends at the next
BEGINSUB
instruction (orBEGINDATA
, below) or at the end of the bytecode.JUMPSUB jump_target
jumps to an immediate subroutine address, given as four immediate bytes.
RETURNSUB
returns from the current subroutine to the instruction following the JUMPSUB that entered it.
These five simple instructions form the primitives of the proposal.
Data
In order to validate subroutines, EVM bytecode must be sequentially scanned matching jumps to their destinations. Since creation code has to contain the runtime code as data, that code might not correctly validate in the creation context and also does not have to be validated prior to the execution of the creation code. Because of that, there needs to be a way to place data into the bytecode that will be skipped over and not validated. Such data will prove useful for other purposes as well.
BEGINDATA
specifies that all of the following bytes to the end of the bytecode are data, and not reachable code.
Derivatives
We also propose two more instructions to provide for some complex but common uses of dynamic jumps: jumptables and indirection. We support these with vectors of offsets as immediate data, which can be selected with an index on the stack.
Jumptables are useful for dense switch statements, and are implemented as instructions similar to this one on most CPUs.
JUMPV n, jumpdest ...
jumps to one of a vector of
n
JUMPDEST
offsets via a zero-based index on the stack. The vector is stored inline in the bytecode, MSB-first. If the index greater than or equal ton - 1
the last (default) offset is used.n
is given as four immediate bytes, allJUMPDEST
offsets as four immediate bytes each.Dynamic jumps are used to implement indirection such as virtual functions and callbacks, which take just a couple of pointer dereferences in C.
JUMPSUBV n, beginsub ...
jumps to one of a vector of
n
BEGINSUB
offsets via a zero-based index on the stack. The vector is stored inline in the bytecode, MSB-first. If the index greater than or equal ton - 1
the last (default) offset is used.n
is given as four immediate bytes, then
offsets as four immediate bytes each.JUMPV
andJUMPSUBV
are not strictly necessary. They provide O(1) operations that can be replaced by O(n) or O(log n) EVM code, but that code will be slower, larger and use more gas for things that can and should be fast, small, and cheap, and that are directly supported in WASM with br_table and call_indirect.SEMANTICS
Jumps to and returns from subroutines are described here in terms of
JUMPSUB
andJUMPSUBV
offsets, andWe will adopt the following conventions to describe the machine state:
PC
is (as usual) the byte offset of the currently executing instruction.SP
corresponds to the number of items on the stack - the stack size. As an offset it addresses the current top of the stack of data values, where new items are pushed.FP
is set toSP - n_args
at entry to the currently executing subroutine.SP - FP
, is the frame size.Placing the frame pointer at the beginning of the arguments rather than the end is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal.
Note that frame pointers and the frame pointer stack, being virtual, are only needed for the following descriptions of subroutine semantics, not for their actual implementation. Also, the return stack is internal to the subroutine mechanism, and not directly accessible to the program.
The first instruction of an array of EVM bytecode begins execution of a main routine with no arguments,
SP
andFP
set to 0, and with one value on the return stack -code size - 1
. (Executing the virtual byte of 0 after this offset causes an EVM to stop. Thus executing aRETURNSUB
with no priorJUMPSUB
orJUMBSUBV
- that is, in the main routine - executes aSTOP
.)Execution of a subroutine begins with
JUMPSUB
orJUMPSUBV
, whichPC
on the return stack,FP
on the virtual frame stack,thus suspending execution of the current subroutine, and
FP
toSP - n_args
, andPC
to the specifiedBEGINSUB
address,thus beginning execution of the new subroutine.
(The main routine is not addressable by
JUMPSUB
instructions.)Execution of a subroutine is suspended during and resumed after execution of nested subroutines, and ends upon encountering a
RETURNSUB
, whichFP
to the top of the virtual frame stack and pops the stack, andPC
to top of the return stack and pops the stackPC
to the next instructionthus resuming execution of the enclosing subroutine or main program.
A
STOP or
RETURN` also ends the execution of a subroutine.VALIDITY
We would like consider EVM code valid if and only if no execution of the program can lead to an exceptional halting state. But we must and will validate code in linear time. So our validation does not consider the code’s data and computations, only its control flow and stack use. This means we will reject programs with invalid code paths, even if those paths cannot be executed at runtime. Most conditions can be validated, and will not need to be checked at runtime; the exceptions are sufficient gas and sufficient stack. So some false negatives and runtime checks are the price we pay for linear time validation.
Execution is as defined in the Yellow Paper - a sequence of changes in the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state. The yellow paper defines five such states.
We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.
To handle the return stack we expand the conditions on stack size:
Given our more detailed description of the data stack we restate condition 3 - stack underflow - as
Since the various
DUP
andSWAP
operations are formalized as taking items off the stack and putting them back on, this preventsDUP
andSWAP
from accessing data below the frame pointer, since taking too many items off of the stack would mean thatSP
is less thanFP
.To handle the new jump instructions and subroutine boundaries we expand the conditions on jumps and jump destinations.
We have two new conditions on execution to ensure consistent use of the stack by subroutines:
Finally, we have one condition that prevents pathological uses of the stack:
In practice, we must test at runtime for conditions 1 and 2 - sufficient gas and sufficient stack. We don’t know how much gas there will be, we don’t know how deep a recursion may go, and analysis of stack depth even for non-recursive programs is non-trivial.
All of the remaining conditions we validate statically.
VALIDATION
Validation comprises two tasks:
We sketch out these two validation functions in pseudo-C below. To simplify the presentation only the five primitives are handled (
JUMPV
andJUMPSUBV
would just add more complexity to loop over their vectors), we assume helper functions for extracting instruction arguments from immediate data and managing the stack pointer and program counter, and some optimizations are forgone.Validating jumps
Validating that jumps are to valid addresses takes two sequential passes over the bytecode - one to build sets of jump destinations and subroutine entry points, another to check that addresses jumped to are in the appropriate sets.
Note that code like this is already run by EVMs to check dynamic jumps, including building the jump destination set every time a contract is run, and doing a lookup in the jump destination set before every jump.
Validating subroutines
This function can be seen as a symbolic execution of each subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached.
The basic approach is to start out calling
validate_subroutine(i, 0, 0)
, where i is the first instruction in the EVM code. Withinvalidate_subroutine()
we process instructions sequentially, recursing when JUMP instructions are encountered. When a destination is reached that has been visited before,validate_subroutine()
returns, thus breaking cycles.validate_subroutine()
returns true if the subroutine is valid, false otherwise.COSTS
All of the instructions are O(1) with a low constant, requiring just a few machine operations each, whereas a
JUMP
orJUMPI
must do an O(log n) binary search of an array ofJUMPDEST
offsets before every jump. With the cost ofJUMPI
being high and the cost ofJUMP
being mid, I suggest the cost ofJUMPV
andJUMPSUBV
should bemid
,JUMPSUB
andJUMPTOIF
should below
, andJUMPTO
should beverylow
. Measurement will tell.GETTING THERE FROM HERE
These changes would need to be implemented in phases at decent intervals:
If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new - but without validation. Since we must continue to run old code this is not technically difficult.
Implementation of this proposal need not be difficult, At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only a stack for the return offsets and the few pushes, pops, and assignments described above. JIT code can use native calls. Further optimizations include minimizing runtime checks for exceptions and taking advantage of validated code wherever possible.
The text was updated successfully, but these errors were encountered: