Subroutines and Static Jumps for the EVM #3404

gcolvin · 2016-11-09T02:51:58Z

EIP: ???
Title: Subroutines and Static Jumps for the EVM
Status:
Type: Core
Author: Greg Colvin, Paweł Bylica, Christian Reitwiessner 
Created: 2016-12-10

Abstract

This EIP introduces new EVM opcodes and conditions on EVM code to support subroutines and static jumps, disallow dynamic jumps, and thereby make EVM code amenable to linear-time static analysis. This will provide for better compilation, interpretation, transpilation, and formal analysis of EVM code.

MOTIVATION

All current implementations of the Ethereum Virtual Machine, including the just-in-time compilers, are much too slow. This proposal identifies a major reason for this and proposes changes to the EVM specification to address the problem.

In particular, it imposes mild restrictions on EVM code and proposes new instructions to help provide for

better-optimized compilation to native code
faster interpretation
faster transpilation to eWASM
better compilation from other languages,
including eWASM and Solidity
better formal analysis tools

These goals are achieved by

disallowing dynamic jumps
introducing subroutines - jumps with return support
disallowing pathological control flow and uses of the stack

We also propose to validate - in linear time - that EVM contracts correctly use subroutines, avoid misuse of the stack, and meet other safety conditions before placing them on the blockchain. Validated code precludes most runtime exceptions and the need to test for them. And well-behaved control flow and use of the stack makes life easier for interpreters, compilers, formal analysis, and other tools.

THE PROBLEM

Currently the EVM supports dynamic jumps, where the address to jump to is an argument on the stack. These dynamic jumps obscure the structure of the code and thus mostly inhibit control- and data-flow analysis. This puts the quality and speed of optimized compilation fundamentally at odds. Further, since every jump can potentially be to any jump destination in the code, the number of possible paths through the code goes up as the product of the jumps by the number of destinations, as does the time complexity of static analysis. But absent dynamic jumps code can be statically analyzed in linear time.

Static analysis includes validation, and much of optimization, compilation, transpilation, and formal analysis; every part of the tool chain benefits when linear-time analysis is available. In particular, faster control-flow analysis means faster compilation of EVM code to native code, and better data-flow analysis can help the compiler and the interpreter better track the size of the values on the stack and use native 64- and 32-bit operations when possible. Also, conditions which are checked at validation time don’t have to be checked repeatedly at runtime.

Note that analyses of a contract’s bytecode before execution - such as optimizations performed before interpretation, JIT compilation, and on-the-fly machine code generation - must be efficient and linear-time. Otherwise, specially crafted contracts can be used as attack vectors against clients that use static analysis of EVM code before the creation or execution of contracts.

PROPOSAL

We propose to deprecate two existing instructions - JUMP and JUMPI. They take their argument on the stack, which means that unless the value is constant they can jump to any JUMPDEST. (In simple cases like PUSH 0 JUMP the value on the stack can be known to be constant, but in general it's difficult.) We must nonetheless continue to support them in old code.

Having deprecated JUMP and JUMPI, we propose new instructions to support their legitimate uses.

Preliminaries

These forms

INSTRUCTION x,
INSTRUCTION x, y
INSTRUCTION n, x ...

name instructions with one, two, and two or more arguments, respectively. An instruction is represented in the bytecode as a single-byte opcode. The arguments are laid out as immediate data bytes following the opcode. The size and encoding of immediate data fields is open to change. In particular, easily-parsed variable-length encodings might prove useful for bytecode offsets - which are in practice small but in principle can be large.

Primitives

The two most important uses of JUMP and JUMPI are static jumps and return jumps. Conditional and unconditional static jumps are the mainstay of control flow. Return jumps are implemented as a dynamic jump to a return address pushed on the stack. With the combination of a static jump and a dynamic return jump you can - and Solidity does - implement subroutines. The problem is that static analysis cannot tell the one place the return jump is going, so it must analyze every possibility.

Static jumps are provided by

JUMPTO jump_target
JUMPTOIF jump_target
which are the same as JUMP and JUMPI except that they jump to an immediate jump_target, given as four immediate bytes, rather than an address on the stack.

To support subroutines, BEGINSUB, JUMPSUB, and RETURNSUB are provided. Brief descriptions follow, and full semantics are given below.

BEGINSUB n_args, n_results
marks the single entry to a subroutine. The minimum frame size at jump to and the required frame size at return from the subroutine are given as one immediate byte each. The bytecode for a subroutine ends at the next BEGINSUB instruction (or BEGINDATA, below) or at the end of the bytecode.
JUMPSUB jump_target
jumps to an immediate subroutine address, given as four immediate bytes.
RETURNSUB
returns from the current subroutine to the instruction following the JUMPSUB that entered it.

These five simple instructions form the primitives of the proposal.

Data

In order to validate subroutines, EVM bytecode must be sequentially scanned matching jumps to their destinations. Since creation code has to contain the runtime code as data, that code might not correctly validate in the creation context and also does not have to be validated prior to the execution of the creation code. Because of that, there needs to be a way to place data into the bytecode that will be skipped over and not validated. Such data will prove useful for other purposes as well.

BEGINDATA
specifies that all of the following bytes to the end of the bytecode are data, and not reachable code.

Derivatives

We also propose two more instructions to provide for some complex but common uses of dynamic jumps: jumptables and indirection. We support these with vectors of offsets as immediate data, which can be selected with an index on the stack.

Jumptables are useful for dense switch statements, and are implemented as instructions similar to this one on most CPUs.

JUMPV n, jumpdest ...
jumps to one of a vector of n JUMPDEST offsets via a zero-based index on the stack. The vector is stored inline in the bytecode, MSB-first. If the index greater than or equal to n - 1 the last (default) offset is used. n is given as four immediate bytes, all JUMPDEST offsets as four immediate bytes each.

Dynamic jumps are used to implement indirection such as virtual functions and callbacks, which take just a couple of pointer dereferences in C.

JUMPSUBV n, beginsub ...
jumps to one of a vector of n BEGINSUB offsets via a zero-based index on the stack. The vector is stored inline in the bytecode, MSB-first. If the index greater than or equal to n - 1 the last (default) offset is used. n is given as four immediate bytes, the n offsets as four immediate bytes each.

JUMPV and JUMPSUBV are not strictly necessary. They provide O(1) operations that can be replaced by O(n) or O(log n) EVM code, but that code will be slower, larger and use more gas for things that can and should be fast, small, and cheap, and that are directly supported in WASM with br_table and call_indirect.

SEMANTICS

Jumps to and returns from subroutines are described here in terms of

the EVM data stack, usually just called “the stack”,
a return stack of JUMPSUB and JUMPSUBV offsets, and
a virtual stack of frame pointers (not needed at runtime).

We will adopt the following conventions to describe the machine state:

The program counter PC is (as usual) the byte offset of the currently executing instruction.
The stack pointer SP corresponds to the number of items on the stack - the stack size. As an offset it addresses the current top of the stack of data values, where new items are pushed.
The virtual frame pointer FP is set to SP - n_args at entry to the currently executing subroutine.
The stack items between the frame pointer and the current stack pointer are called the frame.
The current number of items in the frame, SP - FP, is the frame size.

Placing the frame pointer at the beginning of the arguments rather than the end is unconventional, but better fits our stack semantics and simplifies the remainder of the proposal.

Note that frame pointers and the frame pointer stack, being virtual, are only needed for the following descriptions of subroutine semantics, not for their actual implementation. Also, the return stack is internal to the subroutine mechanism, and not directly accessible to the program.

The first instruction of an array of EVM bytecode begins execution of a main routine with no arguments, SP and FP set to 0, and with one value on the return stack - code size - 1. (Executing the virtual byte of 0 after this offset causes an EVM to stop. Thus executing a RETURNSUB with no prior JUMPSUB or JUMBSUBV - that is, in the main routine - executes a STOP.)

Execution of a subroutine begins with JUMPSUB or JUMPSUBV, which

push PC on the return stack,
push FP on the virtual frame stack,
thus suspending execution of the current subroutine, and
set FP to SP - n_args, and
set PC to the specified BEGINSUB address,
thus beginning execution of the new subroutine.
(The main routine is not addressable by JUMPSUB instructions.)

Execution of a subroutine is suspended during and resumed after execution of nested subroutines, and ends upon encountering a RETURNSUB, which

sets FP to the top of the virtual frame stack and pops the stack, and
sets PC to top of the return stack and pops the stack
advances PC to the next instruction
thus resuming execution of the enclosing subroutine or main program.
A STOP or RETURN` also ends the execution of a subroutine.

VALIDITY

We would like consider EVM code valid if and only if no execution of the program can lead to an exceptional halting state. But we must and will validate code in linear time. So our validation does not consider the code’s data and computations, only its control flow and stack use. This means we will reject programs with invalid code paths, even if those paths cannot be executed at runtime. Most conditions can be validated, and will not need to be checked at runtime; the exceptions are sufficient gas and sufficient stack. So some false negatives and runtime checks are the price we pay for linear time validation.

Execution is as defined in the Yellow Paper - a sequence of changes in the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state. The yellow paper defines five such states.

1 Insufficient gas
2 More than 1024 stack items
3 Insufficient stack items
4 Invalid jump destination
5 Invalid instruction

We propose to expand and extend the Yellow Paper conditions to handle the new instructions we propose.

To handle the return stack we expand the conditions on stack size:

2a The size of the data stack does not exceed 1024.
2b The size of the return stack does not exceed 1024.

Given our more detailed description of the data stack we restate condition 3 - stack underflow - as

3 SP must be greater than or equal to FP

Since the various DUP and SWAP operations are formalized as taking items off the stack and putting them back on, this prevents DUP and SWAP from accessing data below the frame pointer, since taking too many items off of the stack would mean that SP is less than FP.

To handle the new jump instructions and subroutine boundaries we expand the conditions on jumps and jump destinations.

4a JUMPTO, JUMPTOIF, and JUMPV address only JUMPDEST instructions.
4b JUMPSUB and JUMPSUBV address only BEGINSUB instructions.
4c JUMP instructions do not address instructions outside of the subroutine they occur in.

We have two new conditions on execution to ensure consistent use of the stack by subroutines:

6 For JUMPSUB and JUMPSUBV the frame size is at least the n_args of the BEGINSUB(s) to jump to.
7 For RETURNSUB the frame size is equal to the n_results of the enclosing BEGINSUB.

Finally, we have one condition that prevents pathological uses of the stack:

8 For every instruction in the code the frame size is constant.

In practice, we must test at runtime for conditions 1 and 2 - sufficient gas and sufficient stack. We don’t know how much gas there will be, we don’t know how deep a recursion may go, and analysis of stack depth even for non-recursive programs is non-trivial.

All of the remaining conditions we validate statically.

VALIDATION

Validation comprises two tasks:

Checking that jump destinations are correct and instructions valid.
Checking that subroutines satisfy the conditions on control flow and stack use.

We sketch out these two validation functions in pseudo-C below. To simplify the presentation only the five primitives are handled (JUMPV and JUMPSUBV would just add more complexity to loop over their vectors), we assume helper functions for extracting instruction arguments from immediate data and managing the stack pointer and program counter, and some optimizations are forgone.

Validating jumps

Validating that jumps are to valid addresses takes two sequential passes over the bytecode - one to build sets of jump destinations and subroutine entry points, another to check that addresses jumped to are in the appropriate sets.

    bytecode[code_size]   // contains EVM bytecode to validate
    is_sub[code_size]     // is there a BEGINSUB at PC?
    is_dest[code_size]    // is there a JUMPDEST at PC?
    sub_for_pc[code_size] // which BEGINSUB is PC in?
    
    bool validate_jumps(PC)
    {
        current_sub = PC

        // build sets of BEGINSUBs and JUMPDESTs
        for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
        {
            if instruction is invalid
                return false
            if instruction is BEGINDATA
                return true
            if instruction is BEGINSUB
                is_sub[PC] = true
                current_sub = PC
                sub_for_pc[PC] = current_sub
            if instruction is JUMPDEST
                is_dest[PC] = true
            sub_for_pc[PC] = current_sub
        }
        
        // check that targets are in subroutine
        for (PC = 0; instruction = bytecode[PC]; PC = advance_pc(PC))
        {
            if instruction is BEGINDATA
                break;
            if instruction is BEGINSUB
                current_sub = PC
            if instruction is JUMPSUB
                if is_sub[jump_target(PC)] is false
                    return false
            if instruction is JUMPTO or JUMPTOIF
                if is_dest[jump_target(PC)] is false
                    return false
            if sub_for_pc[PC] is not current_sub
                return false
       }
        return true
    }

Note that code like this is already run by EVMs to check dynamic jumps, including building the jump destination set every time a contract is run, and doing a lookup in the jump destination set before every jump.

Validating subroutines

This function can be seen as a symbolic execution of each subroutine in the EVM code, where only the effect of the instructions on the state being validated is computed. Thus the structure of this function is very similar to an EVM interpreter. This function can also be seen as an acyclic traversal of the directed graph formed by taking instructions as vertexes and sequential and branching connections as edges, checking conditions along the way. The traversal is accomplished via recursion, and cycles are broken by returning when a vertex which has already been visited is reached.

The basic approach is to start out calling validate_subroutine(i, 0, 0), where i is the first instruction in the EVM code. Within validate_subroutine() we process instructions sequentially, recursing when JUMP instructions are encountered. When a destination is reached that has been visited before, validate_subroutine() returns, thus breaking cycles. validate_subroutine() returns true if the subroutine is valid, false otherwise.

    bytecode[code_size]     // contains EVM bytecode to validate
    frame_size[code_size ]  // is filled with -1

    // we validate each subroutine individually, as if at top level
    // * PC is the offset in the code to start validating at
    // * return_pc is the top PC on return stack that RETURNSUB returns to
    // * at top level FP = 0, so SP is both the frame size and the stack size
    validate_subroutine(PC, return_pc, SP)
    {
        // traverse code sequentially, recurse for jumps
        while true
        {
            instruction = bytecode[PC]

            // if frame size set we have been here before
            if frame_size[PC] >= 0
            {
                // check for constant frame size
                if instruction is JUMPDEST
                    if SP != frame_size[PC]
                        return false

                // return to break cycle
                return true
            }
            frame_size[PC] = SP

            // effect of instruction on stack
            n_removed = removed_items(instructions)
            n_added = added_items(instruction)

            // check for stack underflow
            if SP < n_removed
                return false

            // net effect of removing and adding stack items
            SP -= n_removed
            SP += n_added

            // check for stack overflow
            if SP > 1024
                return false

            if instruction is STOP, RETURN, or SUICIDE
                return true	   

            // violates single entry
            if instruction is BEGINSUB
                 return false

            // return to top or from recursion to JUMPSUB
            if instruction is RETURNSUB
                break;

            if instruction is JUMPSUB
            {
                // check for enough arguments
                sub_pc = jump_target(PC)
                if SP < n_args(sub_pc)
                    return false
                return true
            }

            // reset PC to destination of jump
            if instruction is JUMPTO
            {
                PC = jump_target(PC)
                continue 
            }

            // recurse to jump to code to validate 
            if instruction is JUMPTOIF
            {
                if not validate_subroutine(jump_target(PC), return_pc, SP)
                    return false
            }

            // advance PC according to instruction
            PC = advance_pc(PC)
        }

        // check for right number of results
        if SP != n_results(return_pc)
            return false
        return true
    }

COSTS

All of the instructions are O(1) with a low constant, requiring just a few machine operations each, whereas a JUMP or JUMPI must do an O(log n) binary search of an array of JUMPDEST offsets before every jump. With the cost of JUMPI being high and the cost of JUMP being mid, I suggest the cost of JUMPV and JUMPSUBV should be mid, JUMPSUB and JUMPTOIF should be low, andJUMPTO should be verylow. Measurement will tell.

GETTING THERE FROM HERE

These changes would need to be implemented in phases at decent intervals:

1 If this EIP is accepted, invalid code should be deprecated. Tools should stop generating invalid code, users should stop writing it, and clients should warn about loading it.
2 A later hard fork would require clients to place only valid code on the block chain. Note that despite the fork old EVM code will still need to be supported indefinitely.

If desired, the period of deprecation can be extended indefinitely by continuing to accept code not versioned as new - but without validation. Since we must continue to run old code this is not technically difficult.

Implementation of this proposal need not be difficult, At the least, interpreters can simply be extended with the new opcodes and run unchanged otherwise. The new opcodes require only a stack for the return offsets and the few pushes, pops, and assignments described above. JIT code can use native calls. Further optimizations include minimizing runtime checks for exceptions and taking advantage of validated code wherever possible.

The text was updated successfully, but these errors were encountered:

chriseth · 2016-11-09T08:31:56Z

Could you please specify the exact requirements for efficient conversion from stack to register code? I'm asking because I think that this specification does not satisfy such requirements. You can easily convert existing code in the following way:
every jumpdest is replaced by endsub beginsub, every jump is replaced by call. My impression is that you have to further restrict the stack access during subs.

pirapira · 2016-11-09T10:13:00Z

JUMPTOIF still lets dynamic information leak into the control flow. With this, the deprecated JUMP can be implemented.

Poor man's JUMP:

DUP1
ISZERO
JUMPTOIF 0x00
DUP1
PUSH2 0x0030
EQ
JUMPTOIF 0x0030
DUP1
PUSH2 0x0040
EQ
JUMPTOIF 0x0040
... (for all jump destinations)

gcolvin · 2016-11-09T14:03:44Z

@chriseth I don't have exact requirements - just a sketch of an algorithm we discussed a while back. But the essentials were that the worst case DFS be linear-time, and that stack depths be the same when control returns to a jumpdest (or call). I think these are the requirements for any analysis wanting to validate or optimize use of the stack. I'm not quite sure what to make of your example - it's a limiting case. But it's not an exact conversion either, as the flow of control will be from the endsub back to the caller.

@pirapira I'm not sure I understand your example so early in the morning, but it looks like you are building a ladder of conditionals to replace a single dynamic jump, which will give you are longer program that can still be analysed in linear time.

I also see that condition 7 is too strong - it should refer to stack depth relative to the nearest enclosing beginsub. We may also need to prohibit entering a subroutine any way but calling it.

pirapira · 2016-11-09T14:22:25Z

@gcolvin If you can analyze programs with this gadget in linear time, actually you can also analyze dynamic jumps in linear time. Only one instance of this ladder is necessary in the bytecode, so this gadget consumes (number of JUMPDESTs * 4) instructions. And, jumping into this gadget for free.

gcolvin · 2016-11-09T14:27:36Z

I think you will find that if you apply your gadget to a program with quadratically bad control flow it will become quadratically bigger ;)

pirapira · 2016-11-09T14:29:57Z

I need to try then.

gcolvin · 2016-11-09T14:49:59Z

Worst case program alternates jumpdest, load from memory, and dynamic jumps. So O(N*N/3). Unpack the code into conditional jumps and you would need N of them for every jump.

pirapira · 2016-11-09T14:51:57Z

I can reuse the same package for every jump. I posted two files here https://gist.github.com/pirapira/cbd7686860c3ba79aa411d73eac7af73

original is a program for the current EVM. It keeps jumping to dynamically specified destinations (results of the BALANCE opcode are used as the destination).
translated is a program in the new instruction set, but does the same thing as original. Instead of JUMP, the translated program does JUMPTO [gadget]. The gadget then looks at the stack element and decides where to go.

The translation grows just proportionally.

gcolvin · 2016-11-09T14:56:17Z

You are a very clever man. Is the state of the VM otherwise maintained? Not the PC, obviously, but the stack contents?

pirapira · 2016-11-09T15:00:09Z

@gcolvin Right, I'm leaving one garbage stack element after every jump. I'll update the example.

pirapira · 2016-11-09T15:04:17Z

Done, now the code size almost triples.

chriseth · 2016-11-09T15:35:55Z

@gcolvin if you say that you want the analysis to be done in linear time, then I think it is important to say what kind of analysis you want done. Let's assume the analysis is not possible in linear time, then we are still well of by just forcing the bytecode to already contain the analysis result. We can generate optimized bytecode from that result and if some condition is not satisfied during a jump, we can throw an exception.

pirapira · 2016-11-09T15:43:54Z

@gcolvin if we say "JUMPDEST must specify the source of the jump", programs are linearly as long as the number of jump edges.

gcolvin · 2016-11-09T17:14:50Z

The practical question being whether Solidtiy can emit code that allows for fast analysis and optimization without these changes? It uses uses dynamic jumps. but in a regular, possible detectable way.

gcolvin · 2016-11-09T18:23:47Z

@chriseth Lots of analyses depends on traversing the control graph, so I'm not trying to specify a particular analysis, just the conditions for traversing the graph in linear time. I could probably express it more clearly in terms of symbolic execution.
I don't know what you mean by "forcing the bytecode to already contain the analysis result". If a condition is not met for traversal it is easy to just abandon optimizing that contract. It might be possible to back out of optimizing just a portion of the program. Either way, the code will still run, just more slowly.

axic · 2016-11-09T18:27:24Z

Wouldn't it make more sense separating the discussion of static jumps and subroutines?

Two existing opcodes - JUMP and JUMPI - need to be deprecated.

Wouldn't JUMPDEST also be deprecated in that case?

gcolvin · 2016-11-09T18:58:03Z

@axic JUMPDEST is still used for JUMPTO and JUMPTOIF. And dynamic jumps are used by Solidity for subroutines, so we need an alternative.

axic · 2016-11-09T18:59:31Z

Why do you need JUMPDEST if JUMPTO(IF) contains the jump location as an immediate parameter in the bytecode encoding?

dynamic jumps are used by Solidity for subroutines

How would it be dynamic if there is no dynamic JUMP opcode?

gcolvin · 2016-11-09T19:09:14Z

You don't need it for validation, like you do for dynamic jumps, but it makes analysis easier to know where the destinations are without having to examine every jump.
Solidity uses dynamic jumps now - we'd need subroutines to replace them.

gcolvin · 2016-11-10T01:31:43Z

@pirapira "JUMPDEST must specify the source of the jump" won't work for subroutines, which can be called from many places.

gcolvin · 2016-11-11T04:56:12Z

@chriseth Went back to my notes to find this sketch of something that could be done with code structured as above. I've edited a little, but it's not a fully worked out idea. Main points are that the code can be symbolically executed (which is a traversal of the control-flow graph) in linear time, and that unless the stack is constrained to have the same depth at basic block boundaries the assignment of stack slots to local variables breaks down. (If stack slots could have more than one type then matching stack depths would be a prerequisite to matching types.)

The basic idea is to symbolically execute the EVM code, using DFS to stop cycles.

We have an in-memory stack, which will turn into local variables, and a in-memory constant pool.
We have a symbolic stack that refers to the physical stack or constant pool, and maintains reference counts.
Push instructions find or make an entry in the constant pool and push a reference on the symbolic stack.
Swap and dup instructions swap or copy symbolic references - not items on the physical stack - and increment reference counts, but cause no code to be emitted.
Instructions that use data on the stack and/or put non-constant data onto the stack cause new code to be emitted that references relevant physical stack positions directly - as local variables.
Instructions that would modify swapped or duplicated items (i.e. items with non-zero reference counts) force the swap or copy to happen.
Instructions that move the stack pointer past swapped or duplicated items reduce the reference counts.

gcolvin · 2016-11-11T05:00:05Z

@chriseth One way to phrase the questions would be -
Does this proposal provide everything Solidity needs to generate code?
If so, would all valid Solidity programs be valid EVM programs under the conditions given?

gcolvin · 2016-11-11T16:57:34Z

@chriseth @vbuterin I think this is close to ready for EIPs. I've tried to clarify subroutine semantics. I've reviewed validity conditions with Vitalik. And I've added pseudocode for a validation algorithm that needs review.

gcolvin · 2016-11-11T16:58:13Z

    inititialize machine state PC = SP = FP = 0
    for every instruction
        maintain PC and SP state according to instruction
        if node already exists in tree for this PC
            in case of a JUPMPDEST
                if relative depth of stack (SP - FP) not the same as node's
                    return INVALID
        else
            create a node in tree with current PC, SP, and FP
            in case of a JUMPTO or JUMPTOIF
                recurse to follow jump code
                if recursion returns INVALID
                    return INVALID
            in case of a JUMPSUB _begin_sub_
                save machine state
                FP = SP
                recurse to follow subroutine code
                if recursion returns INVALID
                    return INVALID
                restore machine state
    return VALID

gcolvin · 2016-11-11T22:40:28Z

Note that this algorithm does not handle the rules about not nesting subroutines and not jumping in and out of subroutines - just stack alignment. Code for assuring nesting would be easy to add, or it can be handled in the first pass we need to build the JUMPDEST and BEGINSUB tables anyway. Detecting jumps in and out of routines might be harder. I have to wonder, since the stack depth must match, if these jumps are in fact OK for a machine language. When the code jumped into returns it will actually cause the routine that jumped into it to return, and the stacks will remain in a consistent state throughout. So a form of multiple entry points to a routine, or a place for common exit sequences that different routines can branch to, then return. Possibly useful, two less rules to specify, not worth the complications of forbidding?

gcolvin · 2016-11-14T02:25:52Z

@chriseth @vbuterin @axic @chfast @pirapira @wanderer
The most recently edited issue incorporates feedback so far. I am proposing fewer opcodes and fewer rules, and present a fairly complete validation algorithm.

chriseth · 2016-11-14T21:56:44Z

@gcolvin: Some suggestions:

could make it unnecessary to add 64 bit operations: But then, the gas costs of opcodes would be non-local, not sure if that is worth the savings of adding additional opcodes.
JUMP directly preceded by PUSH: We have to make the PUSH analysis anyway to detect where the next opcode is in the code, so I don't think that is much more expensive. Adding further dynamically sized opcodes like push (namely JUMPTO jumpdest) is another complication. If we decide to keep JUMPTO, the encoding needs to be further specified.
JUMPV and JUMPSUBV are currently not too useful, and I think we can easily live without. If we decide to keep them, the encoding should be further specified.
I assume that ENTERSUB depth means that there is a byte immediately following the ENTERSUB opcode, which is interpreted as depth?
What happens if control flow reaches a sub without explicitly jumping into it using JUMPSUB? For simplicity, I would now allow such code.
FP and SP: Those terms are introduced without much explanation.

Jumps to and returns from subroutines are handled with an implicit stack of return addresses (the "return stack"). JUMPSUB and JUMPSUBV push the address of the instruction following the program counter PC on the return stack and then set the PC address jumped to. RETURNSUB restores the PC to the value saved at the top of the return stack and pops the return stack. Note that the validity conditions below require that the height of the stack at ENTERSUB is equal to the height at the corresponding RETURNSUB, so there is no need to save and restore stack pointers.

Validity:

To be valid, EVM code must contain no JUMP or JUMPI codes (or at least all of them are preceded by PUSH instructions), and must meet these conditions:

1. Only the addresses of ENTERSUB instructions can be used by JUMPSUB or JUMPSUBV.
2. Only the addresses of JUMPDEST instructions can be used by JUMPTO or JUMPIF.

Let a "sub code path" be a path of execution that either starts at the start of the program or any `ENTERSUB` instruction and can take either of the branches in a `JUMPI` or `JUMPV` instruction (taking also branches into account that are impossible given the values of the conditions). The "stack height delta" of a sub code path is the stack height at its end minus the stack height at its start.

3. For any given `JUMPDEST`, any sub code path that ends in it must have the same stack height delta.
4. Any sub code path that ends in the `RETURNSUB` that corresponds to the `ENTERSUB` at its beginning must have zero stack height delta.
5. No sub code path can have a negative stack height delta.
6. The stack height delta of a sub code path cannot exceed 1024.

I do not really see how it is safe to run unvalidated code - either it can be deployed or not.
We should probably talk about the pseudocode on the phone :-)
As @pirapira already mentioned further up, a compiler can generate code with only logarithmic runtime overhead (in the number of possible jump destinations) to simulate a dynamic jump. I don't see how this is prevented by this proposal. If a minimum stack height during the execution of and no change in stack height between start and end of a sub is all we need, we can re-introduce dynamic jumps to subs and throw at compile time if the min stack height is not met.

I would like to add one further "feature":

provide a way to specify code that will not be validated and cannot be jumped into. This might already be part of the spec, depending on how it is interpreted, but I wanted to stress that we need this - for constant data and for constructor arguments. One edge case that might not be covered in the spec is code that ends in ENTERSUB (without depth) or code that contains invalid jumps but is never jumped to.

gcolvin · 2016-11-15T01:18:35Z

I don't know what you mean by "gas costs of opcodes would be non-local". It sounds like it means that optimizing bytecode is a bad thing.

So far as making do with PUSH dest JUMP rather than JUMPTO dest, the same with JUMPI, and not needing JUMPV or JUMPSUBV? That doesn't leave very much of this proposal but ENTERSUB, JUMPSUB, and RETURNSUB, and Solidity is already implementing those too.

I wasn't trying to prevent Yoichi's construction, just make it unnecessary. The problem isn't so much that we have dynamic jumps but that they are so unconstrained. Yoichi's construction - like JUMPV and JUMPSUBV, and unlike JUMP - doesn't branch to every possible destination unless you want it to. JUMPV and JUMPSUBV are just more clear, (especially for a human coder) with less code and less gas.

Are the validity rules you give more correct , or just written with the more complex sub code path concept so as avoid the need for a frame pointer on the return stack? I went with the frame pointer to simplify the rules, and because the implementation would need to maintain it to enforce the rules at runtime anyway. The frame pointer and return stack are not directly accessible, so I think the only visible difference to the programmer is whether RETURNSUB clears off the locals or they have to do so themselves. To get the latter behavior I think all that is needed is to describe RETURNSUB accordingly and add one rule, "The absolute depth of the stack (SP) on execution of a RETURNSUB must be the same as for the corresponding ENTERSUB."
.

gcolvin · 2016-11-15T01:28:27Z

It's safe to run unvalidated code (if we choose to allow it) only in the sense that it would deterministically fail at runtime the checks it would have failed at validation time.

gcolvin · 2016-11-15T01:37:41Z

I think the feature of "data in code" is independent of this proposal.

gcolvin · 2016-11-15T03:48:06Z

An ENTERSUB executed directly would have to either flow through or be invalid. If it flows through it at least needs to enforce its minimum stack depth - but then where to return to when you hit RETURNSUB? So just being invalid might be cleanest.

An empty ENTERSUB at the end would get treated like anything else at the end: as if it had trailing zeros. So it acts like ENTERSUB 0 STOP.

But for ENTERSUB depth and some others I haven't specified how to represent the immediate data. For stack offsets one byte would do, but full two bytes would cover the whole stack. For code offsets four full bytes would probably do, but a variable length scheme might be better.

gcolvin · 2016-11-15T04:22:29Z

If we really don't like introducing opcodes it can be made unnecessary as I described above. Estimate the computational complexity of traversing the code as the number of opcodes plus the number of addresses reachable by each JUMP and JUMPI. The latter we can overestimate as the number of JUMPDESTs when the JUMP or JUMPI is not preceded by a PUSH. If the cost of putting code on the blockchain then includes this computational complexity the problem of the unbounded cost of static analysis is limited by the funds needed to incur that cost.

chfast · 2016-11-17T11:59:34Z

I want to make a suggestion to replace proposed JUMPV with a more generic
JUMPSWITCH arg, (case0, dest0), (case1, dest1), ..., (caseN, destN).
It take single argument from the stack and jumps to the destination which case param matches the argument.
The instruction like this is similar to C switch statement and LLVM switch instruction. It will be also very helpful for solidity function dispatching.

Also JUMPIF can be implemented as JUMPSWITCH (0, else_dest).

The JUMPSUBV should be modified accordingly.

gcolvin · 2016-11-17T17:31:30Z

This makes sense as an addition. As a replacement it would hurt performance searching the switch table in the case where a simple vector access would suffice.

chfast · 2016-11-17T19:39:52Z

I can't see any useful case for a "simple vector access".

gcolvin · 2016-11-17T19:48:44Z

For switches where the cases are fairly dense you can implement the
switch directly as indexing by the case into the vector.

For other switches the EVM code isn't too bad?

And for JUMPSUBV the vector is like a C++ vtable. Gotta run now.

gcolvin · 2016-11-21T06:21:13Z

@chriseth I thought briefly that we had solved the "data in code problem" - neither symbolic nor real execution will traverse unreachable code, so that code could be anything. But that falls afoul of the problem that if an invalid jump lands in the middle of immediate data that just happens to look like a valid jump destination then the execution will go wrong. That's why before we can run code we have to scan it sequentially, skipping over immediate data and building a table of jump destinations. That scan will fail if it runs into data that is not EVM code.

However, the sequential scans depend on the assumption that the right amount of immediate data follows each PUSH. So it might not be a worse assumption that static jumps do not land in immediate data. But either assumption is disturbing - it will be possible to have invalid code that passes validation. That is a showstopper for code in an interpreter that assumes it is getting valid code.

gcolvin · 2016-11-22T05:24:34Z

@chrisheth @chfast Pulling this over from Gitter.

For JUMPSUBV, the problem is to provide a mechanism that efficiently branches out to a specific set of functions at runtime, like a vtable in c++, or even a callback function. EVM being a closed world every subroutine that a program might actually want to call at any given point can be known in advance, so the idea is that instead of a dynamic jump we can use an index to pick out what routine to call from a table of addresses that can be validated up front. Right now the only place to put a table into EVM code is as immediate data, so for JUMPSUBV that's where it goes. It would be better if - as in wasm and C++ - the tables could go elsewhere. So that would be a another use for the DATA opcode you want Christian, and if we decide to add that I would want a virtual call mechanism that uses it.

But failing that, JUMPSUBV is just shorthand for a JUMPV to a set of JUMPSUBs, and I'd gladly drop it from the proposal to keep the proposal minimal.

gcolvin · 2016-11-22T06:15:17Z

As for non-negative frame sizes, they may or may not be absolutely necessary, but the very notion of a negative frame size is strange. It means you have borrowed your callers' stack space. And I'm very nervous about allowing a routine to gobble up as much of its callers' stack frames as it wants (and even call other routines along the way) so long as it refills them with something - anything - before it returns. Especially when it's easy to check for negative sizes at validation time, and when I can't think of what real use they would be. It seems far more likely that negative frame sizes would indicate pretty bad programming errors.

gcolvin · 2016-11-22T08:48:07Z

@chriseth @chfast @pirapira A new draft.
Discussion of near-linear versus quadratic complexity is not quite right - it's more subtle than that. Too late to get into it now, but what's there is better than it was. Main substantive changes are a few more validity conditions and validation pseudocode. Some of the conditions are preexisting. Validation pseudocode is looking more like real code as I clean it up, not sure if I should back off to clear English, or give up and have Yoichi do it for real in 11 lines of ML ;)

chriseth · 2016-11-22T09:35:30Z

I think we should be more explicit about the goals of this proposals, e.g.:

Motivation of this proposal:
 - restrict the EVM to allow for fast compilation to machine code
   (current evmjit provides suboptimal results due to potentially complex jump patterns)
 - provide a VM for compilation from WASM

We achieve these goals by:
1. restricting dynamic jumps to a bare minimum
2. introducing explicit jumpsub and returnsub opcodes for subroutines
3. disallowing pathological use of the stack - subroutines may not
    reduce the stack height, different code paths may not have different stack heights

gcolvin · 2016-11-22T09:56:06Z

Thanks, @chriseth. This is much clearer than my statement.

chriseth · 2016-11-23T17:31:07Z

We are trying to continue editing the document here: https://docs.google.com/document/d/1m3SnoFDXBeXtzhIYJkNUsZDEDsOtn-Dj3jP7jJ1t1EU/edit

gcolvin · 2016-12-06T08:43:52Z

@chriseth @chfast @axic @wanderer @pirapira @Arachnid
New draft up above. Comments welcome on this thread or on the Google doc
https://docs.google.com/a/ethereum.org/document/d/1m3SnoFDXBeXtzhIYJkNUsZDEDsOtn-Dj3jP7jJ1t1EU/edit?usp=sharing

gcolvin · 2016-12-08T07:08:03Z

@chriseth @chfast @axic @wanderer @pirapira @Arachnid
New draft in response to comments. Google doc in same place.
Biggest change is that Pawel and Nick convinced me that multiple entries and jumps into and out of subroutines made static analysis too difficult. And Nick helped show me that graph-theoretic rules can't capture the necessary restrictions. So subroutine code must now be contiguous, spanning from BEGINSUB to the next BEGINSUB (or BEGINDATA), and BEGINDATA starts a data section that runs to end of the file. This makes it very easy to check for jumps being within bounds during the same pass that checks for jumps going to jumpdests.

gcolvin · 2016-12-09T05:42:46Z

@chfast The validation code is now as you suggested - including the boolean returns, since moving the test made the int returns unnecessary. It does look rather pretty if I may say so myself.
@chriseth I think this is ready to move to an EIP issue.
New Google doc started here: https://docs.google.com/document/d/1y6qKjnlQ3zlxWs3s16I_LcSfJVGDE0lQp3I7UZPAxWc/edit

chriseth · 2016-12-09T11:15:19Z

Wonderful! I reviewed it and commented on some details - I'm already really happy with the document.

chfast · 2016-12-09T12:09:09Z

I'm also very happy with the document. We should send it for a review to more peers.

I plan to prepare sam example codes that are short enough to be validated mentally using the algorithm.

How do you plan to publish it? Maybe we should do a PR this time (not a GitHub issue).

gcolvin · 2016-12-09T13:57:29Z

I'm advised an issue first is still the way to go, PR to follow soon after. Hudson is overhauling the EIP process, so we might be the first to use it.

gcolvin · 2016-12-10T03:53:27Z

@chriseth @chfast I added validation code for JUMPV and JUMPSUBV. I think I should take it out, as it doubles the size of validate_jumps() with some hairy logic of its own that obscures the simpler logic around it. I improved the text descriptions of these a bit. Let me know what you think.

gcolvin · 2016-12-13T05:13:22Z

Moved to ethereum/EIPs#184

gcolvin changed the title ~~Static jumps & subroutines for O(n) analysis & optimization of EVM code~~ Static Jumps and Subroutines for the EVM Dec 8, 2016

gcolvin changed the title ~~Static Jumps and Subroutines for the EVM~~ Subroutines and Static Jumps for the EVM Dec 13, 2016

gcolvin closed this as completed Jan 2, 2017

Subroutines and Static Jumps for the EVM #3404

Subroutines and Static Jumps for the EVM #3404

Comments

gcolvin commented Nov 9, 2016 • edited Loading

Abstract

MOTIVATION

THE PROBLEM

PROPOSAL

Preliminaries

Primitives

Data

Derivatives

SEMANTICS

VALIDITY

VALIDATION

Validating jumps

Validating subroutines

COSTS

GETTING THERE FROM HERE

chriseth commented Nov 9, 2016

pirapira commented Nov 9, 2016

gcolvin commented Nov 9, 2016

pirapira commented Nov 9, 2016

gcolvin commented Nov 9, 2016

pirapira commented Nov 9, 2016

gcolvin commented Nov 9, 2016

pirapira commented Nov 9, 2016

gcolvin commented Nov 9, 2016

pirapira commented Nov 9, 2016

pirapira commented Nov 9, 2016

chriseth commented Nov 9, 2016

pirapira commented Nov 9, 2016

gcolvin commented Nov 9, 2016

gcolvin commented Nov 9, 2016

axic commented Nov 9, 2016

gcolvin commented Nov 9, 2016

axic commented Nov 9, 2016 • edited Loading

gcolvin commented Nov 9, 2016 • edited Loading

gcolvin commented Nov 10, 2016

gcolvin commented Nov 11, 2016

gcolvin commented Nov 11, 2016

gcolvin commented Nov 11, 2016

gcolvin commented Nov 11, 2016 • edited Loading

gcolvin commented Nov 11, 2016

gcolvin commented Nov 14, 2016

chriseth commented Nov 14, 2016

gcolvin commented Nov 15, 2016 • edited Loading

gcolvin commented Nov 15, 2016 • edited Loading

gcolvin commented Nov 15, 2016

gcolvin commented Nov 15, 2016 • edited Loading

gcolvin commented Nov 15, 2016

chfast commented Nov 17, 2016

gcolvin commented Nov 17, 2016

chfast commented Nov 17, 2016

gcolvin commented Nov 17, 2016

gcolvin commented Nov 21, 2016

gcolvin commented Nov 22, 2016

gcolvin commented Nov 22, 2016

gcolvin commented Nov 22, 2016

chriseth commented Nov 22, 2016

gcolvin commented Nov 22, 2016

chriseth commented Nov 23, 2016

gcolvin commented Dec 6, 2016

gcolvin commented Dec 8, 2016

gcolvin commented Dec 9, 2016 • edited Loading

chriseth commented Dec 9, 2016

chfast commented Dec 9, 2016

gcolvin commented Dec 9, 2016

gcolvin commented Dec 10, 2016

gcolvin commented Dec 13, 2016

gcolvin commented Nov 9, 2016 •

edited

Loading

axic commented Nov 9, 2016 •

edited

Loading

gcolvin commented Nov 9, 2016 •

edited

Loading

gcolvin commented Nov 11, 2016 •

edited

Loading

gcolvin commented Nov 15, 2016 •

edited

Loading

gcolvin commented Nov 15, 2016 •

edited

Loading

gcolvin commented Nov 15, 2016 •

edited

Loading

gcolvin commented Dec 9, 2016 •

edited

Loading