subroutines #2484

Merged
merged 7 commits into from
Feb 2, 2020
78 changes: 78 additions & 0 deletions EIPS/eip-2315.md
@@ -0,0 +1,78 @@
---
eip: 2315
title: Simple Subroutines for the EVM
status: Draft
type: Standards Track
category: Core
author: Greg Colvin
discussions-to: https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941
created: 2019-10-17
---
## Abstract

This proposal introduces two opcodes to support subroutines: `JUMPSUB` and `RETSUB`.

## Motivation

The EVM does not provide subroutines as a primitive. Instead, calls must be synthesized by fetching and pushing the current program counter on the data stack and jumping to the subroutine address; returns must be synthesized by getting the return address to the top of stack and jumping back to it.

## Specification

##### `JUMPSUB`
Jumps to the address on top of the stack, which must be the offset of a `JUMPDEST`.
Member

This doesn't mention the return address (the instruction after this JUMPSUB) is stored in a call stack.

That's an implementation detail, so long as RETSUB behaves as specified.

Member

Well, not exactly: if we specify a call stack limit, then that limit must be checked here.


##### `RETSUB`
Returns to the most recently executed `JUMPSUB` instruction.
Member

How many entries are allowed in the callstack?

1024 is a good number. Must be stated, you are right.


## Rationale

This is the smallest possible change that provides native subroutines without breaking backwards compatibility.

## Backwards Compatibility

These changes do not affect the semantics of existing EVM code.

## Test Cases
```
step op stack
0 PUSH1 3 []
1 JUMPSUB [3]
4 STOP
2 JUMPDEST []
3 RETSUB []
```
Contributor @holiman, Jan 22, 2020

This is how I interpret it.
Program:

pc op
0    PUSH1 3
2    JUMPSUB
3    JUMPDEST
4    RETSUB

Trace:

step pc op       stack
0    0  PUSH1 3  []
1    2  JUMPSUB  [3]
2    3  JUMPDEST []
3    4  RETSUB   []
4    2  JUMPSUB  [] // shallow stack -- translates to virtual `STOP`

I think I understand it now, but I don't really like that you break an abstraction here: previously, stack validation could be done without knowing exactly what the opcode is, only how much it pops and how much it pushes. This EIP means that we'll need special handling for JUMPSUB, which instead of raising a shallow-stack error should be treated as a virtual STOP.

Contributor

see update

Contributor

No, now I'm confused again. The RETSUB would return to pc=2, the JUMPSUB. And that one would hit shallow stack, but not virtual STOP. So it would end execution on error, right?

Contributor Author

Just woke up; other duties. It might not be possible to avoid breaking something here. But then I doubt we have a list of every abstraction and assumption made by the current VM.

Contributor Author @gcolvin, Jan 22, 2020

Three cups of coffee, @holiman. Getting closer?

step pc op       stack
0    1 PUSH1 4  []
1    2 JUMPSUB  [3]
2    4 JUMPDEST []
3    5 RETSUB   []
4    2 JUMPSUB  []
5    3 STOP

Notes:

  • at step 0 PUSH1 advances to 1 = ++PC and push(data_stack, 4)
  • at step 1 JUMPSUB jumps to 4 = PC = pop(data_stack), and push(return_stack, 2)
  • at step 2 JUMPDEST advances to 5 = ++PC
  • at step 3 RETSUB returns to 2 = PC = pop(return_stack)
  • at step 4 JUMPSUB advances to 3 = ++PC
  • at step 5 program STOPs with empty stack

Contributor Author

I think what is confusing here is that in the typical implementation the ++PC is done at the end of the loop. So it's not exactly the case that JUMPSUB does that at step 4, although it is the case that RETSUB pops the return stack to PC at step 3. It might be less confusing to say that RETSUB returns to 3 == PC = pop(return_stack) + 1, and adjust the text and pseudocode to match.

Contributor

Your `JUMPSUB [3]` should be `JUMPSUB [4]`, or your `PUSH1 4` needs to change. Could you also provide the program bytecode for that trace?

The spec for JUMPSUB is

> Jumps to the address on top of the stack, which must be the offset of a JUMPDEST.

It does not mention any special casing of how empty stacks are handled, so IMO that means it errors out. The only time you mention a special casing is for the RETSUB:

> The virtual byte of 0 at this offset is the EVM STOP opcode, so executing a RETSUB with no prior JUMPSUB executes a STOP. A STOP or RETURN ends the execution of the subroutine and the program.

> It might be less confusing to say that RETSUB returns to 3 == PC = pop(return_stack) + 1

Yes, if the intention is to continue after the 'current' JUMPSUB, then that makes a lot more sense.

For extra clarity, I'd suggest making the program maybe

pc   op
0    PUSH1 5
2    JUMPSUB
3    CALLDATA
4    STOP
5    JUMPDEST
6    PUSH1 01
8    POP
9    RETSUB

IIUC, the trace would then go through PCs 0,2,5,6,8,9,3,4. Is that the intention?
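
The program suggested above can be checked with a small interpreter sketch. This is not client code: it assumes the "continue at the instruction after the `JUMPSUB`" reading discussed in this thread, uses the opcode bytes the EIP suggests (`0xbe`, `0xbf`), and substitutes `CALLDATASIZE` for the `CALLDATA` placeholder at PC 3, since no opcode with that exact name exists. The function name `run` and the 1024-entry return-stack check are illustrative assumptions, and the `JUMPDEST` check is simplified (it does not skip push data).

```python
# Opcode bytes; JUMPSUB/RETSUB use the values suggested in the EIP draft.
STOP, CALLDATASIZE, POP, JUMPDEST, PUSH1 = 0x00, 0x36, 0x50, 0x5B, 0x60
JUMPSUB, RETSUB = 0xBE, 0xBF


def run(code, return_stack_limit=1024):
    """Execute a tiny opcode subset and return the list of visited PCs.

    Assumption: JUMPSUB saves PC + 1, so RETSUB resumes after the call site.
    """
    pc = 0
    data_stack = []
    return_stack = []
    visited = []
    while pc < len(code):
        op = code[pc]
        visited.append(pc)
        if op == STOP:
            break
        elif op == PUSH1:
            data_stack.append(code[pc + 1])
            pc += 2
        elif op == POP:
            data_stack.pop()
            pc += 1
        elif op == CALLDATASIZE:
            data_stack.append(0)  # stand-in value; no calldata in this sketch
            pc += 1
        elif op == JUMPDEST:
            pc += 1
        elif op == JUMPSUB:
            if len(return_stack) >= return_stack_limit:
                raise RuntimeError("return stack overflow")
            dest = data_stack.pop()  # raises IndexError on a shallow stack
            if dest >= len(code) or code[dest] != JUMPDEST:
                raise RuntimeError("JUMPSUB target is not a JUMPDEST")
            return_stack.append(pc + 1)
            pc = dest
        elif op == RETSUB:
            pc = return_stack.pop()
        else:
            raise ValueError(f"unsupported opcode {op:#x} at pc {pc}")
    return visited


# PUSH1 5, JUMPSUB, CALLDATASIZE, STOP, JUMPDEST, PUSH1 1, POP, RETSUB
program = bytes([0x60, 0x05, 0xBE, 0x36, 0x00, 0x5B, 0x60, 0x01, 0x50, 0xBF])
print(run(program))
```

Under those assumptions the visited PCs are 0, 2, 5, 6, 8, 9, 3, 4, which matches the sequence proposed in the comment.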

## Implementations

No clients have implemented this proposal as of yet.

The new operators proposed here are implemented by the following pseudocode, which adds cases for `JUMPSUB` and `RETSUB` to a simple loop-and-switch interpreter.
```
bytecode[code_size]
data_stack[1024]
return_stack[1024]
Contributor

Also add

push(return_stack, code_size)

Right, thanks.

while PC < code_size {
   switch opcode = bytecode[PC] {
   ...
   case JUMPSUB:
      push(return_stack, PC)
      PC = pop(data_stack)
   case RETSUB:
      PC = pop(return_stack)
   ...
   }
   ++PC
}
```
Execution of EVM bytecode begins with one value on the return stack—the size of the bytecode. The virtual byte of 0 at this offset is the EVM `STOP` opcode, so executing a `RETSUB` with no prior `JUMPSUB` executes a `STOP`. A `STOP` or `RETURN` ends the execution of the subroutine and the program.
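
As a rough illustration of the seeded return stack, the following Python sketch transcribes the loop literally, including the unconditional `++PC` at the end of each iteration. The function name `run_seeded` is hypothetical, only `PUSH1`, `STOP`, `JUMPSUB` and `RETSUB` are modelled, and every other byte is treated as a no-op, which is a simplification rather than EVM behaviour.

```python
STOP, PUSH1, JUMPDEST = 0x00, 0x60, 0x5B
JUMPSUB, RETSUB = 0xBE, 0xBF  # opcode values suggested in this draft


def run_seeded(code):
    """Literal transcription of the loop-and-switch pseudocode.

    Returns the PCs of executed instructions and the final return stack.
    """
    data_stack = []
    return_stack = [len(code)]  # execution starts with code_size on the return stack
    executed = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        executed.append(pc)
        if op == STOP:
            break
        elif op == PUSH1:
            pc += 1                      # step onto the immediate byte
            data_stack.append(code[pc])
        elif op == JUMPSUB:
            return_stack.append(pc)
            pc = data_stack.pop()
        elif op == RETSUB:
            pc = return_stack.pop()
        # any other opcode is a no-op in this sketch
        pc += 1                          # the trailing ++PC of the pseudocode
    return executed, return_stack


# RETSUB with no prior JUMPSUB: PC becomes code_size and the loop ends.
print(run_seeded(bytes([RETSUB, JUMPDEST])))

# PUSH1 4, JUMPSUB, STOP, JUMPDEST, RETSUB
print(run_seeded(bytes([0x60, 0x04, 0xBE, 0x00, 0x5B, 0xBF])))
```

In this transcription a lone `RETSUB` pops the seeded `code_size` and the loop exits, and in the second program the trailing `++PC` means `RETSUB` resumes one past the saved `JUMPSUB` offset, at the `STOP`.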

We suggest the cost of `JUMPSUB` should be _low_, and `RETSUB` should be _verylow_.
Measurement will tell. We suggest the following opcodes:
```
0xbe JUMPSUB
0xbf RETSUB
```
## Security Considerations

This proposal introduces no new security considerations to the EVM.
Contributor

It kind of does, there are a couple of new things:

  • Program flow analysis frameworks need to be updated for a new type of branching, since a RETSUB will cause a jump to a destination which is not a JUMPDEST.

Member

Could consider introducing the requirement of JUMPDEST after JUMPSUB?

I was considering that, and think it's a good idea. It will simplify the things that caused the long, confusing discussion above.

Contributor Author

I tried that, but am not so sure now that it's a good idea. Flow analysis frameworks will already have to change to allow for JUMPSUB, which is a new kind of branching. And we still don't know statically where a RETSUB will go, except that it will for sure be the instruction after the most recent JUMPSUB.

Contributor Author

That is, since you can't statically check the destination of a RETSUB anyway, it only figures as a basic-block terminator. So it doesn't seem like a big enough change to waste a byte after every JUMPSUB.

Member

You can't statically check jump destinations currently, which is the reason for JUMPDEST: it statically provides the potential destinations. Following that logic, how is that bad for JUMPSUB?
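
To make the static-analysis point concrete, here is a hedged sketch of a single linear pass that collects valid `JUMPDEST` offsets (skipping `PUSH` immediates) and, as an assumed extension for this proposal, the offsets immediately after each `JUMPSUB` where a `RETSUB` could land. The function name `analyze` is illustrative and not taken from any client.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F
JUMPSUB = 0xBE  # opcode value suggested in the EIP draft


def analyze(code):
    """Return (jumpdests, return_sites) found in one pass over the bytecode."""
    jumpdests = set()
    return_sites = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            jumpdests.add(pc)
        elif op == JUMPSUB:
            return_sites.add(pc + 1)  # where a matching RETSUB would resume
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1      # skip the immediate data bytes
        pc += 1
    return jumpdests, return_sites


# PUSH1 5, JUMPSUB, CALLDATASIZE, STOP, JUMPDEST, PUSH1 1, POP, RETSUB
program = bytes([0x60, 0x05, 0xBE, 0x36, 0x00, 0x5B, 0x60, 0x01, 0x50, 0xBF])
print(analyze(program))
```

The return sites are known statically in the same way `JUMPDEST` offsets are. Which of them a given `RETSUB` actually reaches still depends on the run-time return stack.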


**Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).**