Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subroutines #2484

Merged
merged 7 commits into from
Feb 2, 2020
Merged

subroutines #2484

merged 7 commits into from
Feb 2, 2020

Conversation

gcolvin
Copy link
Contributor

@gcolvin gcolvin commented Jan 22, 2020

Specification

  • JUMPSUB Jumps to the address on top of the stack. This must be a JUMPDEST.
  • RETSUB Returns to the most recently executed JUMPSUB instruction.

Rationale
This is the smallest possible change that provides native subroutines without breaking backwards compatibility...

EIPS/eip-2315.md Outdated

**_JUMPSUB_**

* Jumps to the address on top of the stack. This must be a `JUMPDEST`.
Copy link
Contributor

@holiman holiman Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word address is ambiguous, at least to a casual reader. I'd suggest pc or offset to remove any ambiguity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.

Copy link
Contributor Author

@gcolvin gcolvin Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to:

Jumps to the address on top of the stack, which must be the offset of a JUMPDEST.

'Address and program-counter are used with that meaning in the previous paragraph, so this is a good place for a reminder.

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments and questions

EIPS/eip-2315.md Outdated
* pushes a constant argument before every `JUMP`, `JUMPI` and `JUMPSUB`
* ensures that the _data stack_ is the same size at every execution of each `JUMP`, `JUMPI`, `JUMPSUB` or `JUMPDEST` operation.

is valid and will meet all of the [EIP-615](https://eips.ethereum.org/EIPS/eip-615) safety conditions. In particular, the Yellow paper defines five exceptional halting states.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may be (re the YP), but I'm not sure that is complete. This comes to mind too:

  • State-modification in a STATICCALL context

Also, for bullet 2 below, that should be split into two separate things:

  • More than 1024 stack items
  • More than 1024 call depth

Might also want to clarify that the Insufficient gas is something of a catch-all, where some other errors are defined as "this error behaves as if insufficient gas was provided", e.g. if you try to use RETURNDATACOPY out of bounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm yanking this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's possible to lay out rules for using these ops to meet EIP-615's safety rules. But it's not necessary - and not worth the trouble - to lay them out here.

EIPS/eip-2315.md Outdated
4. Invalid jump destination
5. Invalid instruction

Valid code cannot violate states 3, 4, or 5.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite true, as written.

  • It can violate 4 if the constant being pushed is incorrect,
  • It can violate 3 if the code is simply POP (valid according to the rules above),
  • Same for bullet 5.

Maybe I totally misread what you meant. I interpret it as

Code that adheres to these two rules is 'valid'. Code that is 'valid' can never violate 3,4 or 5.

Maybe you meant

Code that adheres to these two rules and never violates 3,4,5 is valid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm yanking this.

EIPS/eip-2315.md Outdated
Comment on lines 50 to 53
PUSH 4 // before jumpsub: PC = 3
JUMPSUB // after jumpsub: PC = 4
JUMPDEST // before retsub: PC = 4
RETSUB // after retsub: PC = 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with your PC calculation, unless your PUSH is a PUSH2 which doesn't make sense. I suggest explicitly showing the PC and what type of PUSH it is instead of typing "before xx", "after xx"

00: PUSH1 3
02: JUMPSUB
03: JUMPDEST
04: RETSUB

To further clarify, I'd recommend adding a 'trace' aswell, so we see what happens when executing the code aboce. Example:

step, op, stack
0, PUSH1 3 , []
1, JUMPSUB, [3]
2, JUMPDEST, []
3, RETSUB, []
4, JUMPSUB [] ; At this point, it reaches stack-too-shallow, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Do we have a standard syntax for doing this? This one is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I had in mind.

step op       stack
0    PUSH1 3  []
1    JUMPSUB  [3]
4    STOP
2    JUMPDEST []
3    RETSUB   []

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't confuse step with pc, they're separate. step should be be always increasing. If you provide one disassembly with fields pc, op and one trace with step, pc, op, stack that would make it clearer.
And I'm wondering about the intent after the last step. Is the intent that RETSUB lands on the pc one step after the last JUMPSUB?

4 STOP
2 JUMPDEST []
3 RETSUB []
```
Copy link
Contributor

@holiman holiman Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I interpret it.
Program:

pc op
0    PUSH1 3
2    JUMPSUB
3    JUMPDEST
4    RETSUB

Trace:

step pc op       stack
0    0   PUSH1 3  []
1    2   JUMPSUB  [3]
2    3   JUMPDEST []
3    4  RETSUB []
4    2 JUMPSUB [] // shallow stack -- translates to virtual `STOP`

I think I understood it now -- but I don't really like that you break an abstraction here -- previously, stack validation can be done without knowing exactly what the opcode is, only knowing how much it pops and how much it pushes. This EIP means that we'll have special handling on JUMPSUB, which instead of shallow-stack error should be treated as a virtual STOP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see update

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, now I'm confused again. The RETSUB would return to pc=2, the JUMPSUB. And that one would hit shallow stack, but not virtual STOP. So it would end execution on error, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just woke up. other duties. I might not be possible not to break something here. But then I doubt we have a list of every abstraction and assumption made by the current VM.

Copy link
Contributor Author

@gcolvin gcolvin Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three cups of coffee, @holiman. Getting closer?

step pc op       stack
0    1 PUSH1 4  []
1    2 JUMPSUB  [3]
2    4 JUMPDEST []
3    5 RETSUB   []
4    2 JUMPSUB  []
5    3 STOP

Notes:

  • at step 0 PUSH1 advances to 1 = ++PC and push(data_stack, 4)
  • at step 1 JUMPSUB jumps to 4 = PC = pop(data_stack), and push(return_stack, 2)
  • at step 2 JUMPDEST advances to 5 = ++PC
  • at step 3 RETSUB returns to 2 = PC = pop(return_stack)
  • at step 4 JUMPSUB advances to 3 = ++PC
  • at step 5 program STOPs with empty stack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what is confusing here is that in the typical implementation the ++PC is done at the end of the loop. So it's not exactly the case that JUMPSUB does that at step 4, although it is the case that RETSUB pops the return stack to PC at step3. It might be less confusing to say that RETSUB returns to 3 == PC = pop(return_stack) + 1, and adust the text and pseudocode to match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your JUMPSUB [3] should be JUMPSUB[4] or your PUSH1 4 needs a change. Could you also provide the program bytecode for that trace?

The spec for JUMPSUB is

Jumps to the address on top of the stack, which must be the offset of a JUMPDEST.

It does not mention any special casing of how empty stacks are handled, so IMO that means it errors out. The only time you mention a special casing is for the RETSUB:

The virtual byte of 0 at this offset is the EVM STOP opcode, so executing a RETSUB with no prior JUMPSUB executes a STOP. A STOP or RETURN ends the execution of the subroutine and the program.

It might be less confusing to say that RETSUB returns to 3 == PC = pop(return_stack) + 1

Yes, if the intention is to continue after the 'current' JUMPSUB, then that makes a lot more sense.

For extra clarity, I'd suggest making the program maybe

pc   op
0    PUSH1 5
2    JUMPSUB
3    CALLDATA
4    STOP
5    JUMPDEST
6    PUSH1 01
8    POP
9    RETSUB

IIUC, the trace would then go through PCs 0,2,5,6,8,9,3,4. Is that the intention?

## Specification

##### `JUMPSUB`
Jumps to the address on top of the stack, which must be the offset of a `JUMPDEST`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't mention the return address (the instruction after this JUMPSUB) is stored in a call stack.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an implementation detail, so long as RETSUB behaves as specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not exactly if we specify a call stack limit, then that limit must be checked here.

EIPS/eip-2315.md Outdated
Jumps to the address on top of the stack, which must be the offset of a `JUMPDEST`.

##### `RETSUB`
Returns to the most recently executed `JUMPSUB` instruction.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many entries are allowed in the callstack?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1024 is a good number. Must be stated, you are right.

```
## Security Considerations

This proposal introduces no new security considerations to the EVM.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It kind of does, there are a couple of new things:

  • Program flow analysis frameworks need to be update for a new type of branching, since a RETSUB will cause a jump to a destination which is not a JUMPDEST.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could consider introducing the requirement of JUMPDEST after JUMPSUB?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering that, and think it's a good idea. Will simplify things that cause a long, confusing discussion above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, but am not so sure now that it's a good idea. Flow analysis frameworks will already have to change to allow for JUMPSUB, which is new kind of branching. And we still don't know statically where a RETSUB will go, except that it will for sure be the instruction after the most recent JUMPSUB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, since you can't statically check the destination of a RETSUB anyway, it only figures as a basic-block terminator. So it doesn't seem like a big enough change to waste a byte after every JUMPSUB.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't statically check jump destination currently, hence the reason for JUMPDEST which statically provides potential destinations. Following that logic, how is that bad for JUMPSUB?

```
bytecode[code_size]
data_stack[1024]
return_stack[1024]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add

push(return_stack, code_size)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thanks.

@gcolvin gcolvin merged commit fd78af9 into ethereum:master Feb 2, 2020
pizzarob pushed a commit to pizzarob/EIPs that referenced this pull request Jun 12, 2020
* subroutines

* redirect discussions to new magicians thread

* yet more concise

* thanks holiman

* thanks again holiman

* specify stack size, require jumpdest target for retsub, fix testcase and pseudocode
tkstanczak pushed a commit to tkstanczak/EIPs that referenced this pull request Nov 7, 2020
* subroutines

* redirect discussions to new magicians thread

* yet more concise

* thanks holiman

* thanks again holiman

* specify stack size, require jumpdest target for retsub, fix testcase and pseudocode
Arachnid pushed a commit to Arachnid/EIPs that referenced this pull request Mar 6, 2021
* subroutines

* redirect discussions to new magicians thread

* yet more concise

* thanks holiman

* thanks again holiman

* specify stack size, require jumpdest target for retsub, fix testcase and pseudocode
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants