196 lines
8.3 KiB
Markdown
196 lines
8.3 KiB
Markdown
---
|
|
eip: 4573
|
|
title: Procedures for the EVM
|
|
description: Introduces support for EVM Procedures.
|
|
status: Stagnant
|
|
type: Standards Track
|
|
category: Core
|
|
author: Greg Colvin (@gcolvin), Greg Colvin <greg@colvin.org>
|
|
discussions-to: https://ethereum-magicians.org/t/eip-4573-named-procedures-for-evm-code-sections/7776
|
|
created: 2021-12-16
|
|
requires: 2315, 3540, 3670, 3779, 4200
|
|
---
|
|
|
|
## Abstract
|
|
|
|
Five EVM instructions are introduced to define, call, and return from named EVM _procedures_ and access their _call frames_ in memory - `ENTERPROC`, `LEAVEPROC`, `CALLPROC`, `RETURNPROC`, and `FRAMEADDRESS`.
|
|
|
|
## Motivation
|
|
|
|
Currently, Ethereum bytecode has no syntactic structure, and _subroutines_ have no defined interfaces.
|
|
|
|
We propose to add _procedures_ -- delimited blocks of code that can be entered only by calling into them via defined interfaces.
|
|
|
|
Also, the EVM currently has no automatic management of memory for _procedures_. So we also propose to automatically reserve call frames on an in-memory stack.
|
|
|
|
Constraints on the use of _procedures_ must be validated at contract initialization time to maintain the safety properties of [EIP-3779](./eip-3779.md): Valid programs will not halt with an exception unless they run out of gas or recursively overflow stack.
|
|
|
|
### Prior Art
|
|
|
|
The terminology is not well-defined, but we will follow Intel in calling the low-level concept _subroutines_ and the higher level concept _procedures_. The distinction is that _subroutines_ are little more than a jump that knows where it came from, whereas procedures have a defined interface and manage memory as a stack. [EIP-2315](./eip-2315.md) introduces _subroutines_, and this EIP introduces _procedures_.
|
|
|
|
## Specification
|
|
|
|
### Instructions
|
|
|
|
#### ENTERPROC (0x??) dest_section: uint8, dest_offset: uint8, n_inputs: uint16, n_outputs: uint16, n_locals: uint16
|
|
```
|
|
frame_stack.push(FP)
|
|
FP -= n_locals * 32
|
|
PC +- <length of immediates>
|
|
```
|
|
Marks the entry point to a procedure
|
|
* at offset `dest_offset` from the beginning of the `dest_section`.
|
|
* taking `n_inputs` arguments from the data stack,
|
|
* returning `n_outputs` values on the `data stack`, and
|
|
* reserving `n_locals` words of data in memory on the `frame stack`.
|
|
|
|
Procedures can only be entered via a `CALLPROC` to their entry point.
|
|
|
|
#### LEAVEPROC (0x??)
|
|
|
|
```
|
|
FP = frame_stack.pop()
|
|
asm RETURNSUB
|
|
```
|
|
> Pop the `frame stack` and return to the calling procedure using `RETURNSUB`.
|
|
|
|
Marks the end of a procedure. Each `ENTERPROC` requires a closing `LEAVEPROC`.
|
|
|
|
*Note: Attempts to jump into a procedure (including its `LEAVEPROC`) from outside of the procedure or to jump or step to `ENTERPROC` at all must be prevented at validation time. `CALLPROC` is the only valid way to enter a procedure.*
|
|
|
|
#### CALLPROC (0x??) dest_section: uint16, dest_proc: uint16
|
|
```
|
|
FP -= n_locals
|
|
asm JUMPSUB <offset of section> + <offset of procedure>
|
|
```
|
|
> Allocate a *stack frame* and transfer control and `JUMPSUB` to the Nth (N=*dest_proc*) _procedure_ in the Mth(M=*dest_section*) _section_ of the code. _Section 0_ is the current code section, any other code sections are indexed starting at _1_.
|
|
|
|
*Note: That the procedure is defined and the required `n_inputs` words are available on the `data stack` must be shown at validation time.*
|
|
|
|
#### RETURNPROC (0x??)
|
|
```
|
|
FP += n_locals
|
|
asm RETURNSUB
|
|
```
|
|
> Pop the `frame stack` and return control to the calling procedure using `RETURNSUB`.
|
|
|
|
*Note: That the promised `n_outputs` words are available on the `data stack` must be shown at validation time.*
|
|
|
|
#### FRAMEADDRESS (0x??) offset: int16
|
|
```
|
|
asm PUSH2 FP + offset
|
|
```
|
|
> Push the address `FP + offset` onto the data stack.
|
|
|
|
Call frame data is addressed at an immediate `offset` relative to `FP`.
|
|
|
|
Typical usage includes storing data on a call frame
|
|
```
|
|
PUSH 0xdada
|
|
FRAMEADDRESS 32
|
|
MSTORE
|
|
```
|
|
and loading data from a call frame
|
|
```
|
|
FRAMEADDRESS 32
|
|
MLOAD
|
|
```
|
|
|
|
### Memory Costs
|
|
|
|
Presently,`MSTORE` is defined as
|
|
```
|
|
memory[stack[0]...stack[0]+31] = stack[1]
|
|
memory_size = max(memory_size,floor((stack[0]+32)÷32)
|
|
```
|
|
* where `memory_size` is the number of active words of memory above _0_.
|
|
|
|
We propose to treat memory addresses as signed, so the formula needs to be
|
|
```
|
|
memory[stack[0]...stack[0]+31] = stack[1]
|
|
if (stack[0])+32)÷32) < 0
|
|
negative_memory_size = max(negative_memory_size,floor((stack[0]+32)÷32))
|
|
else
|
|
positive_memory_size = max(positive_memory_size,floor((stack[0]+32)÷32))
|
|
memory_size = positive_memory_size + negative_memory_size
|
|
```
|
|
* where `negative_memory_size` is the number of active words of memory below _0_ and
|
|
* where `positive_memory_size` is the number of active words of memory at or above _0_.
|
|
|
|
### Call Frame Stack
|
|
|
|
These instructions make use of a `frame stack` to allocate and free frames of local data for _procedures_ in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses. A frame is allocated for each procedure when it is called, and freed when it returns.
|
|
|
|
Memory can be addressed relative to the frame pointer `FP` or by absolute address. `FP` starts at 0, and moves downward towards more negative addresses to point to the frame for each `CALLPROC` and moving upward towards less negative addresses to point to the previous frame for the corresponding `RETURNPROC`.
|
|
|
|
Equivalently, in the EVM's twos-complement arithmetic, `FP` moves from the highest address down, as is common in many calling conventions.
|
|
|
|
For example, after an initial `CALLPROC` to a procedure needing two words of data the `frame stack` might look like this
|
|
|
|
```
|
|
0-> ........
|
|
........
|
|
FP->
|
|
```
|
|
Then, after a further `CALLPROC` to a procedure needing three words of data the `frame stack` would like this
|
|
|
|
```
|
|
0-> ........
|
|
........
|
|
-64-> ........
|
|
........
|
|
........
|
|
FP->
|
|
```
|
|
After a `RETURNPROC` from that procedure the `frame stack` would look like this
|
|
```
|
|
0-> ........
|
|
........
|
|
FP-> ........
|
|
........
|
|
........
|
|
```
|
|
and after a final `RETURNPROC`, like this
|
|
```
|
|
FP-> ........
|
|
........
|
|
........
|
|
........
|
|
........
|
|
```
|
|
|
|
## Rationale
|
|
|
|
There is actually not much new here. It amounts to [EIP-615](./eip-615.md), refined and refactored into bite-sized pieces, along lines common to other machines.
|
|
|
|
This proposal uses the [EIP-2315](./eip-2315.md) return stack to manage calls and returns, and steals ideas from [EIP-615](./eip-615.md), [EIP-3336](./eip-3336.md), and [EIP-4200](./eip-4200.md). `ENTERPROC` corresponds to `BEGINSUB` from EIP-615. Like EIP-615 it uses a frame stack to track call-frame addresses with `FP` as _procedures_ are entered and left, but like EIP-3336 and EIP-3337 it moves call frames from the data stack to memory.
|
|
|
|
Aliasing call frames with ordinary memory supports addressing call-frame data with ordinary stores and loads. This is generally useful, especially for languages like C that provide pointers to variables on the stack.
|
|
|
|
The design model here is the _subroutines_ and _procedures_ of the Intel x86 architecture.
|
|
* `JUMPSUB` and `RETURNSUB` (from [EIP-2315](./eip-2315.md) -- like `CALL` and `RET` -- jump to and return from _subroutines_.
|
|
* `ENTERPROC` -- like `ENTER` -- sets up the stack frame for a _procedure_.
|
|
* `CALLPROC` amounts to a `JUMPSUB` to an `ENTERPROC`.
|
|
* `RETURNPROC` amounts to an early `LEAVEPROC`.
|
|
* `LEAVEPROC` -- like `LEAVE` -- takes down the stack frame for a _procedure_. It then executes a `RETURNSUB`.
|
|
|
|
## Backwards Compatibility
|
|
|
|
This proposal adds new EVM opcodes. It doesn't remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.
|
|
|
|
## Security
|
|
|
|
Safe use of these constructs must be checked completely at validation time -- per EIP-3779 -- so there should be no security issues at runtime.
|
|
|
|
`ENTERPROC` and `LEAVEPROC` must follow the same safety rules as for `JUMPSUB` and `RETURNSUB` in EIP-2315. In addition, the following constraints must be validated:
|
|
|
|
* Every`ENTERPROC` must be followed by a `LEAVEPROC` to delimit the bodies of _procedures_.
|
|
* There can be no nested _procedures_.
|
|
* There can be no jump into the body of a procedure (including its `LEAVEPROC`) from outside of that body.
|
|
* There can be no jump or step to `BEGINPROC` at all -- only `CALLPROC`.
|
|
* The specified `n_inputs` and `n_outputs` must be on the stack.
|
|
|
|
## Copyright
|
|
Copyright and related rights waived via [CC0](../LICENSE.md).
|