8.3 KiB
eip | title | description | status | type | category | author | discussions-to | created | requires |
---|---|---|---|---|---|---|---|---|---|
4573 | Procedures for the EVM | Introduces support for EVM Procedures. | Stagnant | Standards Track | Core | Greg Colvin (@gcolvin), Greg Colvin <greg@colvin.org> | https://ethereum-magicians.org/t/eip-4573-named-procedures-for-evm-code-sections/7776 | 2021-12-16 | 2315, 3540, 3670, 3779, 4200 |
Abstract
Five EVM instructions are introduced to define, call, and return from named EVM procedures and access their call frames in memory - ENTERPROC
, LEAVEPROC
, CALLPROC
, RETURNPROC
, and FRAMEADDRESS
.
Motivation
Currently, Ethereum bytecode has no syntactic structure, and subroutines have no defined interfaces.
We propose to add procedures -- delimited blocks of code that can be entered only by calling into them via defined interfaces.
Also, the EVM currently has no automatic management of memory for procedures. So we also propose to automatically reserve call frames on an in-memory stack.
Constraints on the use of procedures must be validated at contract initialization time to maintain the safety properties of EIP-3779: Valid programs will not halt with an exception unless they run out of gas or recursively overflow stack.
Prior Art
The terminology is not well-defined, but we will follow Intel in calling the low-level concept subroutines and the higher level concept procedures. The distinction is that subroutines are little more than a jump that knows where it came from, whereas procedures have a defined interface and manage memory as a stack. EIP-2315 introduces subroutines, and this EIP introduces procedures.
Specification
Instructions
ENTERPROC (0x??) dest_section: uint8, dest_offset: uint8, n_inputs: uint16, n_outputs: uint16, n_locals: uint16
frame_stack.push(FP)
FP -= n_locals * 32
PC +- <length of immediates>
Marks the entry point to a procedure
- at offset
dest_offset
from the beginning of thedest_section
. - taking
n_inputs
arguments from the data stack, - returning
n_outputs
values on thedata stack
, and - reserving
n_locals
words of data in memory on theframe stack
.
Procedures can only be entered via a CALLPROC
to their entry point.
LEAVEPROC (0x??)
FP = frame_stack.pop()
asm RETURNSUB
Pop the
frame stack
and return to the calling procedure usingRETURNSUB
.
Marks the end of a procedure. Each ENTERPROC
requires a closing LEAVEPROC
.
Note: Attempts to jump into a procedure (including its LEAVEPROC
) from outside of the procedure or to jump or step to ENTERPROC
at all must be prevented at validation time. CALLPROC
is the only valid way to enter a procedure.
CALLPROC (0x??) dest_section: uint16, dest_proc: uint16
FP -= n_locals
asm JUMPSUB <offset of section> + <offset of procedure>
Allocate a stack frame and transfer control and
JUMPSUB
to the Nth (N=dest_proc) procedure in the Mth(M=dest_section) section of the code. Section 0 is the current code section, any other code sections are indexed starting at 1.
Note: That the procedure is defined and the required n_inputs
words are available on the data stack
must be shown at validation time.
RETURNPROC (0x??)
FP += n_locals
asm RETURNSUB
Pop the
frame stack
and return control to the calling procedure usingRETURNSUB
.
Note: That the promised n_outputs
words are available on the data stack
must be shown at validation time.
FRAMEADDRESS (0x??) offset: int16
asm PUSH2 FP + offset
Push the address
FP + offset
onto the data stack.
Call frame data is addressed at an immediate offset
relative to FP
.
Typical usage includes storing data on a call frame
PUSH 0xdada
FRAMEADDRESS 32
MSTORE
and loading data from a call frame
FRAMEADDRESS 32
MLOAD
Memory Costs
Presently,MSTORE
is defined as
memory[stack[0]...stack[0]+31] = stack[1]
memory_size = max(memory_size,floor((stack[0]+32)÷32)
- where
memory_size
is the number of active words of memory above 0.
We propose to treat memory addresses as signed, so the formula needs to be
memory[stack[0]...stack[0]+31] = stack[1]
if (stack[0])+32)÷32) < 0
negative_memory_size = max(negative_memory_size,floor((stack[0]+32)÷32))
else
positive_memory_size = max(positive_memory_size,floor((stack[0]+32)÷32))
memory_size = positive_memory_size + negative_memory_size
- where
negative_memory_size
is the number of active words of memory below 0 and - where
positive_memory_size
is the number of active words of memory at or above 0.
Call Frame Stack
These instructions make use of a frame stack
to allocate and free frames of local data for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses. A frame is allocated for each procedure when it is called, and freed when it returns.
Memory can be addressed relative to the frame pointer FP
or by absolute address. FP
starts at 0, and moves downward towards more negative addresses to point to the frame for each CALLPROC
and moving upward towards less negative addresses to point to the previous frame for the corresponding RETURNPROC
.
Equivalently, in the EVM's twos-complement arithmetic, FP
moves from the highest address down, as is common in many calling conventions.
For example, after an initial CALLPROC
to a procedure needing two words of data the frame stack
might look like this
0-> ........
........
FP->
Then, after a further CALLPROC
to a procedure needing three words of data the frame stack
would like this
0-> ........
........
-64-> ........
........
........
FP->
After a RETURNPROC
from that procedure the frame stack
would look like this
0-> ........
........
FP-> ........
........
........
and after a final RETURNPROC
, like this
FP-> ........
........
........
........
........
Rationale
There is actually not much new here. It amounts to EIP-615, refined and refactored into bite-sized pieces, along lines common to other machines.
This proposal uses the EIP-2315 return stack to manage calls and returns, and steals ideas from EIP-615, EIP-3336, and EIP-4200. ENTERPROC
corresponds to BEGINSUB
from EIP-615. Like EIP-615 it uses a frame stack to track call-frame addresses with FP
as procedures are entered and left, but like EIP-3336 and EIP-3337 it moves call frames from the data stack to memory.
Aliasing call frames with ordinary memory supports addressing call-frame data with ordinary stores and loads. This is generally useful, especially for languages like C that provide pointers to variables on the stack.
The design model here is the subroutines and procedures of the Intel x86 architecture.
JUMPSUB
andRETURNSUB
(from EIP-2315 -- likeCALL
andRET
-- jump to and return from subroutines.ENTERPROC
-- likeENTER
-- sets up the stack frame for a procedure.CALLPROC
amounts to aJUMPSUB
to anENTERPROC
.RETURNPROC
amounts to an earlyLEAVEPROC
.LEAVEPROC
-- likeLEAVE
-- takes down the stack frame for a procedure. It then executes aRETURNSUB
.
Backwards Compatibility
This proposal adds new EVM opcodes. It doesn't remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.
Security
Safe use of these constructs must be checked completely at validation time -- per EIP-3779 -- so there should be no security issues at runtime.
ENTERPROC
and LEAVEPROC
must follow the same safety rules as for JUMPSUB
and RETURNSUB
in EIP-2315. In addition, the following constraints must be validated:
- Every
ENTERPROC
must be followed by aLEAVEPROC
to delimit the bodies of procedures. - There can be no nested procedures.
- There can be no jump into the body of a procedure (including its
LEAVEPROC
) from outside of that body. - There can be no jump or step to
BEGINPROC
at all -- onlyCALLPROC
. - The specified
n_inputs
andn_outputs
must be on the stack.
Copyright
Copyright and related rights waived via CC0.