---
eip: 3670
title: EOF - Code Validation
description: Validate EOF bytecode for correctness at the time of deployment.
author: Alex Beregszaszi (@axic), Andrei Maiboroda (@gumb0), Paweł Bylica (@chfast)
discussions-to: https://ethereum-magicians.org/t/eip-3670-eof-code-validation/6693
status: Review
type: Standards Track
category: Core
created: 2021-06-23
requires: 3540
---

## Abstract

Introduce code validation at contract creation time for EOF formatted ([EIP-3540](./eip-3540.md))
contracts. Reject contracts which contain truncated `PUSH`-data or undefined instructions.
Legacy bytecode (code which is not EOF formatted) is unaffected by this change.

## Motivation

Existing contracts are currently deployed without any validation of correctness, and EVM
implementations can decide how they handle truncated bytecode or undefined instructions. This change
aims to bring code validity into consensus, so that it becomes easier to reason about bytecode.
Moreover, EVM implementations may require fewer paths to decide which instruction is valid in
the current execution context.

If new instructions are to be introduced without bumping the EOF version, already deployed
undefined instructions would mean that such contracts could break (since some
instructions would change their behaviour). Refusing to deploy code that contains undefined
instructions allows new instructions to be introduced with or without bumping the EOF version.

### EOF1 forward compatibility

The EOF1 format provides the following forward compatibility properties:

1. New instructions can be defined for previously unassigned opcodes. These instructions may have immediate values.
2. Mandatory EOF sections may be made optional.
3. New optional EOF sections may be introduced. They can be placed in any order in relation to previously defined sections.

## Specification

*Remark:* We rely on the notation of *initcode*, *code* and *creation* as defined by [EIP-3540](./eip-3540.md).

This feature is introduced in the very same block in which EIP-3540 is enabled; therefore every EOF1-compatible bytecode MUST be validated according to these rules.

1. Previously deprecated instructions `CALLCODE` (0xf2) and `SELFDESTRUCT` (0xff) are invalid and their opcodes are undefined.
2. At contract creation time *instructions validation* is performed on both *initcode* and *code*. The code is invalid if any of the checks below fails. For each instruction:
   1. Check if the opcode is defined. The `INVALID` (0xfe) opcode is considered defined.
   2. Check if all of the instruction's immediate bytes are present in the code (the code does not end in the middle of an instruction).

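The two per-instruction checks above can also be written so that they report where validation fails, which may help when debugging a rejected deployment. The following is only an illustrative sketch: the names `VALID_OPCODES` and `first_violation` are hypothetical and not part of this EIP, the opcode set is copied from the reference implementation in this document, and the non-empty requirement on the code section is assumed to be enforced elsewhere.

```python
from typing import Optional, Tuple

# Defined opcodes, copied from this EIP's reference implementation.
# 0xf2 (CALLCODE) and 0xff (SELFDESTRUCT) are deliberately absent.
VALID_OPCODES = frozenset([
    *range(0x00, 0x0b + 1),
    *range(0x10, 0x1d + 1),
    0x20,
    *range(0x30, 0x3f + 1),
    *range(0x40, 0x48 + 1),
    *range(0x50, 0x5b + 1),
    *range(0x60, 0x7f + 1),
    *range(0x80, 0x8f + 1),
    *range(0x90, 0x9f + 1),
    *range(0xa0, 0xa4 + 1),
    0xf0, 0xf1, 0xf3, 0xf4, 0xf5, 0xfa, 0xfd, 0xfe,
])


def first_violation(code: bytes) -> Optional[Tuple[int, str]]:
    """Return (offset, reason) for the first failing check, or None.

    Hypothetical helper: it does not check that the code is non-empty.
    """
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        # Check 1: the opcode must be defined.
        if opcode not in VALID_OPCODES:
            return pos, "undefined opcode"
        # Check 2: PUSH1..PUSH32 (0x60..0x7f) carry 1..32 immediate bytes,
        # all of which must be present in the code.
        size = opcode - 0x5f if 0x60 <= opcode <= 0x7f else 0
        if pos + 1 + size > len(code):
            return pos, "truncated immediate"
        pos += 1 + size
    return None
```

For example, `first_violation(bytes.fromhex("60ff"))` returns `None`, because the byte `0xff` is the immediate of `PUSH1` and is never interpreted as an opcode, while `first_violation(bytes.fromhex("00f2"))` reports an undefined opcode at offset 1.
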
## Rationale

### Immediate data

Allowing implicit zero immediate data for `PUSH` instructions introduces inefficiencies to EVM implementations without any practical use-case (the value of a `PUSH` instruction at the code end cannot be observed by the EVM). This EIP requires all immediate bytes to be explicitly present in the code.

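To make the contrast concrete, the sketch below compares the implicit zero padding that a legacy interpreter has to handle with the explicit-presence check required here. It is a minimal illustration under the assumption that legacy execution treats bytes past the end of the code as zero; the function names `legacy_push_value` and `immediates_present` are hypothetical.

```python
def legacy_push_value(code: bytes, pos: int) -> int:
    """Value a legacy interpreter would push for the PUSHn at code[pos].

    Assumption: bytes past the end of the code read as zero, so a
    truncated immediate is right-padded with zero bytes.
    """
    opcode = code[pos]
    assert 0x60 <= opcode <= 0x7f, "not a PUSH1..PUSH32 opcode"
    size = opcode - 0x60 + 1
    data = code[pos + 1 : pos + 1 + size]  # may be shorter than size
    padded = data.ljust(size, b"\x00")
    return int.from_bytes(padded, "big")


def immediates_present(code: bytes, pos: int) -> bool:
    """Check required by this EIP: all immediate bytes exist in the code."""
    opcode = code[pos]
    size = opcode - 0x60 + 1 if 0x60 <= opcode <= 0x7f else 0
    return pos + 1 + size <= len(code)


truncated = bytes.fromhex("6101")    # PUSH2 with only one immediate byte
complete = bytes.fromhex("610100")   # PUSH2 0x0100, fully spelled out

# Under the padding assumption both encodings push the same value...
assert legacy_push_value(truncated, 0) == legacy_push_value(complete, 0) == 0x0100
# ...but only the explicit encoding passes EOF validation.
assert not immediates_present(truncated, 0)
assert immediates_present(complete, 0)
```

With validation in place, an interpreter for EOF code can read immediates without a bounds check on every `PUSH`, since the padding branch can never be taken.
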
### Rejection of deprecated instructions

The deprecated instructions `CALLCODE` (0xf2) and `SELFDESTRUCT` (0xff) are removed from the `valid_opcodes` list to prevent their use in the future.

## Backwards Compatibility

This change poses no risk to backwards compatibility, because it is introduced at the same time as EIP-3540. The validation does not cover legacy bytecode (code which is not EOF formatted).

## Test Cases

### Contract creation

Each case should be tested for a creation transaction, `CREATE` and `CREATE2`.

- Invalid initcode
- Valid initcode returning invalid code
- Valid initcode returning valid code

### Valid codes

- EOF code containing `INVALID`
- EOF code with data section containing bytes that are undefined instructions
- Legacy code containing undefined instruction
- Legacy code ending with incomplete `PUSH` instruction

### Invalid codes

- EOF code containing undefined instruction
- EOF code ending with incomplete `PUSH` instruction
  - This can include `PUSH` instruction unreachable by execution, e.g. after `STOP`

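Some of the cases above can be expressed as byte-level vectors against the contents of a single code section. The sketch below is illustrative only: the vectors and the names `VALID_OPCODES`, `is_valid_code_section` and `VECTORS` are hypothetical and not official test data, the checker is a compact restatement of the rules in this EIP, and cases that depend on the EOF container (data sections, legacy code) or on the creation flow are out of its scope.

```python
# Defined opcodes, copied from this EIP's reference implementation.
VALID_OPCODES = frozenset([
    *range(0x00, 0x0b + 1), *range(0x10, 0x1d + 1), 0x20,
    *range(0x30, 0x3f + 1), *range(0x40, 0x48 + 1),
    *range(0x50, 0x5b + 1), *range(0x60, 0x7f + 1),
    *range(0x80, 0x8f + 1), *range(0x90, 0x9f + 1),
    *range(0xa0, 0xa4 + 1),
    0xf0, 0xf1, 0xf3, 0xf4, 0xf5, 0xfa, 0xfd, 0xfe,
])


def is_valid_code_section(code: bytes) -> bool:
    """Compact boolean form of the validation rules (hypothetical helper)."""
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        if opcode not in VALID_OPCODES:
            return False
        # PUSH1..PUSH32 (0x60..0x7f) are followed by 1..32 immediate bytes.
        pos += 1 + (opcode - 0x5f if 0x60 <= opcode <= 0x7f else 0)
    # Overshooting the end means the last immediate was truncated.
    return pos == len(code)


# (code section as hex, expected validity, description)
VECTORS = [
    ("fe", True, "INVALID is considered defined"),
    ("600000", True, "PUSH1 0x00; STOP"),
    ("7f" + "00" * 32, True, "PUSH32 with all 32 immediate bytes"),
    ("0c", False, "undefined opcode 0x0c"),
    ("f2", False, "CALLCODE is undefined in EOF code"),
    ("ff", False, "SELFDESTRUCT is undefined in EOF code"),
    ("60", False, "PUSH1 with no immediate byte"),
    ("0060", False, "truncated PUSH1 after STOP (unreachable)"),
    ("7f" + "00" * 31, False, "PUSH32 with only 31 immediate bytes"),
]

for hex_code, expected, description in VECTORS:
    assert is_valid_code_section(bytes.fromhex(hex_code)) == expected, description
```

The `0060` vector corresponds to the unreachable `PUSH` case: validation is purely syntactic, so code after `STOP` is checked like any other code.
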
## Reference Implementation

```python
# Exception type raised by the validator; defined here so the snippet is self-contained.
class ValidationException(Exception):
    pass


# The ranges below are as specified in the Yellow Paper.
# Note: range(s, e) excludes e, hence the +1
valid_opcodes = [
    *range(0x00, 0x0b + 1),
    *range(0x10, 0x1d + 1),
    0x20,
    *range(0x30, 0x3f + 1),
    *range(0x40, 0x48 + 1),
    *range(0x50, 0x5b + 1),
    *range(0x60, 0x6f + 1),
    *range(0x70, 0x7f + 1),
    *range(0x80, 0x8f + 1),
    *range(0x90, 0x9f + 1),
    *range(0xa0, 0xa4 + 1),
    # Note: 0xfe is considered assigned.
    0xf0, 0xf1, 0xf3, 0xf4, 0xf5, 0xfa, 0xfd, 0xfe
]

immediate_sizes = 256 * [0]
immediate_sizes[0x60:0x7f + 1] = range(1, 32 + 1)  # PUSH1..PUSH32


# Raises ValidationException on invalid code
def validate_instructions(code: bytes):
    # Note that EOF1 already asserts this with the code section requirements
    assert len(code) > 0

    pos = 0
    while pos < len(code):
        # Ensure the opcode is valid
        opcode = code[pos]
        if opcode not in valid_opcodes:
            raise ValidationException("undefined opcode")

        # Skip immediate data
        pos += 1 + immediate_sizes[opcode]

    # Ensure last instruction's immediate doesn't go over code end
    if pos != len(code):
        raise ValidationException("truncated immediate")
```

## Security Considerations

See [Security Considerations of EIP-3540](./eip-3540.md#security-considerations).

## Copyright

Copyright and related rights waived via [CC0](../LICENSE.md).