DCIPs/EIPS/eip-3670.md

141 lines
5.3 KiB
Markdown

---
eip: 3670
title: EOF - Code Validation
description: Validate EOF bytecode for correctness at the time of deployment.
author: Alex Beregszaszi (@axic), Andrei Maiboroda (@gumb0), Paweł Bylica (@chfast)
discussions-to: https://ethereum-magicians.org/t/eip-3670-eof-code-validation/6693
status: Review
type: Standards Track
category: Core
created: 2021-06-23
requires: 3540
---
## Abstract
Introduce code validation at contract creation time for EOF formatted ([EIP-3540](./eip-3540.md))
contracts. Reject contracts which contain truncated `PUSH`-data or undefined instructions.
Legacy bytecode (code which is not EOF formatted) is unaffected by this change.
## Motivation
Currently existing contracts require no validation of correctness and EVM implementations can decide
how they handle truncated bytecode or undefined instructions. This change aims to bring code
validity into consensus, so that it becomes easier to reason about bytecode.
Moreover, EVM implementations may require fewer paths to decide which instruction is valid in
the current execution context.
If it will be desired to introduce new instructions without bumping EOF version, having undefined
instructions already deployed would mean such contracts potentially can be broken (since some
instructions are changing their behaviour). Rejecting to deploy undefined instructions allows
introducing new instructions with or without bumping the EOF version.
### EOF1 forward compatibility
The EOF1 format provides following forward compatibility properties:
1. New instructions can be defined for previously unassigned opcodes. These instructions may have immediate values.
2. Mandatory EOF sections may be made optional.
3. New optional EOF sections may be introduced. They can be placed in any order in relation to previously defined sections.
## Specification
*Remark:* We rely on the notation of *initcode*, *code* and *creation* as defined by [EIP-3540](./eip-3540.md).
This feature is introduced on the very same block EIP-3540 is enabled, therefore every EOF1-compatible bytecode MUST be validated according to these rules.
1. Previously deprecated instructions `CALLCODE` (0xf2) and `SELFDESTRUCT` (0xff) are invalid and their opcodes are undefined.
2. At contract creation time *instructions validation* is performed on both *initcode* and *code*. The code is invalid if any of the checks below fails. For each instruction:
1. Check if the opcode is defined. The `INVALID` (0xfe) is considered defined.
2. Check if all instructions' immediate bytes are present in the code (code does not end in the middle of instruction).
## Rationale
### Immediate data
Allowing implicit zero immediate data for `PUSH` instructions introduces inefficiencies to EVM implementations without any practical use-case (the value of a `PUSH` instruction at the code end cannot be observed by EVM). This EIP requires all immediate bytes to be explicitly present in the code.
### Rejection of deprecated instructions
The deprecated instructions `CALLCODE` (0xf2) and `SELFDESTRUCT` (0xff) are removed from the `valid_opcodes` list to prevent their use in the future.
## Backwards Compatibility
This change poses no risk to backwards compatibility, as it is introduced at the same time EIP-3540 is. The validation does not cover legacy bytecode (code which is not EOF formatted).
## Test Cases
### Contract creation
Each case should be tested for creation transaction, `CREATE` and `CREATE2`.
- Invalid initcode
- Valid initcode returning invalid code
- Valid initcode returning valid code
### Valid codes
- EOF code containing `INVALID`
- EOF code with data section containing bytes that are undefined instructions
- Legacy code containing undefined instruction
- Legacy code ending with incomplete PUSH instruction
### Invalid codes
- EOF code containing undefined instruction
- EOF code ending with incomplete `PUSH` instruction
- This can include `PUSH` instruction unreachable by execution, e.g. after `STOP`
## Reference Implementation
```python
# The ranges below are as specified in the Yellow Paper.
# Note: range(s, e) excludes e, hence the +1
valid_opcodes = [
*range(0x00, 0x0b + 1),
*range(0x10, 0x1d + 1),
0x20,
*range(0x30, 0x3f + 1),
*range(0x40, 0x48 + 1),
*range(0x50, 0x5b + 1),
*range(0x60, 0x6f + 1),
*range(0x70, 0x7f + 1),
*range(0x80, 0x8f + 1),
*range(0x90, 0x9f + 1),
*range(0xa0, 0xa4 + 1),
# Note: 0xfe is considered assigned.
0xf0, 0xf1, 0xf3, 0xf4, 0xf5, 0xfa, 0xfd, 0xfe
]
immediate_sizes = 256 * [0]
immediate_sizes[0x60:0x7f + 1] = range(1, 32 + 1) # PUSH1..PUSH32
# Raises ValidationException on invalid code
def validate_instructions(code: bytes):
# Note that EOF1 already asserts this with the code section requirements
assert len(code) > 0
pos = 0
while pos < len(code):
# Ensure the opcode is valid
opcode = code[pos]
if opcode not in valid_opcodes:
raise ValidationException("undefined opcode")
# Skip immediate data
pos += 1 + immediate_sizes[opcode]
# Ensure last instruction's immediate doesn't go over code end
if pos != len(code):
raise ValidationException("truncated immediate")
```
## Security Considerations
See [Security Considerations of EIP-3540](./eip-3540.md#security-considerations).
## Copyright
Copyright and related rights waived via [CC0](../LICENSE.md).