---
eip: 3508
title: Transaction Data Opcodes
author: Alex Papageorgiou (@alex-ppg)
discussions-to: https://ethereum-magicians.org/t/eip-draft-transaction-data-opcodes/6017
status: Stagnant
type: Standards Track
category: Core
created: 2021-04-16
---
## Simple Summary
Provide access to original transaction data.
## Abstract
This EIP introduces the following three EVM instructions: `ORIGINDATALOAD`, `ORIGINDATASIZE`, and `ORIGINDATACOPY`.
These three instructions are meant to provide access to the original transaction's `data` payload, enabling a gas-efficient way of accessing large data payloads in cross-contract calls.
## Motivation
As Ethereum development matures, smart contracts take on more ambitious and complex features, which often require complex and sometimes large data structures. Given the inherent limits of the EVM, however, passing large data structures between contracts is costly. In some cases it is outright infeasible: the operation either cannot be executed within the gas limit or requires a large amount of ETH to cover its gas cost.
The purpose of this EIP is to render these features viable by introducing a way for multi-contract systems to access the same in-memory data source without transmitting the full payload between them.
This EIP enables elaborate smart contract features to become part of a larger call-chain by efficiently reading data from the original transaction payload rather than requiring the data to be passed in as call-level data. It mainly benefits advanced trustless schemes, such as efficient verification of Merkle Patricia trees validating the storage value of a particular Ethereum block, or EVM-based layer 2 solutions.
A side-effect of this change is that smart contract systems relying entirely on origin data inherently guarantee that the data they receive has not been altered by an intermediate smart contract call.
## Specification
### ORIGINDATALOAD (`0x47`), ORIGINDATASIZE (`0x48`) and ORIGINDATACOPY (`0x49`)
These instructions are meant to operate similarly to their call-prefixed counterparts, except that they operate on the original `data` of a transaction instead of the current call's data. In detail:
- ORIGINDATALOAD (`0x47`) performs similarly to CALLDATALOAD (`0x35`)
- ORIGINDATASIZE (`0x48`) performs similarly to CALLDATASIZE (`0x36`)
- ORIGINDATACOPY (`0x49`) performs similarly to CALLDATACOPY (`0x37`)
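
The behaviour listed above can be sketched as a small Python model. This is only an illustration, not a client implementation: the function names are hypothetical, and it assumes that reads past the end of the origin data are zero-padded, as is the case for the `CALLDATA*` counterparts. Memory is modelled as a plain `bytearray` that grows byte by byte rather than in 32-byte words.

```python
WORD = 32  # EVM word size in bytes


def origin_data_load(origin_data: bytes, offset: int) -> int:
    """Model of ORIGINDATALOAD: read one 32-byte word starting at `offset`.

    Bytes past the end of the data read as zero (assumed to mirror CALLDATALOAD).
    """
    chunk = origin_data[offset:offset + WORD]
    return int.from_bytes(chunk.ljust(WORD, b"\x00"), "big")


def origin_data_size(origin_data: bytes) -> int:
    """Model of ORIGINDATASIZE: length of the origin data in bytes."""
    return len(origin_data)


def origin_data_copy(memory: bytearray, origin_data: bytes,
                     dest: int, offset: int, size: int) -> None:
    """Model of ORIGINDATACOPY: copy `size` bytes into `memory` at `dest`.

    Out-of-range source bytes are zero-filled (assumed to mirror CALLDATACOPY).
    """
    if size == 0:
        return  # a zero-length copy touches no memory
    chunk = origin_data[offset:offset + size].ljust(size, b"\x00")
    if len(memory) < dest + size:
        # Simplification: grow exactly as needed instead of in 32-byte words.
        memory.extend(b"\x00" * (dest + size - len(memory)))
    memory[dest:dest + size] = chunk
```

For example, with origin data made of a 4-byte selector followed by one 32-byte argument, `origin_data_load(data, 4)` returns that argument regardless of how deep in the call chain the read happens.
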
As the data is retrieved once again from the execution environment, the costs for the three instructions will be `G_verylow`, `G_base` and `G_base + G_verylow * (number of words copied, rounded up)` respectively.
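
The cost formulas in the preceding paragraph can be written out as a short Python sketch. The constants `G_BASE = 2` and `G_VERYLOW = 3` are the Yellow Paper fee schedule values; the function names are hypothetical, and memory expansion costs are deliberately left out.

```python
G_BASE = 2      # Yellow Paper G_base
G_VERYLOW = 3   # Yellow Paper G_verylow


def origindataload_gas() -> int:
    """Static cost of ORIGINDATALOAD as stated in this section."""
    return G_VERYLOW


def origindatasize_gas() -> int:
    """Static cost of ORIGINDATASIZE as stated in this section."""
    return G_BASE


def origindatacopy_gas(size_bytes: int) -> int:
    """Cost of ORIGINDATACOPY as stated in this section.

    The number of words copied is rounded up; memory expansion is not included.
    """
    words = (size_bytes + 31) // 32
    return G_BASE + G_VERYLOW * words
```

Under these assumptions, copying 33 bytes touches two words, so `origindatacopy_gas(33)` evaluates to `2 + 3 * 2`.
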
The transaction data the `ORIGINDATA*` opcodes operate on will be equivalent to the `calldata` specified in the `args*` parameter to the nearest `AUTHCALL` (`0xf7`) up the stack. If there is no `AUTHCALL` in the stack then `ORIGINDATA*` will operate on the transaction's original `data` field.
This interaction ensures full compatibility with [EIP-3074](./eip-3074.md) and ensures that no form of discrimination is introduced back into the system by this EIP e.g. by contracts entirely relying on `ORIGINDATA*` and thus allowing only EOAs to supply data to them.
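
One way to picture the `AUTHCALL` rule is as a walk up the call stack. The Python sketch below is a minimal model under stated assumptions: `Frame` and `resolve_origin_data` are hypothetical names, and each frame records only the call instruction that created it and the calldata it received. The top-level transaction frame is not part of the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    opcode: str      # instruction that created this frame, e.g. "CALL" or "AUTHCALL"
    calldata: bytes  # calldata passed to this frame


def resolve_origin_data(tx_data: bytes, call_stack: List[Frame]) -> bytes:
    """Return the data that ORIGINDATA* would operate on in the current frame.

    `call_stack[0]` is the outermost nested call and `call_stack[-1]` is the
    current frame. The nearest AUTHCALL up the stack wins; if there is none,
    the transaction's original `data` field is used.
    """
    for frame in reversed(call_stack):
        if frame.opcode == "AUTHCALL":
            return frame.calldata
    return tx_data
```

With a stack of `[Frame("AUTHCALL", b"auth"), Frame("CALL", b"inner")]`, the inner contract sees `b"auth"` through `ORIGINDATA*` in this model, not the transaction's own `data`.
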
## Rationale
### AUTHCALL (`0xf7`) Interaction
[EIP-3074](./eip-3074.md), which was expected to be part of the London fork when this EIP was written, introduces a new call instruction, `AUTHCALL` (`0xf7`), that replaces a transaction's `ORIGIN` (`0x32`) with the context variable `authorized`. The intention of `AUTHCALL` is to prevent the discrimination between smart contracts and EOAs that `ORIGIN` initially facilitated. As a result, it is sensible to also replace the values retrieved by the `ORIGINDATA*` opcodes with the ones used in the `AUTHCALL`.
### Naming Conventions
The `ORIGIN`-prefixed instructions follow the existing naming convention of the `CALL`-prefixed instructions. They build on the existing `ORIGIN` (`0x32`) instruction, which is equivalent to the `CALLER` (`0x33`) instruction but operates on the original transaction's context.
### Instruction Address Space
The `0x30-0x3f` instruction address space has been exhausted by instructions that already provide information about the execution context of a call, so a new range suitable for the purposes of this EIP had to be identified.
The [EIP-1344](./eip-1344.md) `CHAINID` opcode was included at `0x46`, and the Chain ID is part of transaction payloads as well as of blocks. It therefore made sense to place additional transaction-related data after it, effectively reserving the `0x46-0x4f` address space for transaction-related data that may be necessary in the future, such as the EOA's nonce.
### Gas Costs
The opcodes ORIGINDATALOAD (`0x47`), ORIGINDATASIZE (`0x48`), and ORIGINDATACOPY (`0x49`) essentially perform the same thing as opcodes CALLDATALOAD (`0x35`), CALLDATASIZE (`0x36`), and CALLDATACOPY (`0x37`) respectively and thus share the exact same gas costs.
### Instruction Space Pollution
One can argue that multiple new EVM instructions pollute the EVM instruction address space and could make it harder to assign sensible instruction codes to future instructions. This issue was assessed, and an alternative was considered in which the raw RLP-encoded transaction would be made accessible to the EVM. This would _future-proof_ the new instruction set, as it would also cover other members of the transaction that may need to be accessible on-chain in the future. However, it would also render the `ORIGIN` opcode redundant.
## Backwards Compatibility
The EIP does not alter or adjust existing functionality provided by the EVM and as such, no known issues exist.
## Test Cases
TODO.
## Security Considerations
### Introspective Contracts
Taken on their own, the values returned by `ORIGINDATALOAD` and `ORIGINDATACOPY` should be considered insecure, as they can easily be spoofed by creating an entry smart contract with the appropriate function signature and arguments that then invokes other contracts within the call chain. In brief, one should always assume that `tx.data != calldata`, and these instructions should not be used as an introspection tool alone.
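
The spoofing scenario can be illustrated with a toy Python model. Everything here is hypothetical: the two selectors are made-up values, and `entry_contract` stands in for an attacker-deployed contract that exposes the expected function signature but forwards unrelated calldata to the target.

```python
from typing import Dict

# Hypothetical 4-byte selectors, chosen only for illustration.
DEPOSIT_SELECTOR = bytes.fromhex("aabbccdd")
WITHDRAW_SELECTOR = bytes.fromhex("11223344")


def encode(selector: bytes, amount: int) -> bytes:
    """Build a payload of a selector followed by one 32-byte argument."""
    return selector + amount.to_bytes(32, "big")


def entry_contract(tx_data: bytes) -> bytes:
    """Attacker-controlled entry contract.

    It accepts a harmless-looking transaction payload, ignores it, and
    forwards different calldata to the target contract.
    """
    return encode(WITHDRAW_SELECTOR, 1_000)


def views_in_target(tx_data: bytes) -> Dict[str, bytes]:
    """What the target frame observes: ORIGINDATA* versus CALLDATA*."""
    forwarded = entry_contract(tx_data)
    return {"origin_data": tx_data, "calldata": forwarded}
```

In this model, a transaction carrying `encode(DEPOSIT_SELECTOR, 1)` leaves the target with origin data that looks like a deposit while its actual calldata requests a withdrawal, which is why the two views must never be assumed equal.
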
### Denial-of-Service Attack
An initial concern that may arise from this EIP is the additional contextual data that must be provided at the software level of nodes to the EVM in order for it to be able to access the necessary data via the `ORIGINDATALOAD` and `ORIGINDATACOPY` instructions.
This would increase memory consumption. However, the increase should be negligible, if it exists at all, given that a transaction's data should already be in memory as part of its execution, which is a step in the overall inclusion of a transaction within a block.
### Multi-Contract System Gas Reduction
Most complex smart contract systems deployed on Ethereum today rely on cross-contract interactions in which values are passed from one contract to another via function calls. The `ORIGIN`-prefixed instruction set would let such systems access the original transaction data at any step in the call chain. As a side-effect, cross-contract calls could consume less gas if the data passed between them is reduced.
The gas reduction, however, would be an implementation-based optimization, and it would apply only to rudimentary memory arguments rather than to storage-based data, the latter of which is most commonly used in these types of calls. As a result, the overall gas reduction observed from this change will be negligible for most implementations.
## Copyright
Copyright and related rights waived via [CC0](../LICENSE.md).