---
eip: 3668
title: "CCIP Read: Secure offchain data retrieval"
description: CCIP Read provides a mechanism to allow a contract to fetch external data.
author: Nick Johnson (@arachnid)
discussions-to: https://ethereum-magicians.org/t/durin-secure-offchain-data-retrieval/6728
status: Final
type: Standards Track
category: ERC
created: 2020-07-19
---
## Abstract
Contracts wishing to support lookup of data from external sources may, instead of returning the data directly, revert using `OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData)`. Clients supporting this specification then make an RPC call to a URL from `urls`, supplying `callData`, and getting back an opaque byte string `response`. Finally, clients call the function specified by `callbackFunction` on the contract, providing `response` and `extraData`. The contract can then decode and verify the returned data using an implementation-specific method.
This mechanism allows for offchain lookups of data in a way that is transparent to clients, and allows contract authors to implement whatever validation is necessary; in many cases this can be provided without any additional trust assumptions over and above those required if data is stored onchain.
## Motivation
Minimising storage and transaction costs on Ethereum has driven contract authors to adopt a variety of techniques for moving data offchain, including hashing, recursive hashing (eg Merkle Trees/Tries) and L2 solutions. While each solution has unique constraints and parameters, they all share in common the fact that enough information is stored onchain to validate the externally stored data when required.
Thus far, applications have tended to devise bespoke solutions rather than trying to define a universal standard. This is practical - although inefficient - when a single offchain data storage solution suffices, but rapidly becomes impractical in a system where multiple end-users may wish to make use of different data storage and availability solutions based on what suits their needs.
By defining a common specification allowing smart contracts to fetch data from offchain, we facilitate writing clients that are entirely agnostic to the storage solution being used, which enables new applications that can operate without knowing about the underlying storage details of the contracts they interact with.
Examples of this include:
- Interacting with 'airdrop' contracts that store a list of recipients offchain in a Merkle trie.
- Viewing token information for tokens stored on an L2 solution as if they were native L1 tokens.
- Allowing delegation of data such as ENS domains to various L2 solutions, without requiring clients to support each solution individually.
- Allowing contracts to proactively request external data to complete a call, without requiring the caller to be aware of the details of that data.
## Specification
### Overview
Answering a query via CCIP read takes place in three steps:
1. Querying the contract.
2. Querying the gateway using the URL provided in (1).
3. Querying or sending a transaction to the contract using the data from (1) and (2).
In step 1, a standard blockchain call operation is made to the contract. The contract reverts with an error indicating that the data needed to complete the call can be found offchain, and provides the URL of a service that can supply the answer, along with additional contextual information required for the call in step (3).
In step 2, the client calls the gateway service with the `callData` from the revert message in step (1). The gateway responds with an answer `response`, whose content is opaque to the client.
In step 3, the client calls the original contract, supplying the `response` from step (2) and the `extraData` returned by the contract in step (1). The contract decodes the provided data and uses it to validate the response and act on it - by returning information to the client or by making changes in a transaction. The contract could also revert with a new error to initiate another lookup, in which case the protocol starts again at step 1.
```
┌──────┐                                          ┌────────┐ ┌─────────────┐
│Client│                                          │Contract│ │Gateway @ url│
└──┬───┘                                          └───┬────┘ └──────┬──────┘
   │                                                  │             │
   │ somefunc(...)                                    │             │
   ├─────────────────────────────────────────────────►│             │
   │                                                  │             │
   │ revert OffchainLookup(sender, urls, callData,    │             │
   │                     callbackFunction, extraData) │             │
   │◄─────────────────────────────────────────────────┤             │
   │                                                  │             │
   │ HTTP request (sender, callData)                  │             │
   ├──────────────────────────────────────────────────┼────────────►│
   │                                                  │             │
   │ Response (result)                                │             │
   │◄─────────────────────────────────────────────────┼─────────────┤
   │                                                  │             │
   │ callbackFunction(result, extraData)              │             │
   ├─────────────────────────────────────────────────►│             │
   │                                                  │             │
   │ answer                                           │             │
   │◄─────────────────────────────────────────────────┤             │
   │                                                  │             │
```
### Contract interface
A CCIP read enabled contract MUST revert with the following error whenever a function that requires offchain data is called:
```solidity
error OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData);
```
`sender` is the address of the contract that raised the error, and is used to determine if the error was thrown by the contract the client called, or 'bubbled up' from a nested call.
`urls` specifies a list of URL templates to services (known as gateways) that implement the CCIP read protocol and can formulate an answer to the query. `urls` can be the empty list `[]`, in which case the client MUST specify the URL template. The order in which URLs are tried is up to the client, but contracts SHOULD return them in order of priority, with the most important entry first.
Each URL may include two substitution parameters, `{sender}` and `{data}`. Before a call is made to the URL, `{sender}` is replaced with the lowercase 0x-prefixed hexadecimal formatted `sender` parameter, and `{data}` is replaced by the 0x-prefixed hexadecimal formatted `callData` parameter.
`callData` specifies the data to call the gateway with. This value is opaque to the client. Typically this will be ABI-encoded, but this is an implementation detail that contracts and gateways can standardise on as desired.
`callbackFunction` is the 4-byte function selector for a function on the original contract to which a callback should be sent.
`extraData` is additional data that is required by the callback, and MUST be retained by the client and provided unmodified to the callback function. This value is opaque to the client.
The contract MUST also implement a callback method for decoding and validating the data returned by the gateway. The name of this method is implementation-specific, but it MUST have the signature `(bytes response, bytes extraData)`, and MUST have the same return type as the function that reverted with `OffchainLookup`.
If the client successfully calls the gateway, the callback function specified in the `OffchainLookup` error will be invoked by the client, with `response` set to the value returned by the gateway, and `extraData` set to the value returned in the contract's `OffchainLookup` error. The contract MAY initiate another CCIP read lookup in this callback, though authors should bear in mind that the limits on number of recursive invocations will vary from client to client.
In a call context (as opposed to a transaction), the return data from this call will be returned to the user as if it was returned by the function that was originally invoked.
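As a rough illustration of how a client might build the callback call, the sketch below hand-encodes the arguments `(bytes response, bytes extraData)` using the standard Solidity ABI layout for dynamic types. The helper names `word`, `encodeBytes` and `encodeCallback` are hypothetical and not part of this specification; a real client would normally rely on an ABI library. Inputs are assumed to be well-formed 0x-prefixed hex strings of even length.

```javascript
// Left-pad a non-negative integer to a 32-byte (64 hex digit) word.
function word(n) {
  return n.toString(16).padStart(64, "0");
}

// Encode a dynamic `bytes` value: a length word, then the data
// right-padded with zeros to a multiple of 32 bytes.
function encodeBytes(hex) {
  const body = hex.slice(2).toLowerCase();
  const byteLength = body.length / 2;
  const paddedHexLength = Math.ceil(byteLength / 32) * 64;
  return word(byteLength) + body.padEnd(paddedHexLength, "0");
}

// Build call data for `callbackFunction(bytes response, bytes extraData)`.
function encodeCallback(callbackFunction, response, extraData) {
  const encodedResponse = encodeBytes(response);
  const encodedExtraData = encodeBytes(extraData);
  // The head holds two offsets, in bytes, from the start of the arguments.
  const responseOffset = 64; // skips the two 32-byte head words
  const extraDataOffset = responseOffset + encodedResponse.length / 2;
  return (
    callbackFunction.toLowerCase() +
    word(responseOffset) +
    word(extraDataOffset) +
    encodedResponse +
    encodedExtraData
  );
}
```

For instance, `encodeCallback("0x12345678", "0xdeadbeef", "0x")` produces the selector followed by the offsets `0x40` and `0x80`, then the two length-prefixed byte strings. Note that `extraData` is passed through byte for byte, as the specification requires.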
#### Example
Suppose a contract has the following method:
```solidity
function balanceOf(address addr) public view returns(uint balance);
```
Data for these queries is stored offchain in some kind of hashed data structure, the details of which are not important for this example. The contract author wants the gateway to fetch the proof information for this query and call the following function with it:
```solidity
function balanceOfWithProof(bytes calldata response, bytes calldata extraData) public view returns(uint balance);
```
One example of a valid implementation of `balanceOf` would thus be:
```solidity
function balanceOf(address addr) public view returns(uint balance) {
    revert OffchainLookup(
        address(this),
        [url],
        abi.encodeWithSelector(Gateway.getSignedBalance.selector, addr),
        ContractName.balanceOfWithProof.selector,
        abi.encode(addr)
    );
}
```
Note that in this example the contract is returning `addr` in both `callData` and `extraData`, because it is required both by the gateway (in order to look up the data) and the callback function (in order to verify it). The contract cannot simply pass it to the gateway and rely on it being returned in the response, as this would give the gateway an opportunity to respond with an answer to a different query than the one that was initially issued.
#### Recursive calls in CCIP-aware contracts
When a CCIP-aware contract wishes to make a call to another contract, and the possibility exists that the callee may implement CCIP read, the calling contract MUST catch all `OffchainLookup` errors thrown by the callee, and revert with a different error if the `sender` field of the error does not match the callee address.
The contract MAY choose to replace all `OffchainLookup` errors with a different error. Doing so avoids the complexity of implementing support for nested CCIP read calls, but renders them impossible.
Where the possibility exists that a callee implements CCIP read, a CCIP-aware contract MUST NOT allow the default solidity behaviour of bubbling up reverts from nested calls. This is to prevent the following situation:
1. Contract A calls non-CCIP-aware contract B.
2. Contract B calls back to A.
3. In the nested call, A reverts with `OffchainLookup`.
4. Contract B does not understand CCIP read and propagates the `OffchainLookup` to its caller.
5. Contract A also propagates the `OffchainLookup` to its caller.
The result of this sequence of operations would be an `OffchainLookup` that looks valid to the client, as the `sender` field matches the address of the contract that was called, but does not execute correctly, as it only completes a nested invocation.
#### Example
The code below demonstrates one way that a contract may support nested CCIP read invocations. For simplicity this is shown using Solidity's try/catch syntax, although as of this writing it does not yet support catching custom errors.
```solidity
contract NestedLookup {
    error InvalidOperation();
    error OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData);
    // `target` is the contract being called; its declaration is omitted for brevity.
    function a(bytes calldata data) external view returns(bytes memory) {
        try target.b(data) returns (bytes memory ret) {
            return ret;
        } catch OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData) {
            // Reject lookups that bubbled up from deeper in the call chain.
            if(sender != address(target)) {
                revert InvalidOperation();
            }
            // Re-raise the lookup as our own, wrapping the callee's callback details in extraData.
            revert OffchainLookup(
                address(this),
                urls,
                callData,
                NestedLookup.aCallback.selector,
                abi.encode(address(target), callbackFunction, extraData)
            );
        }
    }
    function aCallback(bytes calldata response, bytes calldata extraData) external view returns(bytes memory) {
        (address inner, bytes4 innerCallbackFunction, bytes memory innerExtraData) = abi.decode(extraData, (address, bytes4, bytes));
        // Low-level calls return (success, returndata); staticcall is used because this function is `view`.
        (bool success, bytes memory ret) = inner.staticcall(abi.encodeWithSelector(innerCallbackFunction, response, innerExtraData));
        if(!success) {
            revert InvalidOperation();
        }
        return abi.decode(ret, (bytes));
    }
}
```
### Gateway Interface
The URLs returned by a contract may be of any scheme, but this specification only defines how clients should handle HTTPS URLs.
Given a URL template returned in an `OffchainLookup`, the URL to query is composed by replacing `{sender}` with the lowercase 0x-prefixed hexadecimal formatted `sender` parameter, and replacing `{data}` with the 0x-prefixed hexadecimal formatted `callData` parameter.
For example, if a contract returns the following data in an `OffchainLookup`:
```
urls = ["https://example.com/gateway/{sender}/{data}.json"]
sender = "0xaabbccddeeaabbccddeeaabbccddeeaabbccddee"
callData = "0x00112233"
```
The request URL to query is `https://example.com/gateway/0xaabbccddeeaabbccddeeaabbccddeeaabbccddee/0x00112233.json`.
If the URL template contains the `{data}` substitution parameter, the client MUST send a GET request after replacing the substitution parameters as described above.
If the URL template does not contain the `{data}` substitution parameter, the client MUST send a POST request after replacing the substitution parameters as described above. The POST request MUST be sent with a Content-Type of `application/json`, and a payload matching the following schema:
```
{
    "type": "object",
    "properties": {
        "data": {
            "type": "string",
            "description": "0x-prefixed hex string containing the `callData` from the contract"
        },
        "sender": {
            "type": "string",
            "description": "0x-prefixed hex string containing the `sender` parameter from the contract"
        }
    }
}
```
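The GET/POST selection rule above can be sketched as a small pure function. This is only an illustration under stated assumptions: `buildGatewayRequest` and the shape of the object it returns are hypothetical, chosen so the result could be handed to an HTTP client of the implementer's choice.

```javascript
// Build the HTTP request for one URL template from an OffchainLookup error.
// Returns a plain description of the request; nothing is sent here.
function buildGatewayRequest(template, sender, callData) {
  // `sender` is lowercased as required; `callData` is passed through as-is.
  const args = { data: callData, sender: sender.toLowerCase() };
  // Substitute only the two placeholders defined by the specification.
  const url = template.replace(/\{(sender|data)\}/g, (match, name) => args[name]);
  if (template.includes("{data}")) {
    // The call data travels in the URL, so a GET request is used.
    return { url, method: "GET" };
  }
  // No {data} placeholder: send sender and data as a JSON POST payload.
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(args),
  };
}
```

With the template `https://example.com/gateway/{sender}/{data}.json` this yields a GET request; with `https://example.com/gateway/{sender}.json` it yields a POST request whose body carries both `data` and `sender`.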
Compliant gateways MUST respond with a Content-Type of `application/json`, with the body adhering to the following JSON schema:
```
{
    "type": "object",
    "properties": {
        "data": {
            "type": "string",
            "description": "0x-prefixed hex string containing the result data."
        }
    }
}
```
Unsuccessful requests MUST return the appropriate HTTP status code - for example, 404 if the `sender` address is not supported by this gateway, 400 if the `callData` is in an invalid format, 500 if the server encountered an internal error, and so forth. If the Content-Type of a 4xx or 5xx response is `application/json`, it MUST adhere to the following JSON schema:
```
{
    "type": "object",
    "properties": {
        "message": {
            "type": "string",
            "description": "A human-readable error message."
        }
    }
}
```
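One way a client might interpret gateway responses under these rules is sketched below. The function name `interpretGatewayResponse` and its three plain arguments are assumptions made for illustration; the hex check on `data` is an extra sanity check, not something this specification mandates.

```javascript
// Interpret a gateway response.
// `status` is the HTTP status code, `contentType` the Content-Type header
// value (or null if absent), and `bodyText` the raw response body.
// Returns the hex `data` string on success and throws otherwise.
function interpretGatewayResponse(status, contentType, bodyText) {
  const isJson = (contentType || "").toLowerCase().startsWith("application/json");
  if (status >= 200 && status <= 299) {
    if (!isJson) {
      throw new Error("Gateway response is not application/json");
    }
    const { data } = JSON.parse(bodyText);
    if (typeof data !== "string" || !/^0x([0-9a-fA-F]{2})*$/.test(data)) {
      throw new Error("Gateway response has no valid hex `data` field");
    }
    return data;
  }
  // Error bodies are parsed only when they declare themselves as JSON.
  let message = `Gateway returned HTTP ${status}`;
  if (isJson) {
    const parsed = JSON.parse(bodyText);
    if (parsed && typeof parsed.message === "string") {
      message = parsed.message;
    }
  }
  const error = new Error(message);
  error.status = status; // lets callers tell 4xx from 5xx
  throw error;
}
```

The `status` property on the thrown error is a design choice of this sketch: it lets the caller stop on a 4xx response but move on to the next URL on a 5xx response.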
#### Examples
***GET request***
```
# Contract returned a URL template `https://example.com/gateway/{sender}/{data}.json`
# Request
curl -D - https://example.com/gateway/0x226159d592e2b063810a10ebf6dcbada94ed68b8/0xd5fa2b00.json
# Successful result
HTTP/2 200
content-type: application/json; charset=UTF-8
...
{"data": "0xdeadbeefdecafbad"}
# Error result
HTTP/2 404
content-type: application/json; charset=UTF-8
...
{"message": "Gateway address not supported."}
```
***POST request***
```
# Contract returned a URL template `https://example.com/gateway/{sender}.json`
# Request
curl -D - -X POST -H "Content-Type: application/json" --data '{"data":"0xd5fa2b00","sender":"0x226159d592e2b063810a10ebf6dcbada94ed68b8"}' https://example.com/gateway/0x226159d592e2b063810a10ebf6dcbada94ed68b8.json
# Successful result
HTTP/2 200
content-type: application/json; charset=UTF-8
...
{"data": "0xdeadbeefdecafbad"}
# Error result
HTTP/2 404
content-type: application/json; charset=UTF-8
...
{"message": "Gateway address not supported."}
```
Clients MUST support both GET and POST requests. Gateways may implement either or both as needed.
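On the gateway side, the request handling described in this section might look roughly like the sketch below. It assumes `sender` and `data` have already been extracted from the URL path (GET) or the JSON body (POST). `handleGatewayRequest` and the `lookup` callback are hypothetical names: `lookup` stands in for the application-specific logic and is assumed to return 0x-prefixed hex result data, or `null` when the contract is not supported.

```javascript
// Produce the HTTP response for one gateway query.
function handleGatewayRequest(sender, data, lookup) {
  const json = (status, body) => ({
    status,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  // 400: malformed sender address or callData.
  if (typeof sender !== "string" || !/^0x[0-9a-f]{40}$/i.test(sender)) {
    return json(400, { message: "Invalid sender address." });
  }
  if (typeof data !== "string" || !/^0x([0-9a-f]{2})*$/i.test(data)) {
    return json(400, { message: "Invalid callData." });
  }
  try {
    const result = lookup(sender.toLowerCase(), data);
    // 404: this gateway does not serve the requesting contract.
    if (result === null) {
      return json(404, { message: "Gateway address not supported." });
    }
    return json(200, { data: result });
  } catch (err) {
    // 500: the application-specific lookup failed unexpectedly.
    return json(500, { message: "Internal gateway error." });
  }
}
```

Keeping the handler independent of any particular HTTP framework is a choice made for this sketch; the returned object maps directly onto a status code, headers, and body in whichever server is used.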
### Client Lookup Protocol
A client that supports CCIP read MUST make contract calls using the following process:
1. Set `data` to the call data to supply to the contract, and `to` to the address of the contract to call.
2. Call the contract at address `to` normally, supplying `data` as the input data. If the function returns a successful result, return it to the caller and stop.
3. If the function returns an error other than `OffchainLookup`, return it to the caller in the usual fashion and stop.
4. Otherwise, decode the `sender`, `urls`, `callData`, `callbackFunction` and `extraData` arguments from the `OffchainLookup` error.
5. If the `sender` field does not match the address of the contract that was called, return an error to the caller and stop.
6. Construct a request URL by replacing `{sender}` with the lowercase 0x-prefixed hexadecimal formatted `sender` parameter, and replacing `{data}` with the 0x-prefixed hexadecimal formatted `callData` parameter. The client may choose which URLs to try in which order, but SHOULD prioritise URLs earlier in the list over those later in the list.
7. Make an HTTP GET request to the request URL if the URL template contains `{data}`, or an HTTP POST request otherwise, as described in [Gateway Interface](#gateway-interface).
8. If the response code from step (7) is in the range 400-499, return an error to the caller and stop.
9. If the response code from step (7) is in the range 500-599, go back to step (6) and pick a different URL, or stop if there are no further URLs to try.
10. Otherwise, replace `data` with an ABI-encoded call to the contract function specified by the 4-byte selector `callbackFunction`, supplying the data returned from step (7) and `extraData` from step (4), and return to step (2).
Clients MUST handle HTTP status codes appropriately, employing best practices for error reporting and retries.
Clients MUST handle HTTP 4xx and 5xx error responses that have a content type other than application/json appropriately; they MUST NOT attempt to parse the response body as JSON.
This protocol can result in multiple lookups being requested by the same contract. Clients MUST implement a limit on the number of lookups they permit for a single contract call, and this limit SHOULD be at least 4.
The lookup protocol for a client is described with the following pseudocode:
```javascript
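async function httpcall(urls, to, callData) {
    const args = {sender: to.toLowerCase(), data: callData.toLowerCase()};
    for(const url of urls) {
        const queryUrl = url.replace(/\{([^}]*)\}/g, (match, p1) => args[p1]);
        // GET if the template contains {data}; otherwise POST the arguments as JSON.
        const response = await fetch(queryUrl, url.includes('{data}') ? undefined : {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify(args),
        });
        if(response.status >= 400 && response.status <= 499) {
            // Only parse the error body as JSON if the gateway declares it as such.
            const isJson = (response.headers.get('content-type') || '').startsWith('application/json');
            throw new Error(isJson ? (await response.json()).message : response.statusText);
        }
        if(response.status >= 200 && response.status <= 299) {
            return (await response.json()).data;
        }
        // Any other status (eg, 5xx): fall through and try the next URL.
    }
    throw new Error("No gateway returned a successful response");
}
async function durin_call(provider, to, data) {
    for(let i = 0; i < 4; i++) {
        try {
            return await provider.call(to, data);
        } catch(error) {
            if(error.code !== "CALL_EXCEPTION") {
                throw(error);
            }
            // error.data is assumed to hold the decoded OffchainLookup arguments.
            const {sender, urls, callData, callbackFunction, extraData} = error.data;
            if(sender.toLowerCase() !== to.toLowerCase()) {
                throw new Error("Cannot handle OffchainLookup raised inside nested call");
            }
            const result = await httpcall(urls, to, callData);
            // abi.encodeWithSelector stands in for ABI-encoding the callback call.
            data = abi.encodeWithSelector(callbackFunction, result, extraData);
        }
    }
    throw new Error("Too many CCIP read redirects");
}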
```
Where:
- `provider` is a provider object that facilitates Ethereum blockchain function calls.
- `to` is the address of the contract to call.
- `data` is the call data for the contract.
If the function being called is a standard contract function, the process terminates after the original call, returning the same result as for a regular call. Otherwise, a gateway from `urls` is called with the `callData` returned by the `OffchainLookup` error, and is expected to return a valid response. The response and the `extraData` are then passed to the specified callback function. This process can be repeated if the callback function returns another `OffchainLookup` error.
### Use of CCIP read for transactions
While the specification above is for read-only contract calls (eg, `eth_call`), it is simple to use this method for sending transactions (eg, `eth_sendTransaction` or `eth_sendRawTransaction`) that require offchain data. While 'preflighting' a transaction using `eth_estimateGas` or `eth_call`, a client that receives an `OffchainLookup` revert can follow the procedure described above in [Client lookup protocol](#client-lookup-protocol), substituting a transaction for the call in the last step. This functionality is ideal for applications such as making onchain claims supported by offchain proof data.
### Glossary
- Client: A process, such as JavaScript executing in a web browser, or a backend service, that wishes to query a blockchain for data. The client understands how to fetch data using CCIP read.
- Contract: A smart contract existing on Ethereum or another blockchain.
- Gateway: A service that answers application-specific CCIP read queries, usually over HTTPS.
## Rationale
### Use of `revert` to convey call information
For offchain data lookup to function as desired, clients must either have some way to know that a function depends on this specification for functionality - such as a specifier in the ABI for the function - or else there must be a way for the contract to signal to the client that data needs to be fetched from elsewhere.
While specifying the call type in the ABI is a possible solution, this makes retrofitting existing interfaces to support offchain data awkward, and either results in contracts with the same name and arguments as the original specification, but with different return data - which will cause decoding errors for clients that do not expect this - or duplicating every function that needs support for offchain data with a different name (eg, `balanceOf -> offchainBalanceOf`). Neither solution is particularly satisfactory.
Using a revert, and conveying the required information in the revert data, allows any function to be retrofitted to support lookups via CCIP read so long as the client understands the specification, and so facilitates translation of existing specifications to use offchain data.
### Passing contract address to the gateway service
The contract address (`sender`) is passed to the gateway in order to facilitate the writing of generic gateways, thus reducing the burden on contract authors to provide their own gateway implementations. Supplying `sender` allows the gateway to perform lookups to the original contract for information needed to assist with resolution, making it possible to operate one gateway for any number of contracts implementing the same interface.
### Existence of `extraData` argument
`extraData` allows the original contract function to pass information to a subsequent invocation. Since contracts are not persistent, without this data a contract has no state from the previous invocation. Aside from allowing arbitrary contextual information to be propagated between the two calls, this also allows the contract to verify that the query the gateway answered is in fact the one the contract originally requested.
### Use of GET and POST requests for the gateway interface
Using a GET request, with query data encoded in the URL, minimises complexity and enables entirely static implementations of gateways - in some applications a gateway can simply be an HTTP server or IPFS instance with a static set of responses in text files.
However, URLs are commonly limited in practice to around 2 kilobytes in size, which poses problems for more complex uses of CCIP read. Thus, we provide an option to use POST data. This choice is made at the contract's discretion (via the choice of URL template) in order to preserve the ability to have a static gateway operating exclusively using GET when desired.
## Backwards Compatibility
Existing contracts that do not wish to use this specification are unaffected. Clients can add support for CCIP read to all contract calls without introducing any new overhead or incompatibilities.
Contracts that require CCIP read will not function in conjunction with clients that do not implement this specification. Attempts to call these contracts from non-compliant clients will result in the contract throwing an exception that is propagated to the user.
## Security Considerations
### Gateway Response Data Validation
In order to prevent a malicious gateway from causing unintended side-effects or faulty results, contracts MUST include sufficient information in the `extraData` argument to allow them to verify the relevance and validity of the gateway's response. For example, if the contract is requesting information based on an `address` supplied to the original call, it MUST include that address in the `extraData` so that the callback can verify the gateway is not providing the answer to a different query.
Contracts must also implement sufficient validation of the data returned by the gateway to ensure it is valid. The validation required is application-specific and cannot be specified on a global basis. Examples would include verifying a Merkle proof of inclusion for an L2 or other Merkleized state, or verifying a signature by a trusted signer over the response data.
### Client Extra Data Validation
In order to prevent a malicious client from causing unintended effects when making transactions using CCIP read, contracts MUST implement appropriate checks on the `extraData` returned to them in the callback. Any sanity/permission checks performed on input data for the initial call MUST be repeated on the data passed through the `extraData` field in the callback. For example, if a transaction should only be executable by an authorised account, that authorisation check MUST be done in the callback; it is not sufficient to perform it with the initial call and embed the authorised address in the `extraData`.
### HTTP requests and fingerprinting attacks
Because CCIP read can cause a user's browser to make HTTP requests to an address controlled by the contract, there is the potential for this to be used to identify users - for example, to associate their wallet address with their IP address.
The impact of this is application-specific; fingerprinting a user when they resolve an ENS domain may have little privacy impact, as the attacker will not learn the user's wallet address, only the fact that the user is resolving a given ENS name from a given IP address - information they can also learn from running a DNS server. On the other hand, fingerprinting a user when they attempt a transaction to transfer an NFT may give an attacker everything they need to identify the IP address of a user's wallet.
To minimise the security impact of this, we make the following recommendations:
1. Client libraries should provide clients with a hook to override CCIP read calls - either by rewriting them to use a proxy service, or by denying them entirely. This mechanism or another should be written so as to easily facilitate adding domains to allowlists or blocklists.
2. Client libraries should disable CCIP read for transactions (but not for calls) by default, and require the caller to explicitly enable this functionality. Enablement should be possible on a per-contract, per-domain, or global basis.
3. App authors should not supply a 'from' address for contract calls ('view' operations) where the call could execute untrusted code (that is, code not authored or trusted by the application author). As a precautionary principle it is safest to not supply this parameter at all unless the author is certain that no attacker-determined smart contract code will be executed.
4. Wallet authors that are responsible for fetching user information - for example, by querying token contracts - should either ensure CCIP read is disabled for transactions, and that no contract calls are made with a 'from' address supplied, or operate a proxy on their users' behalf, rewriting all CCIP read calls to take place via the proxy, or both.
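Recommendation 1 could be realised in many ways; the sketch below shows one possible shape for such a hook. The name `makeGatewayFilter` and the options `allow`, `block` and `proxy` are all hypothetical, and the proxy URL format (a prefix to which the encoded gateway URL is appended) is an assumption made purely for illustration.

```javascript
// Create a hook that a client library could run on every gateway URL
// before making a request. The hook returns the URL to request, or
// null if the request should be denied.
function makeGatewayFilter({ allow = null, block = [], proxy = null } = {}) {
  return function filter(url) {
    const host = new URL(url).hostname;
    // Blocklisted domains are always denied.
    if (block.includes(host)) {
      return null;
    }
    // If an allowlist is configured, deny everything not on it.
    if (allow !== null && !allow.includes(host)) {
      return null;
    }
    // Optionally route the request via a proxy so the gateway does not
    // see the user's IP address.
    return proxy === null ? url : proxy + encodeURIComponent(url);
  };
}
```

A wallet could, for example, configure `makeGatewayFilter({ proxy: "https://proxy.example/?url=" })` to send all lookups through its own proxy, while an application with a known set of gateways could pass an `allow` list instead.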
We encourage client library authors and wallet authors not to disable CCIP read by default, as many applications can be transparently enhanced with this functionality, which is quite safe if the above precautions are observed.
## Copyright
Copyright and related rights waived via [CC0](../LICENSE.md).