DCIPs/EIPS/eip-3668.md

29 KiB

eip title description author discussions-to status type category created
3668 CCIP Read: Secure offchain data retrieval CCIP Read provides a mechanism to allow a contract to fetch external data. Nick Johnson (@arachnid) https://ethereum-magicians.org/t/durin-secure-offchain-data-retrieval/6728 Final Standards Track ERC 2020-07-19

Abstract

Contracts wishing to support lookup of data from external sources may, instead of returning the data directly, revert using OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData). Clients supporting this specification then make an RPC call to a URL from urls, supplying callData, and getting back an opaque byte string response. Finally, clients call the function specified by callbackFunction on the contract, providing response and extraData. The contract can then decode and verify the returned data using an implementation-specific method.

This mechanism allows for offchain lookups of data in a way that is transparent to clients, and allows contract authors to implement whatever validation is necessary; in many cases this can be provided without any additional trust assumptions over and above those required if data is stored onchain.

Motivation

Minimising storage and transaction costs on Ethereum has driven contract authors to adopt a variety of techniques for moving data offchain, including hashing, recursive hashing (eg Merkle Trees/Tries) and L2 solutions. While each solution has unique constraints and parameters, they all share in common the fact that enough information is stored onchain to validate the externally stored data when required.

Thus far, applications have tended to devise bespoke solutions rather than trying to define a universal standard. This is practical - although inefficient - when a single offchain data storage solution suffices, but rapidly becomes impractical in a system where multiple end-users may wish to make use of different data storage and availability solutions based on what suits their needs.

By defining a common specification allowing smart contract to fetch data from offchain, we facilitate writing clients that are entirely agnostic to the storage solution being used, which enables new applications that can operate without knowing about the underlying storage details of the contracts they interact with.

Examples of this include:

  • Interacting with 'airdrop' contracts that store a list of recipients offchain in a merkle trie.
  • Viewing token information for tokens stored on an L2 solution as if they were native L1 tokens.
  • Allowing delegation of data such as ENS domains to various L2 solutions, without requiring clients to support each solution individually.
  • Allowing contracts to proactively request external data to complete a call, without requiring the caller to be aware of the details of that data.

Specification

Overview

Answering a query via CCIP read takes place in three steps:

  1. Querying the contract.
  2. Querying the gateway using the URL provided in (1).
  3. Querying or sending a transaction to the contract using the data from (1) and (2).

In step 1, a standard blockchain call operation is made to the contract. The contract reverts with an error that specifies the data to complete the call can be found offchain, and provides the url to a service that can provide the answer, along with additional contextual information required for the call in step (3).

In step 2, the client calls the gateway service with the callData from the revert message in step (1). The gateway responds with an answer response, whose content is opaque to the client.

In step 3, the client calls the original contract, supplying the response from step (2) and the extraData returned by the contract in step (1). The contract decodes the provided data and uses it to validate the response and act on it - by returning information to the client or by making changes in a transaction. The contract could also revert with a new error to initiate another lookup, in which case the protocol starts again at step 1.

┌──────┐                                          ┌────────┐ ┌─────────────┐
│Client│                                          │Contract│ │Gateway @ url│
└──┬───┘                                          └───┬────┘ └──────┬──────┘
   │                                                  │             │
   │ somefunc(...)                                    │             │
   ├─────────────────────────────────────────────────►│             │
   │                                                  │             │
   │ revert OffchainData(sender, urls, callData,      │             │
   │                     callbackFunction, extraData) │             │
   │◄─────────────────────────────────────────────────┤             │
   │                                                  │             │
   │ HTTP request (sender, callData)                  │             │
   ├──────────────────────────────────────────────────┼────────────►│
   │                                                  │             │
   │ Response (result)                                │             │
   │◄─────────────────────────────────────────────────┼─────────────┤
   │                                                  │             │
   │ callbackFunction(result, extraData)              │             │
   ├─────────────────────────────────────────────────►│             │
   │                                                  │             │
   │ answer                                           │             │
   │◄─────────────────────────────────────────────────┤             │
   │                                                  │             │

Contract interface

A CCIP read enabled contract MUST revert with the following error whenever a function that requires offchain data is called:

error OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData)

sender is the address of the contract that raised the error, and is used to determine if the error was thrown by the contract the client called, or 'bubbled up' from a nested call.

urls specifies a list of URL templates to services (known as gateways) that implement the CCIP read protocol and can formulate an answer to the query. urls can be the empty list [], in which case the client MUST specify the URL template. The order in which URLs are tried is up to the client, but contracts SHOULD return them in order of priority, with the most important entry first.

Each URL may include two substitution parameters, {sender} and {data}. Before a call is made to the URL, sender is replaced with the lowercase 0x-prefixed hexadecimal formatted sender parameter, and data is replaced by the the 0x-prefixed hexadecimal formatted callData parameter.

callData specifies the data to call the gateway with. This value is opaque to the client. Typically this will be ABI-encoded, but this is an implementation detail that contracts and gateways can standardise on as desired.

callbackFunction is the 4-byte function selector for a function on the original contract to which a callback should be sent.

extraData is additional data that is required by the callback, and MUST be retained by the client and provided unmodified to the callback function. This value is opaque to the client.

The contract MUST also implement a callback method for decoding and validating the data returned by the gateway. The name of this method is implementation-specific, but it MUST have the signature (bytes response, bytes extraData), and MUST have the same return type as the function that reverted with OffchainLookup.

If the client successfully calls the gateway, the callback function specified in the OffchainLookup error will be invoked by the client, with response set to the value returned by the gateway, and extraData set to the value returned in the contract's OffchainLookup error. The contract MAY initiate another CCIP read lookup in this callback, though authors should bear in mind that the limits on number of recursive invocations will vary from client to client.

In a call context (as opposed to a transaction), the return data from this call will be returned to the user as if it was returned by the function that was originally invoked.

Example

Suppose a contract has the following method:

function balanceOf(address addr) public view returns(uint balance);

Data for these queries is stored offchain in some kind of hashed data structure, the details of which are not important for this example. The contract author wants the gateway to fetch the proof information for this query and call the following function with it:

function balanceOfWithProof(bytes calldata response, bytes calldata extraData) public view returns(uint balance);

One example of a valid implementation of balanceOf would thus be:

function balanceOf(address addr) public view returns(uint balance) {
    revert OffchainLookup(
        address(this),
        [url],
        abi.encodeWithSelector(Gateway.getSignedBalance.selector, addr),
        ContractName.balanceOfWithProof.selector,
        abi.encode(addr)
    );
}

Note that in this example the contract is returning addr in both callData and extraData, because it is required both by the gateway (in order to look up the data) and the callback function (in order to verify it). The contract cannot simply pass it to the gateway and rely on it being returned in the response, as this would give the gateway an opportunity to respond with an answer to a different query than the one that was initially issued.

Recursive calls in CCIP-aware contracts

When a CCIP-aware contract wishes to make a call to another contract, and the possibility exists that the callee may implement CCIP read, the calling contract MUST catch all OffchainLookup errors thrown by the callee, and revert with a different error if the sender field of the error does not match the callee address.

The contract MAY choose to replace all OffchainLookup errors with a different error. Doing so avoids the complexity of implementing support for nested CCIP read calls, but renders them impossible.

Where the possibility exists that a callee implements CCIP read, a CCIP-aware contract MUST NOT allow the default solidity behaviour of bubbling up reverts from nested calls. This is to prevent the following situation:

  1. Contract A calls non-CCIP-aware contract B.
  2. Contract B calls back to A.
  3. In the nested call, A reverts with OffchainLookup.
  4. Contract B does not understand CCIP read and propagates the OffchainLookup to its caller.
  5. Contract A also propagates the OffchainLookup to its caller.

The result of this sequence of operations would be an OffchainLookup that looks valid to the client, as the sender field matches the address of the contract that was called, but does not execute correctly, as it only completes a nested invocation.

Example

The code below demonstrates one way that a contract may support nested CCIP read invocations. For simplicity this is shown using Solidity's try/catch syntax, although as of this writing it does not yet support catching custom errors.

contract NestedLookup {
    error InvalidOperation();
    error OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData);

    function a(bytes calldata data) external view returns(bytes memory) {
        try target.b(data) returns (bytes memory ret) {
            return ret;
        } catch OffchainLookup(address sender, string[] urls, bytes callData, bytes4 callbackFunction, bytes extraData) {
            if(sender != address(target)) {
                revert InvalidOperation();
            }
            revert OffchainLookup(
                address(this),
                urls,
                callData,
                NestedLookup.aCallback.selector,
                abi.encode(address(target), callbackFunction, extraData)
            );
        }
    }

    function aCallback(bytes calldata response, bytes calldata extraData) external view returns(bytes memory) {
        (address inner, bytes4 innerCallbackFunction, bytes memory innerExtraData) = abi.decode(extraData, (address, bytes4, bytes));
        return abi.decode(inner.call(abi.encodeWithSelector(innerCallbackFunction, response, innerExtraData)), (bytes));
    }
}

Gateway Interface

The URLs returned by a contract may be of any schema, but this specification only defines how clients should handle HTTPS URLs.

Given a URL template returned in an OffchainLookup, the URL to query is composed by replacing sender with the lowercase 0x-prefixed hexadecimal formatted sender parameter, and replacing data with the the 0x-prefixed hexadecimal formatted callData parameter.

For example, if a contract returns the following data in an OffchainLookup:

urls = ["https://example.com/gateway/{sender}/{data}.json"]
sender = "0xaabbccddeeaabbccddeeaabbccddeeaabbccddee"
callData = "0x00112233"

The request URL to query is https://example.com/gateway/0xaabbccddeeaabbccddeeaabbccddeeaabbccddee/0x00112233.json.

If the URL template contains the {data} substitution parameter, the client MUST send a GET request after replacing the substitution parameters as described above.

If the URL template does not contain the {data} substitution parameter, the client MUST send a POST request after replacing the substitution parameters as described above. The POST request MUST be sent with a Content-Type of application/json, and a payload matching the following schema:

{
    "type": "object",
    "properties": {
        "data": {
            "type": "string",
            "description": "0x-prefixed hex string containing the `callData` from the contract"
        },
        "sender": {
            "type": "string",
            "description": "0x-prefixed hex string containing the `sender` parameter from the contract"
        }
    }
}

Compliant gateways MUST respond with a Content-Type of application/json, with the body adhering to the following JSON schema:

{
    "type": "object",
    "properties": {
        "data": {
            "type": "string",
            "description: "0x-prefixed hex string containing the result data."
        }
    }
}

Unsuccessful requests MUST return the appropriate HTTP status code - for example, 404 if the sender address is not supported by this gateway, 400 if the callData is in an invalid format, 500 if the server encountered an internal error, and so forth. If the Content-Type of a 4xx or 5xx response is application/json, it MUST adhere to the following JSON schema:

{
    "type": "object",
    "properties": {
        "message": {
            "type": "string",
            "description: "A human-readable error message."
        }
    }
}

Examples

GET request

# Client returned a URL template `https://example.com/gateway/{sender}/{data}.json`
# Request
curl -D - https://example.com/gateway/0x226159d592E2b063810a10Ebf6dcbADA94Ed68b8/0xd5fa2b00.json

# Successful result
    HTTP/2 200
    content-type: application/json; charset=UTF-8
    ...
    
    {"data": "0xdeadbeefdecafbad"}

# Error result
    HTTP/2 404
    content-type: application/json; charset=UTF-8
    ...

    {"message": "Gateway address not supported."}
}

POST request

# Client returned a URL template `https://example.com/gateway/{sender}.json`
# Request
curl -D - -X POST -H "Content-Type: application/json" --data '{"data":"0xd5fa2b00","sender":"0x226159d592E2b063810a10Ebf6dcbADA94Ed68b8"}' https://example.com/gateway/0x226159d592E2b063810a10Ebf6dcbADA94Ed68b8.json

# Successful result
    HTTP/2 200
    content-type: application/json; charset=UTF-8
    ...
    
    {"data": "0xdeadbeefdecafbad"}

# Error result
    HTTP/2 404
    content-type: application/json; charset=UTF-8
    ...

    {"message": "Gateway address not supported."}
}

Clients MUST support both GET and POST requests. Gateways may implement either or both as needed.

Client Lookup Protocol

A client that supports CCIP read MUST make contract calls using the following process:

  1. Set data to the call data to supply to the contract, and to to the address of the contract to call.
  2. Call the contract at address to function normally, supplying data as the input data. If the function returns a successful result, return it to the caller and stop.
  3. If the function returns an error other than OffchainLookup, return it to the caller in the usual fashion.
  4. Otherwise, decode the sender, urls, callData, callbackFunction and extraData arguments from the OffchainLookup error.
  5. If the sender field does not match the address of the contract that was called, return an error to the caller and stop.
  6. Construct a request URL by replacing sender with the lowercase 0x-prefixed hexadecimal formatted sender parameter, and replacing data with the the 0x-prefixed hexadecimal formatted callData parameter. The client may choose which URLs to try in which order, but SHOULD prioritise URLs earlier in the list over those later in the list.
  7. Make an HTTP GET request to the request URL.
  8. If the response code from step (5) is in the range 400-499, return an error to the caller and stop.
  9. If the response code from step (5) is in the range 500-599, go back to step (5) and pick a different URL, or stop if there are no further URLs to try.
  10. Otherwise, replace data with an ABI-encoded call to the contract function specified by the 4-byte selector callbackFunction, supplying the data returned from step (7) and extraData from step (4), and return to step (1).

Clients MUST handle HTTP status codes appropriately, employing best practices for error reporting and retries.

Clients MUST handle HTTP 4xx and 5xx error responses that have a content type other than application/json appropriately; they MUST NOT attempt to parse the response body as JSON.

This protocol can result in multiple lookups being requested by the same contract. Clients MUST implement a limit on the number of lookups they permit for a single contract call, and this limit SHOULD be at least 4.

The lookup protocol for a client is described with the following pseudocode:

async function httpcall(urls, to, callData) {
    const args = {sender: to.toLowerCase(), data: callData.toLowerCase()};
    for(const url of urls) {
        const queryUrl = url.replace(/\{([^}]*)\}/g, (match, p1) => args[p1]);
        // First argument is URL to fetch, second is optional data for a POST request.
        const response = await fetch(queryUrl, url.includes('{data}') ? undefined : args);
        const result = await response.text();
        if(result.statusCode >= 400 && result.statusCode <= 499) {
            throw new Error(data.error.message);
        }
        if(result.statusCode >= 200 && result.statusCode <= 299) {
            return result;
        }
    }
}
async function durin_call(provider, to, data) {
    for(let i = 0; i < 4; i++) {
        try {
            return await provider.call(to, data);
        } catch(error) {
            if(error.code !== "CALL_EXCEPTION") {
                throw(error);
            }
            const {sender, urls, callData, callbackFunction, extraData} = error.data;
            if(sender !== to) {
                throw new Error("Cannot handle OffchainLookup raised inside nested call");
            }
            const result = httpcall(urls, to, callData);
            data = abi.encodeWithSelector(callbackFunction, result, extraData);
        }
    }
    throw new Error("Too many CCIP read redirects");
}

Where:

  • provider is a provider object that facilitates Ethereum blockchain function calls.
  • to is the address of the contract to call.
  • data is the call data for the contract.

If the function being called is a standard contract function, the process terminates after the original call, returning the same result as for a regular call. Otherwise, a gateway from urls is called with the callData returned by the OffchainLookup error, and is expected to return a valid response. The response and the extraData are then passed to the specified callback function. This process can be repeated if the callback function returns another OffchainLookup error.

Use of CCIP read for transactions

While the specification above is for read-only contract calls (eg, eth_call), it is simple to use this method for sending transactions (eg, eth_sendTransaction or eth_sendRawTransaction) that require offchain data. While 'preflighting' a transaction using eth_estimateGas or eth_call, a client that receives an OffchainLookup revert can follow the procedure described above in Client lookup protocol, substituting a transaction for the call in the last step. This functionality is ideal for applications such as making onchain claims supported by offchain proof data.

Glossary

  • Client: A process, such as JavaScript executing in a web browser, or a backend service, that wishes to query a blockchain for data. The client understands how to fetch data using CCIP read.
  • Contract: A smart contract existing on Ethereum or another blockchain.
  • Gateway: A service that answers application-specific CCIP read queries, usually over HTTPS.

Rationale

Use of revert to convey call information

For offchain data lookup to function as desired, clients must either have some way to know that a function depends on this specification for functionality - such as a specifier in the ABI for the function - or else there must be a way for the contract to signal to the client that data needs to be fetched from elsewhere.

While specifying the call type in the ABI is a possible solution, this makes retrofitting existing interfaces to support offchain data awkward, and either results in contracts with the same name and arguments as the original specification, but with different return data - which will cause decoding errors for clients that do not expect this - or duplicating every function that needs support for offchain data with a different name (eg, balanceOf -> offchainBalanceOf). Neither solutions is particularly satisfactory.

Using a revert, and conveying the required information in the revert data, allows any function to be retrofitted to support lookups via CCIP read so long as the client understands the specification, and so facilitates translation of existing specifications to use offchain data.

Passing contract address to the gateway service

address is passed to the gateway in order to facilitate the writing of generic gateways, thus reducing the burden on contract authors to provide their own gateway implementations. Supplying address allows the gateway to perform lookups to the original contract for information needed to assist with resolution, making it possible to operate one gateway for any number of contracts implementing the same interface.

Existence of extraData argument

extraData allows the original contract function to pass information to a subsequent invocation. Since contracts are not persistent, without this data a contract has no state from the previous invocation. Aside from allowing arbitrary contextual information to be propagated between the two calls, this also allows the contract to verify that the query the gateway answered is in fact the one the contract originally requested.

Use of GET and POST requests for the gateway interface

Using a GET request, with query data encoded in the URL, minimises complexity and enables entirely static implementations of gateways - in some applications a gateway can simply be an HTTP server or IPFS instance with a static set of responses in text files.

However, URLs are limited to 2 kilobytes in size, which will impose issues for more complex uses of CCIP read. Thus, we provide for an option to use POST data. This is made at the contract's discretion (via the choice of URL template) in order to preserve the ability to have a static gateway operating exclusively using GET when desired.

Backwards Compatibility

Existing contracts that do not wish to use this specification are unaffected. Clients can add support for CCIP read to all contract calls without introducing any new overhead or incompatibilities.

Contracts that require CCIP read will not function in conjunction with clients that do not implement this specification. Attempts to call these contracts from non-compliant clients will result in the contract throwing an exception that is propagaged to the user.

Security Considerations

Gateway Response Data Validation

In order to prevent a malicious gateway from causing unintended side-effects or faulty results, contracts MUST include sufficient information in the extraData argument to allow them to verify the relevance and validity of the gateway's response. For example, if the contract is requesting information based on an address supplied to the original call, it MUST include that address in the extraData so that the callback can verify the gateway is not providing the answer to a different query.

Contracts must also implement sufficient validation of the data returned by the gateway to ensure it is valid. The validation required is application-specific and cannot be specified on a global basis. Examples would include verifying a Merkle proof of inclusion for an L2 or other Merkleized state, or verifying a signature by a trusted signer over the response data.

Client Extra Data Validation

In order to prevent a malicious client from causing unintended effects when making transactions using CCIP read, contracts MUST implement appropriate checks on the extraData returned to them in the callback. Any sanity/permission checks performed on input data for the initial call MUST be repeated on the data passed through the extraData field in the callback. For example, if a transaction should only be executable by an authorised account, that authorisation check MUST be done in the callback; it is not sufficient to perform it with the initial call and embed the authorised address in the extraData.

HTTP requests and fingerprinting attacks

Because CCIP read can cause a user's browser to make HTTP requests to an address controlled by the contract, there is the potential for this to be used to identify users - for example, to associate their wallet address with their IP address.

The impact of this is application-specific; fingerprinting a user when they resolve an ENS domain may have little privacy impact, as the attacker will not learn the user's wallet address, only the fact that the user is resolving a given ENS name from a given IP address - information they can also learn from running a DNS server. On the other hand, fingerprinting a user when they attempt a transaction to transfer an NFT may give an attacker everything they need to identify the IP address of a user's wallet.

To minimise the security impact of this, we make the following recommendations:

  1. Client libraries should provide clients with a hook to override CCIP read calls - either by rewriting them to use a proxy service, or by denying them entirely. This mechanism or another should be written so as to easily facilitate adding domains to allowlists or blocklists.
  2. Client libraries should disable CCIP read for transactions (but not for calls) by default, and require the caller to explicitly enable this functionality. Enablement should be possible both on a per-contract, per-domain, or global basis.
  3. App authors should not supply a 'from' address for contract calls ('view' operations) where the call could execute untrusted code (that is, code not authored or trusted by the application author). As a precuationary principle it is safest to not supply this parameter at all unless the author is certain that no attacker-determined smart contract code will be executed.
  4. Wallet authors that are responsible for fetching user information - for example, by querying token contracts - should either ensure CCIP read is disabled for transactions, and that no contract calls are made with a 'from' address supplied, or operate a proxy on their users' behalf, rewriting all CCIP read calls to take place via the proxy, or both.

We encourage client library authors and wallet authors not to disable CCIP read by default, as many applications can be transparently enhanced with this functionality, which is quite safe if the above precautions are observed.

Copyright and related rights waived via CC0.