Ethereum researcher: Native rollups—superpowers from L1 execution

Reprinted from jinse

01/23/2025·3M

Author: Ethereum researcher Justin Drake, ethresearch; Compiled by: Tao Zhu, Golden Finance

Credit for this article goes to the broader Ethereum R&D community. Key contributions date from 2017, with significant incremental design unlocks over the years. Recent zkVM engineering breakthroughs prompted a thorough exploration of the design space. This article is merely a best-effort attempt to piece together a coherent design for a big idea whose time may finally have come.

Summary

We propose an elegant and powerful EXECUTE precompile that exposes the native L1 EVM execution engine to the application layer. A native execution rollup ("native rollup" for short) is a rollup that uses EXECUTE to verify EVM state transitions for batches of user transactions. Native rollups can be thought of as "programmable execution shards" that wrap the precompile in a derivation function to handle system logic outside the EVM, such as sequencing, bridging, forced inclusion, and governance.

Since the EXECUTE precompile is enforced directly by validators, it enjoys (zk)EL client diversity and provides EVM equivalence that is bug-free by construction and forward compatible with EVM upgrades made through L1 hard forks. For EVM-equivalent rollups that wish to fully inherit the security of Ethereum, a form of EVM introspection like the EXECUTE precompile is necessary. We call rollups that fully inherit the security of Ethereum "trustless rollups."

The EXECUTE precompile greatly simplifies the development of EVM-equivalent rollups, as no complex infrastructure (e.g. fraud proof games, SNARK circuits, security councils) is required for EVM emulation and maintenance. With EXECUTE, minimal native rollups and based rollups can be deployed in just a few lines of Solidity code using a simple derivation function, eliminating the need for special handling of sequencing, forced inclusion, or governance.

Best of all, native rollups can enjoy real-time settlement without having to worry about real-time proving, which greatly simplifies synchronous composability.

This article is divided into two parts: the first describes the proposed precompile, and the second discusses native rollups.

Part 1 - The EXECUTE Precompile

Structure

The EXECUTE precompile takes the inputs pre_state_root, post_state_root, trace and gas_used. It returns true if and only if the following conditions are met:

trace is a well-formed execution trace (e.g. a list of L2 transactions and the corresponding state access proofs)
The stateless execution of trace starts from pre_state_root and ends at post_state_root
The stateless execution of trace consumes exactly gas_used gas
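
The three conditions above can be illustrated with a small Python toy model. This is only a sketch under strong assumptions, not the precompile itself: the "state" is a dictionary of balances, the state root is a plain SHA-256 hash rather than a Merkle root, the trace carries the entire pre-state as its witness instead of state access proofs, and every transaction is a transfer costing a made-up flat GAS_PER_TRANSFER. All names other than the four inputs are hypothetical.

```python
import hashlib
import json

GAS_PER_TRANSFER = 21_000  # assumption: flat gas cost per toy transaction


def state_root(state: dict) -> str:
    """Toy state commitment: a hash of the sorted (account, balance) pairs."""
    encoded = json.dumps(sorted(state.items())).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute(pre_state_root: str, post_state_root: str, trace: dict, gas_used: int) -> bool:
    """Return True iff all three conditions from the text hold for this toy model."""
    # Condition 1: the trace is well formed (a witness plus a transaction list).
    if not isinstance(trace, dict) or set(trace) != {"witness", "txs"}:
        return False
    witness, txs = trace["witness"], trace["txs"]
    if not isinstance(witness, dict) or not isinstance(txs, list):
        return False
    # Stateless execution: the only state we use is the witness, which must
    # match the claimed pre-state root.
    if state_root(witness) != pre_state_root:
        return False
    state = dict(witness)
    gas = 0
    for tx in txs:
        if not isinstance(tx, (list, tuple)) or len(tx) != 3:
            return False
        sender, recipient, amount = tx
        if not isinstance(amount, int) or amount <= 0 or state.get(sender, 0) < amount:
            return False
        state[sender] -= amount
        state[recipient] = state.get(recipient, 0) + amount
        gas += GAS_PER_TRANSFER
    # Condition 2: execution ends at post_state_root.
    # Condition 3: execution consumes exactly gas_used gas.
    return state_root(state) == post_state_root and gas == gas_used


pre = {"alice": 10, "bob": 0}
post = {"alice": 7, "bob": 3}
trace = {"witness": pre, "txs": [("alice", "bob", 3)]}
print(execute(state_root(pre), state_root(post), trace, 21_000))
```

Changing gas_used or either root by any amount makes the call return false, which mirrors the "if and only if" wording of the definition.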

There is an EIP-1559-style mechanism for metering and pricing the cumulative gas consumed by all EXECUTE calls in an L1 block. Specifically, there is a cumulative gas limit EXECUTE_CUMULATIVE_GAS_LIMIT, and a cumulative gas target EXECUTE_CUMULATIVE_GAS_TARGET. (Once the L1 EVM can be statelessly enforced by validators, the cumulative limit and target could be merged with the L1 EIP-1559 mechanism.)

Calling the precompile costs a fixed amount of L1 gas, EXECUTE_GAS_COST, plus gas_used * gas_price, where gas_price (denominated in ETH/gas) is set by the EIP-1559-style mechanism. The full payment is charged upfront, even if the precompile returns false.
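
The pricing described above can be sketched in Python. The article does not specify the update rule or any constants, so everything numeric here is an assumption: the update function borrows the EIP-1559 base fee formula (including its denominator of 8), the values of EXECUTE_GAS_COST, the target and the limit are placeholders, and converting the fixed L1 gas charge to wei with an l1_gas_price parameter is one reading of the text.

```python
EXECUTE_GAS_COST = 100_000                   # assumption: fixed L1 gas per call
EXECUTE_CUMULATIVE_GAS_TARGET = 15_000_000   # assumption: per-block target
EXECUTE_CUMULATIVE_GAS_LIMIT = 30_000_000    # assumption: per-block limit
PRICE_CHANGE_DENOMINATOR = 8                 # assumption: borrowed from EIP-1559


def next_gas_price(gas_price: int, cumulative_gas_used: int) -> int:
    """EIP-1559-style update of the EXECUTE gas price (wei per gas) after a block."""
    if cumulative_gas_used > EXECUTE_CUMULATIVE_GAS_LIMIT:
        raise ValueError("block exceeds EXECUTE_CUMULATIVE_GAS_LIMIT")
    target = EXECUTE_CUMULATIVE_GAS_TARGET
    if cumulative_gas_used == target:
        return gas_price
    delta = gas_price * abs(cumulative_gas_used - target) // target // PRICE_CHANGE_DENOMINATOR
    if cumulative_gas_used > target:
        return gas_price + max(delta, 1)  # price rises when usage is above target
    return gas_price - delta              # price falls when usage is below target


def execute_call_fee(gas_used: int, gas_price: int, l1_gas_price: int) -> int:
    """Fee in wei for one EXECUTE call; charged in full even if the call returns false."""
    return EXECUTE_GAS_COST * l1_gas_price + gas_used * gas_price


# A block that uses the full cumulative limit raises the price by 12.5%.
print(next_gas_price(1_000_000_000, EXECUTE_CUMULATIVE_GAS_LIMIT))
```

Under these assumed constants, sustained usage above the target pushes gas_price up block after block, which is the congestion signal an EIP-1559-style mechanism is meant to provide.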

The trace must point to available Ethereum data in calldata, blobs, state, or memory.

Enforcement by re-execution

If EXECUTE_CUMULATIVE_GAS_LIMIT is small enough, validators can simply re-execute traces to enforce the correctness of EXECUTE calls. An initial deployment of the precompile based on re-execution can serve as a stepping stone, much as proto-danksharding, with its naive re-downloading of blobs, is a stepping stone to full danksharding. Note that naive re-execution imposes no state growth or bandwidth overhead on validators, and any execution overhead can be parallelized across CPU cores.

Validators must hold an explicit copy of the trace for re-execution, which rules out pointers to blob data that is sampled via DAS (rather than downloaded). Note that optimistic native rollups may still post rollup data in blobs, falling back to calldata only in the fraud proof game. Also note that optimistic native rollups can have a gas limit well in excess of EXECUTE_CUMULATIVE_GAS_LIMIT, since the EXECUTE precompile only needs to be called once, on a small EVM segment, to settle a fraud proof challenge.

As a historical note, in 2017 Vitalik proposed a similar "EVM inside EVM" precompile called EXECTX.

Enforcement by SNARKs

To unlock a larger EXECUTE_CUMULATIVE_GAS_LIMIT, it is natural to have validators optionally verify SNARK proofs. From now on, we assume one-slot delayed execution, where invalid blocks (or invalid transactions) are treated as no-ops. (For more information on delayed execution, see this ethresearch post, this EIP, and this design by Francesco.) One-slot delayed execution leaves several seconds (a whole slot) for proving. It also avoids incentivizing MEV-driven proof races, which would introduce a centralization vector.

Note that even if EXECUTE is enforced by SNARKs, no explicit proof system or circuit is enshrined in consensus. (Note that the EXECUTE precompile does not take any explicit proof as input.) Instead, each staking operator is free to choose their favorite zkEL verifier client(s), similar to how EL clients are subjectively chosen today. The benefits of this design decision are explained in the next section, on off-chain proofs.

From now on, we assume that execution proposers are sophisticated, in the context of attester-proposer separation (APS) with alternating execution and consensus slots. To incentivize rational execution proposers to generate proofs in a timely manner (within 1 slot), we require attesters to attest to execution block n+1 only if proofs for execution block n are available. (We recommend bundling block n+1 with the EXECUTE proofs for block n at the p2p layer.) An execution proposer who skips proving may miss their slot, resulting in missed fees and MEV. We further impose a fixed penalty on missed execution slots, set high enough (e.g. 1 ETH) to always exceed the cost of proving.
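
The incentive above can be written out as a short Python sketch, reading the rule as an attester-side check. Function and variable names are hypothetical, the treatment of block 0 is an assumption, and the payoff model simplifies "may miss their slot" to "does miss their slot".

```python
MISSED_SLOT_PENALTY_WEI = 10**18  # the 1 ETH example penalty from the text


def should_attest(block_number: int, proven_blocks: set) -> bool:
    """Attest to execution block n+1 only if a proof for block n is available."""
    if block_number == 0:
        return True  # assumption: the first block has no parent to prove
    return (block_number - 1) in proven_blocks


def proposer_payoff(fees_and_mev: int, proving_cost: int, proves: bool) -> int:
    """Simplified payoff in wei for the proposer of block n+1.

    Proving block n lets block n+1 gather attestations; skipping the proof
    is modeled as a missed slot with no fees or MEV plus the fixed penalty.
    """
    if proves:
        return fees_and_mev - proving_cost
    return -MISSED_SLOT_PENALTY_WEI


# Even with zero fees and MEV, proving beats skipping as long as the
# penalty exceeds the proving cost.
cost = 5 * 10**17
print(proposer_payoff(0, cost, proves=True) > proposer_payoff(0, cost, proves=False))
```

The comparison shows why the penalty is set above the cost of proving: a rational proposer then prefers to prove regardless of how little the slot is worth.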

Note that in the context of APS, the production of consensus blocks is not blocked by missed execution slots. Timely proof generation does matter for light clients, however, so that they can easily read state at the chain tip without stateless re-execution. To ensure proofs are generated in time for light clients, even in the exceptional case where the next execution proposer misses their slot, we rely on an altruistic-minority prover assumption. A single altruistic prover is enough to generate proofs within 1 slot. To avoid unnecessary redundant proving, most altruistic provers can wait on standby and kick in only if no proof arrives within 1 slot, thus acting as a failsafe with a latency of at most 2 slots.

Note that EXECUTE_CUMULATIVE_GAS_LIMIT needs to be set low enough for the altruistic-minority prover assumption to be credible (and for execution proposing not to demand unrealistic sophistication). A conservative policy could be to set EXECUTE_CUMULATIVE_GAS_LIMIT so that single-slot proving is accessible to laptops (such as high-end MacBook Pros). A more pragmatic and aggressive policy might be to target a small cluster of GPUs, and possibly eventually SNARK ASIC provers once they are sufficiently commoditized.

Off-chain proofs

To reiterate, we recommend not placing zkEL EXECUTE proofs on-chain, but sharing them off-chain. Not enshrining proofs is an elegant idea, first proposed by Vitalik, and it has several advantages:

Diversity: Validators are free to choose proof verifiers (including proof systems and circuits) from development teams they trust, similar to how validators choose EL clients they trust. This provides robustness through diversity. zkEL verifier clients (and, for some of them, the underlying zkVMs) are complex pieces of cryptographic software. A bug in any one client should not take down Ethereum.
Neutrality: Having a market of zkEL verifier clients means the consensus layer does not have to pick technology winners. For example, the zkVM market is highly competitive, and selecting a winning vendor (such as Risc0, Succinct, or one of the many others) may not be perceived as neutral.
Simplicity: The consensus layer does not need to enshrine a specific SNARK verifier, which greatly simplifies its specification. Only a format for state access proofs needs to be enshrined, not the implementation details of a specific proof verifier.
Flexibility: If a bug or optimization is discovered, affected validators can update their clients without the need for a hard fork.

Having off-chain proofs does come with some manageable complications:

Prover load and p2p fragmentation: Since there is no single canonical proof, multiple proofs need to be generated (at least one per zkEL client). Each zkEL client customization (such as swapping one RISC-V zkVM for another) requires a different proof. Likewise, each zkEL version upgrade requires a different proof. This increases the proving load. It would also fragment the p2p network if there were a separate gossip channel for each proof type.
Minority zkELs: It is difficult to incentivize proof generation for minority zkELs. Rational execution proposers may generate only enough proofs to reach a supermajority of attesters and avoid missing their slot. To counter this, staking operators can be socially encouraged to run multiple zkEL clients in parallel, similar to today's Vouch operators. Running a k-of-n setup has the added benefit of improving security, in particular by hedging against soundness bugs that would allow an attacker to craft proofs for arbitrary EXECUTE calls (a situation that would be unusual for a traditional EL client).
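
The k-of-n setup mentioned above can be sketched in a few lines of Python. This is an illustration under assumptions, not a client design: each zkEL verifier is modeled as a hypothetical function from proof bytes to a boolean, each verifier has its own proof (or None if none has arrived), and an EXECUTE call is treated as valid only when at least k verifiers accept.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: a zkEL verifier maps proof bytes to accept/reject.
Verifier = Callable[[bytes], bool]


def k_of_n_accepts(checks: List[Tuple[Verifier, Optional[bytes]]], k: int) -> bool:
    """Accept only if at least k of the n (verifier, proof) pairs verify."""
    accepted = sum(
        1 for verify, proof in checks if proof is not None and verify(proof)
    )
    return accepted >= k


def sound_verifier(proof: bytes) -> bool:
    # Stand-in for a correct verifier: only the genuine proof passes.
    return proof == b"valid"


def buggy_verifier(proof: bytes) -> bool:
    # Stand-in for a verifier with a soundness bug: everything passes.
    return True


forged = [(sound_verifier, b"forged"), (sound_verifier, b"forged"), (buggy_verifier, b"forged")]
print(k_of_n_accepts(forged, k=2))  # the forged proof fools only one of three clients
```

With k = 1 the single buggy client would be enough for the forged proof to pass, which is the failure mode the k-of-n setup hedges against.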

Off-chain proofs also introduce inefficiencies for real-time settled L2s:

No alt DA: Since the trace input of EXECUTE needs to have been made available to L1 validators, real-time settled L2s (i.e. L2s that immediately update their canonical state root) must consume L1 DA, that is, be rollups. Note that optimistic L2s that delay settlement through a fraud proof game do not have this restriction and can be validiums.
State access overhead: Since the trace must be statelessly executable, it must include the state trie leaves that are read or written, which introduces a small DA overhead over a typical L2 block. Note that optimistic L2s do not have this restriction, since the state trie leaves are only required in fraud proof challenges, and the challenger can recompute the trie leaves.
No state diffing: Since proving should be permissionless given the trace, rollup state diffing is not possible. However, stateless access proofs or EVM transaction signatures could be compressed if corresponding specialized proofs were enshrined in consensus.

RISC-V native execution

Given today's de facto convergence on RISC-V zkVMs, there may be an opportunity to expose RISC-V state transitions natively to the EVM (similar to WASM in Arbitrum Stylus) while remaining SNARK-friendly.

Part 2 - Native Rollups

Naming

We first discuss the naming of native rollups to clear up several sources of confusion:

Alternative names: Native rollups were formerly called enshrined rollups. (The term "canonical rollup" was also briefly used by Polynya.) The term "enshrined" was later abandoned in favor of "native" to indicate that existing EVM-equivalent rollups have the option of upgrading to become native. The name "native" was independently proposed in November 2022 by Dan Robinson and a Lido contributor who wished to remain anonymous.
Based rollups: Based rollups and native rollups are orthogonal concepts: "based" relates to L1 sequencing, while "native" relates to L1 execution. A rollup that is both based and native is whimsically called an "ultra sound rollup."
Execution shards: Execution shards (i.e. enshrined copies of an L1 EVM chain) are a different but related concept, predating native rollups by several years. (Execution sharding was previously "Phase 2" of the Ethereum 2.0 roadmap.) Unlike native rollups, execution shards are not programmable, meaning there are no options for custom governance, custom sequencing, custom gas tokens, etc. Execution shards are also typically instantiated in a fixed number (such as 64 or 1,024 shards). Unfortunately, Martin Köppelmann used the term "native L2" in his 2024 Devcon talk on execution sharding.

Benefits

Native Rollups have several benefits, which we’ll detail below:

Simplicity: Much of the complexity of a native rollup VM can be encapsulated by the precompile. Today's EVM-equivalent optimistic and zk-rollups have thousands of lines of code for their fraud proof games or SNARK verifiers, which could collapse into a single line of code. Native rollups also do not require supporting infrastructure such as proving networks, watchtowers, and security councils.
Security: Building a bug-free EVM fraud proof game or SNARK verifier is a very difficult engineering task that likely requires deep formal verification. Today, every optimistic and zk EVM rollup most likely has critical vulnerabilities in its EVM state transition function. To guard against vulnerabilities, centralized sequencing is often used as a crutch to gate the production of adversarially crafted blocks. The native execution precompile allows permissionless sequencing to be deployed safely. Trustless rollups that fully inherit L1 security also fully inherit L1 asset fungibility.
EVM equivalence: Today, the only way for a rollup to stay in sync with L1 EVM rules is for governance (usually a security council and/or a governance token) to mirror L1 EVM upgrades. (EVM updates still occur regularly via hard forks, about once a year.) Not only is governance an attack vector, it is strictly speaking a departure from the L1 EVM and prevents any rollup from achieving true long-term EVM equivalence. Native rollups, on the other hand, can upgrade in step with L1 without governance.
SNARK gas cost: Verifying SNARKs on-chain is expensive. As a result, many zk-rollups settle infrequently to minimize costs. Since SNARKs are not verified on-chain, the EXECUTE precompile can be used to lower the cost of verification. If the EXECUTE proofs for multiple calls in a block are batched using SNARK recursion, EXECUTE_GAS_COST can be set relatively low.
Synchronous composability: Today, synchronous composability with L1 requires same-slot real-time proving. Achieving ultra-low-latency proving (e.g. around 100 milliseconds) is a particularly challenging engineering task for zk rollups. With a one-slot delayed state root, the proving latency underlying the native execution precompile can be relaxed to a full slot.
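
Returning to the simplicity benefit listed above, the following Python model hints at how little logic a minimal native rollup needs once EXECUTE exists. It is a sketch under assumptions rather than a deployable contract: it is written in Python instead of Solidity, the class and method names are hypothetical, the EXECUTE precompile is passed in as a callable, and the derivation function is trivial (anyone may advance the state), so sequencing, bridging, and forced inclusion are left out.

```python
from typing import Callable

# Hypothetical stand-in for the precompile:
# (pre_state_root, post_state_root, trace, gas_used) -> bool
ExecuteFn = Callable[[bytes, bytes, bytes, int], bool]


class MinimalNativeRollup:
    """Tracks a canonical state root and advances it only via EXECUTE."""

    def __init__(self, genesis_state_root: bytes, execute: ExecuteFn) -> None:
        self.state_root = genesis_state_root
        self._execute = execute

    def advance(self, post_state_root: bytes, trace: bytes, gas_used: int) -> None:
        # The single call that replaces a fraud proof game or SNARK verifier.
        if not self._execute(self.state_root, post_state_root, trace, gas_used):
            raise ValueError("EXECUTE returned false")
        self.state_root = post_state_root


def stub_execute(pre: bytes, post: bytes, trace: bytes, gas_used: int) -> bool:
    # Placeholder for illustration only: accepts a fixed trace and gas value.
    return trace == b"ok" and gas_used == 21_000


rollup = MinimalNativeRollup(b"root0", stub_execute)
rollup.advance(b"root1", b"ok", 21_000)
print(rollup.state_root)
```

A real deployment would wrap advance in a derivation function that decides who may submit batches and where the trace data lives; the state transition check itself stays a single call.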