Commit d6e0255

Update encoding scheme
1 parent 462c8c3 commit d6e0255

1 file changed: +24 -37 lines changed
spec/eofv0_verkle.md

@@ -111,50 +111,34 @@ Alternate option is instead of encoding all valid `JUMPDEST` locations, to only
 By invalid `JUMPDEST` we mean a `0x5b` byte in any pushdata.
 
 This is beneficial if our assumption is correct that most contracts only contain a limited number
-of offending cases. Our initial analysis suggests this is the case, e.g. Uniswap router has 9 cases,
-one of the Arbitrum validator contracts has 6 cases.
+of offending cases. Our initial analysis of the top 1000 used bytecodes suggests this is the case:
+only 0.07% of bytecode bytes are invalid jumpdests.
 
-Since Solidity contracts have a trailing metadata, which contains a Keccak-256 (32-byte) hash of the
-source, there is a 12% probability ($1 - (255/256)^{32}$) that at least one of the bytes of the hash
-will contain the `0x5b` value, which gives our minimum probability of having at least one invalid
-`JUMPDEST` in the contract.
-
-Let's create a map of `invalid_jumpdests[chunk_no] = first_instruction_offset`. We can densely encode this
-map using techniques similar to *run-length encoding* to skip distances and delta-encode offsets.
+Let's create a map of `invalid_jumpdests[chunk_index] = first_instruction_offset`. We can densely encode this
+map using techniques similar to *run-length encoding* to skip distances and delta-encode indexes.
 This map is always fully loaded prior to execution, and so it is important to ensure the encoded
 version is as dense as possible (without sacrificing on complexity).
 
-In *scheme 1*, for each entry in `invalid_jumpdests`:
+We propose the encoding using fixed-size 8-bit elements.
+For each entry in `invalid_jumpdests`:
 - 1-bit mode (`skip`, `value`)
 - For skip-mode:
-  - 10-bit number of chunks to skip
+  - 7-bit number of chunks to skip
 - For value-mode:
-  - 6-bit `first_instruction_offset`
+  - 7-bit number combining number of chunks to skip `s` and `first_instruction_offset`
+    produced as `s * 33 + first_instruction_offset`
 
-Worst case encoding where each chunk contains an invalid `JUMPDEST`:
-```
-total_chunk_count = 24576 / 32 = 768
-total_chunk_count * (1 + 6) / 8 = 672 # bytes for the header, i.e. 2.7% overhead
-number_of_verkle_leafs = total_chunk_count / 32 = 21
-```
-
-*Scheme 2* differs slightly:
-- 1-bit mode (`skip`, `value`)
-- For skip-mode:
-  - 10-bit number of chunks to skip
-- For value-mode:
-  - 4-bit number of chunks to skip
-  - 6-bit `first_instruction_offset`
+For the worst case where each chunk contains an invalid `JUMPDEST` the encoding length is equal
+to the number of chunks in the code. I.e. the size overhead is 3.1%.
 
-Worst case encoding:
-```
-total_chunk_count = 24576 / 32 = 768
-total_chunk_count * (1 + 4 + 6) / 8 = 1056 # bytes for the header, i.e. 4.1% overhead
-number_of_verkle_leafs = total_chunk_count / 32 = 33
-```
+| code size limit | code chunks | encoding chunks |
+|-----------------|-------------|-----------------|
+| 24576           | 768         | 24              |
+| 32768           | 1024        | 32              |
+| 65536           | 2048        | 64              |
 
-The decision between *scheme 1* and *scheme 2*, as well as the best encoding sizes, can be determined
-through analysing existing code.
+Our current hunch is that in average contracts this results in a sub-1% overhead, while the worst case is 3.1%.
+This is strictly better than the 3.2% overhead of the current Verkle code chunking.
 
 #### Header location
 
@@ -165,9 +149,12 @@ This second option allows for the simplification of the `code_size` value, as it
 
 #### Runtime after Verkle
 
-During runtime execution two checks must be done in this order:
-1) Check if the destination is on the invalid list, and abort if so.
-2) Check if the value in the chunk is an actual `JUMPDEST`, and abort if not.
+During execution of a jump two checks must be done in this order:
+
+1. Check if the jump destination is the `JUMPDEST` opcode.
+2. Check if the jump destination chunk is in the `invalid_jumpdests` map.
+   If yes, the jumpdest analysis of the chunk must be performed
+   to confirm the jump destination is not push data.
 
 It is possible to reconstruct sparse account code prior to execution with all the submitted chunks of the transaction
 and perform `JUMPDEST`-validation to build up a relevant *valid `JUMPDEST` locations* map instead.
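
The 8-bit element encoding introduced by this commit can be illustrated with a minimal Python sketch. It is not the spec's reference implementation, and several details are assumptions the diff does not state: the mode bit is taken to be the high bit (set for `skip`), `first_instruction_offset` is taken to range over 0..32 (hence the factor 33), a skip count is interpreted as the number of chunks between consecutive entries, and the function names are hypothetical.

```python
CHUNK_SIZE = 32      # bytes per code chunk, matching 24576 / 32 = 768 in the spec
OFFSET_VALUES = 33   # assumption: first_instruction_offset takes values 0..32
SKIP_FLAG = 0x80     # assumption: high bit set marks a skip-mode element
MAX_7BIT = 0x7F


def encode_invalid_jumpdests(invalid_jumpdests: dict[int, int]) -> bytes:
    """Encode {chunk_index: first_instruction_offset} as 8-bit elements."""
    out = bytearray()
    next_chunk = 0  # first chunk index not yet covered by the encoding
    for chunk_index in sorted(invalid_jumpdests):
        offset = invalid_jumpdests[chunk_index]
        if not 0 <= offset < OFFSET_VALUES:
            raise ValueError("first_instruction_offset out of range")
        skip = chunk_index - next_chunk
        # Largest s for which s * 33 + offset still fits in 7 bits.
        max_inline_skip = (MAX_7BIT - offset) // OFFSET_VALUES
        while skip > max_inline_skip:
            step = min(skip, MAX_7BIT)
            out.append(SKIP_FLAG | step)  # skip-mode element
            skip -= step
        out.append(skip * OFFSET_VALUES + offset)  # value-mode element
        next_chunk = chunk_index + 1
    return bytes(out)


def decode_invalid_jumpdests(encoded: bytes) -> dict[int, int]:
    """Inverse of encode_invalid_jumpdests under the same assumptions."""
    result = {}
    next_chunk = 0
    for element in encoded:
        if element & SKIP_FLAG:
            next_chunk += element & MAX_7BIT
        else:
            skip, offset = divmod(element, OFFSET_VALUES)
            next_chunk += skip
            result[next_chunk] = offset
            next_chunk += 1
    return result


sample = {0: 5, 1: 32, 500: 3}
assert decode_invalid_jumpdests(encode_invalid_jumpdests(sample)) == sample

# Worst case: every chunk has an entry, so each entry costs exactly one byte.
worst_case = {i: 0 for i in range(24576 // CHUNK_SIZE)}
assert len(encode_invalid_jumpdests(worst_case)) == 768
```

Under these assumptions the worst case for a 24576-byte contract is 768 bytes of encoding, i.e. 768 / 32 = 24 chunks and 768 / 24576 ≈ 3.1% overhead, which agrees with the first row of the table in the diff.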

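The two-step jump check from the "Runtime after Verkle" hunk can be sketched in the same way. This is only an illustration: it assumes 32-byte chunks, an already decoded `invalid_jumpdests` dictionary, and a `first_instruction_offset` measured from the start of the chunk; `is_valid_jump` is a hypothetical name, not one used by the spec.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F
CHUNK_SIZE = 32  # assumption: same 32-byte chunks as in the size calculation


def is_valid_jump(code: bytes, dest: int, invalid_jumpdests: dict[int, int]) -> bool:
    # Check 1: the destination byte must be the JUMPDEST opcode.
    if dest >= len(code) or code[dest] != JUMPDEST:
        return False
    # Check 2: chunks absent from the map hold no 0x5b byte inside push data.
    chunk_index = dest // CHUNK_SIZE
    if chunk_index not in invalid_jumpdests:
        return True
    # Chunk-local jumpdest analysis, starting at the chunk's first instruction.
    pc = chunk_index * CHUNK_SIZE + invalid_jumpdests[chunk_index]
    while pc < dest:
        op = code[pc]
        pc += 1
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip the push data bytes
    # Landing exactly on dest means it is an instruction boundary, not push data.
    return pc == dest


# PUSH2 0x5b5b followed by a real JUMPDEST at offset 3.
code = bytes([0x61, 0x5B, 0x5B, 0x5B])
assert not is_valid_jump(code, 1, {0: 0})  # 0x5b inside push data
assert is_valid_jump(code, 3, {0: 0})      # genuine JUMPDEST
```

Only chunks listed in the map pay for the local analysis; for every other chunk the opcode comparison alone decides the jump.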