refactor(tests): General benchmark folder restructuring #1681

LouisTsai-Csie · 2025-10-25T15:15:23Z

🗒️ Description

Folder Restructure Proposal:

.
├── compute
│   ├── instructions
│   │   ├── arithmetic
│   │   ├── bitwise
│   │   ├── block
│   │   ├── comparison
│   │   ├── control_flow
│   │   ├── environment
│   │   ├── keccak
│   │   ├── log
│   │   ├── memory
│   │   ├── stack
│   │   ├── storage
│   │   └── system
│   ├── precompile
│   └── scenarios
│       └── ...
└── state
    ├── mainnet
    └── bloatnet

Generally, this structure matches the one under src/ethereum/forks/<forks> for consistency.

Besides the folder restructuring, also rename test name and update the docstring, related rules are as follows:

The doc string for each benchmark files (excluding helper functions), the format should be either Benchmark <category> instructions or Benchmark <precompile> precompiles.
The test name format should be test_<instruction/precompile>_<modifier> (_<modifier> is optional). The rationale behind this format is to enable user to quickly find related test. Example: if i want to know all the tests related to ADD, i could search test_add.

Note:

Some of the outdated comment is removed.
No logic changed for most of the test to prevent potential issue: it would be hard for verification now as we could not use opcode count script now. All the test logic changes are listed below:
I do not remove the tests/benchmark/test_worst_*.py for now in case there are complicate rebase process during PR review
Helper functions/constant values/Enum are grouped into helper file.

Logic Changes
test_worst_precompile_fixed_cost is now split into separate tests for each precompile. Input setup is simplified by replacing multiple MSTOREㄋ with a single CALLDATACOPY that copies the pre-encoded transaction calldata into memory.

Below is the renaming table for reviewer to verify there is no missing benchmark during tests porting.

test_worst_compute.py

Old Test Name	New Test Name	New Location
`test_worst_zero_param`	`test_block_ops`	`instruction/block.py`
`test_worst_zero_param`	`test_environment_ops`	`instruction/environment.py`
`test_worst_zero_param`	`test_gas_op`	`instruction/control_flow.py`
`test_worst_calldatasize`	`test_calldatasize`	`instruction/environment.py`
`test_worst_callvalue`	`test_callvalue`	`instruction/environment.py`
`test_worst_returndatasize_nonzero`	`test_returndatasize_nonzero`	`instruction/environment.py`
`test_worst_returndatasize_zero`	`test_returndatasize_zero`	`instruction/environment.py`
`test_worst_msize`	`test_msize`	`instruction/memory.py`
`test_worst_keccak`	`test_keccak`	`instruction/keccak.py`
`test_worst_precompile_only_data_input`	`test_identity`	`precompile/identity.py`
`test_worst_precompile_only_data_input`	`test_sha256`	`precompile/sha256.py`
`test_worst_precompile_only_data_input`	`test_ripemd160`	`precompile/ripemd160.py`
`test_worst_modexp`	`test_worst_modexp`	`precompile/modexp.py`
`test_worst_precompile_fixed_cost`	`test_alt_bn128`	`precompile/alt_bn128.py`
`test_worst_precompile_fixed_cost`	`test_ecrecover`	`precompile/ecrecover.py`
`test_worst_precompile_fixed_cost`	`test_blake2f`	`precompile/blake2f.py`
`test_worst_precompile_fixed_cost`	`test_point_evaluation`	`precompile/point_evaluation.py`
`test_worst_precompile_fixed_cost`	`test_bls12_381`	`precompile/bls12_381.py`
`test_worst_precompile_fixed_cost`	`test_p256verify`	`precompile/p256verify.py`
`test_worst_jumps`	`test_jumps`	`instruction/control_flow.py`
`test_worst_jumpi_fallthrough`	`test_jumpi_fallthrough`	`instruction/control_flow.py`
`test_worst_jumpis`	`test_jumpis`	`instruction/control_flow.py`
`test_worst_jumpdests`	`test_jumpdests`	`instruction/control_flow.py`
`test_worst_binop_simple`	`test_comparison`	`instruction/comparison.py`
`test_worst_binop_simple`	`test_arithmetic`	`instruction/arithmetic.py`
`test_worst_binop_simple`	`test_bitwise`	`instruction/bitwise.py`
`test_worst_unop`	`test_iszero`	`instruction/comparison.py`
`test_worst_unop`	`test_not_op`	`instruction/bitwise.py`
`test_worst_tload`	`test_tload`	`instruction/storage.py`
`test_worst_tstore`	`test_tstore`	`instruction/storage.py`
`test_worst_shifts`	`test_shifts`	`instruction/bitwise.py`
`test_worst_blobhash`	`test_blobhash`	`instruction/environment.py`
`test_worst_mod`	`test_mod`	`instruction/arithmetic.py`
`test_worst_memory_access`	`test_memory_access`	`instruction/memory.py`
`test_worst_modarith`	`test_mod_arithmetic`	`instruction/arithmetic.py`
`test_empty_block`	`test_empty_block`	`scenarios/diff_tx_types.py`
`test_amortized_bn128_pairings`	`test_bn128_pairings_amortized`	`precompile/alt_bn128.py`
`test_worst_calldataload`	`test_calldataload`	`instruction/environment.py`
`test_worst_swap`	`test_swap`	`instruction/stack.py`
`test_worst_dup`	`test_dup`	`instruction/stack.py`
`test_worst_push`	`test_push`	`instruction/stack.py`
`test_worst_return_revert`	`test_return_revert`	`instruction/system.py`
`test_worst_clz_same_input`	`test_clz_same`	`instruction/bitwise.py`
`test_worst_clz_diff_input`	`test_clz_diff`	`instruction/bitwise.py`

test_worst_opcode.py

Old Test Name	New Test Name	New Location
`test_worst_log_opcodes`	`test_log`	`instruction/log.py`

test_worst_memory.py

Old Test Name	New Test Name	New Location
`test_worst_calldatacopy`	`test_calldatacopy`	`instruction/environment.py`
`test_worst_codecopy`	`test_codecopy`	`instruction/environment.py`
`test_worst_returndatacopy`	`test_returndatacopy`	`instruction/environment.py`
`test_worst_mcopy`	`test_worst_mcopy`	`instruction/memory.py`

test_worst_blocks.py

Old Test Name	New Test Name	New Location
`test_block_full_of_ether_transfers`	`test_block_full_of_ether_transfers`	`scenarios/diff_tx_types.py`
`test_block_full_data`	`test_block_full_data`	`scenarios/diff_tx_types.py`
`test_block_full_access_list_and_data`	`test_block_full_access_list_and_data`	`scenarios/diff_tx_types.py`
`test_worst_case_auth_block`	`test_worst_case_auth_block`	`scenarios/diff_tx_types.py`

test_worst_stateful_opcodes.py

Old Test Name	New Test Name	New Location
`test_worst_address_state_cold`	`test_worst_address_state_cold`	`instruction/environment.py`
`test_worst_address_state_warm`	`test_worst_address_state_warm`	`scenarios/mix_operations.py`
`test_worst_storage_access_cold`	`test_storage_access_cold`	`instruction/storage.py`
`test_worst_storage_access_warm`	`test_storage_access_warm`	`instruction/storage.py`
`test_worst_blockhash`	`test_blockhash`	`instruction/block.py`
`test_worst_selfbalance`	`test_selfbalance`	`instruction/environment.py`
`test_worst_extcodecopy_warm`	`test_extcodecopy_warm`	`instruction/environment.py`
`test_worst_selfdestruct_existing`	`test_selfdestruct_existing`	`instruction/system.py`
`test_worst_selfdestruct_created`	`test_selfdestruct_created`	`instruction/system.py`
`test_worst_selfdestruct_initcode`	`test_selfdestruct_initcode`	`instruction/system.py`

test_worst_bytecode.py

Old Test Name	New Test Name	New Location
`test_worst_bytecode_single_opcode`	`test_extcode_ops`	`instruction/environment.py`
`test_worst_bytecode_single_opcode`	`test_xcall`	`instruction/system.py`
`test_worst_initcode_jumpdest_analysis`	`test_jumpdest_analysis`	`scenarios/mix_operations.py`
`test_worst_create`	`test_create`	`instruction/system.py`
`test_worst_creates_collisions`	`test_creates_collisions`	`instruction/system.py`

🔗 Related Issues or PRs

Issue: #1570

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

LouisTsai-Csie · 2025-10-28T13:49:32Z

I have not yet removed the old test files, this is to avoid future merge conflict overhead. I could delete them really quick after not much changes required for this PR.

danceratopz

Thanks @LouisTsai-Csie, the new structure looks great!

Unfortunately, we'll need to refactor the imports from the EEST libraries to match the changes in:

#1654

I just approved the workflow (sorry about that this is disabled for you right now, you'll get collab rights to the this repo soon, it's been requested!). As expected, the workflow has failed due to the change in the execution_testing package layout.

tests/benchmark/compute/helpers.py

danceratopz

Thanks for fixing up the imports @LouisTsai-Csie! I checked them all manually to see if we need to improve execution_testing and was happy to see that all the objects are already available as a direct import from the package 👍

In my initial review, I completely missed that the new "test modules" are no longer prefixed with test_. This looks very clean! But then I wondered why it worked, as pytest (by default) only collects modules that use the test_ prefix. It turns out that it works, because we have:

execution-specs/packages/testing/src/execution_testing/cli/pytest_commands/pytest_ini_files/pytest-fill.ini

Line 4 in 15b5cfd

python_files = *.py

I reverted this config in #1710 to explicitly use the default value of test_*.py:

#1710

To make sure that test collection will still work with the naming convention in this PR and #1710, I pushed one small conftest config that will future-proof this PR for that change!

Ready to merge from my side, except for the remaining tests/benchmark/test_worst_* modules that still need to be ported.

LouisTsai-Csie · 2025-10-30T08:18:13Z

Hi @danceratopz , all the test_worst_*.py files are ported, so the last commit remove these files! PTAL

danceratopz

As per team meeting, please rename any python module that contains a "test" to be prefixed test_*.py. Folders are ok, as-is.

marioevz · 2025-10-30T16:45:56Z

I pushed a fix to the naming of one fixture, but I'm still reviewing.

marioevz · 2025-10-30T18:27:19Z

Pushed change in 5f001b6.

It splits compute/instruction/test_environment.py into test_block_context.py and test_tx_context.py files.

It further adds test_call_context.py and test_account_query.py to separate opcodes that perform such operations.

fselmo

lgtm... love the refactor 👌🏼

codecov-commenter · 2025-10-31T00:03:01Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.07%. Comparing base (15b5cfd) to head (5f001b6).
⚠️ Report is 11 commits behind head on forks/osaka.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@             Coverage Diff              @@
##           forks/osaka    #1681   +/-   ##
============================================
  Coverage        86.07%   86.07%           
============================================
  Files              743      743           
  Lines            44078    44078           
  Branches          3894     3894           
============================================
  Hits             37938    37938           
  Misses            5659     5659           
  Partials           481      481

Flag	Coverage Δ
unittests	`86.07% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

LouisTsai-Csie force-pushed the benchmark-folder-structure branch from a5b2dee to 26d7e56 Compare October 27, 2025 05:07

LouisTsai-Csie marked this pull request as ready for review October 27, 2025 05:26

danceratopz added the A-test-benchmark Area: Tests Benchmarks—Performance measurement (eg. `tests/benchmark/*`, `p/t/s/e/benchmark/*`) label Oct 27, 2025

LouisTsai-Csie force-pushed the benchmark-folder-structure branch from 26d7e56 to 08e6754 Compare October 27, 2025 14:55

spencer-tb self-requested a review October 28, 2025 14:42

danceratopz reviewed Oct 28, 2025

View reviewed changes

tests/benchmark/compute/helpers.py Outdated Show resolved Hide resolved

LouisTsai-Csie force-pushed the benchmark-folder-structure branch 2 times, most recently from c686a0d to 8efe2d3 Compare October 29, 2025 07:41

LouisTsai-Csie self-assigned this Oct 29, 2025

danceratopz reviewed Oct 29, 2025

View reviewed changes

LouisTsai-Csie force-pushed the benchmark-folder-structure branch from 688dce6 to fa51f56 Compare October 30, 2025 08:16

LouisTsai-Csie requested a review from danceratopz October 30, 2025 08:18

danceratopz reviewed Oct 30, 2025

View reviewed changes

LouisTsai-Csie added 9 commits October 30, 2025 22:16

refactor: compute instruction folder

f6291c9

refactor: compute precompile folder

76ca86a

refactor: compute scenario folder

58cd9d0

feat: add general benchmark helper

b8473a9

refactor: update import path

97e96bb

fix: resolve mypy issue

d6f4397

fix: add missing variable

ac2613c

chore: remove ported files

c433c8c

refactor: rename the test name with prefix

b63cde3

LouisTsai-Csie force-pushed the benchmark-folder-structure branch from fa51f56 to b63cde3 Compare October 30, 2025 14:20

fix(tests): Fixture name

ff29967

refactor(tests): Split test_environment.py

5f001b6

fselmo approved these changes Oct 30, 2025

View reviewed changes

marioevz changed the title ~~feat(benchmark): general folder restructuring~~ refactor(tests): General benchmark folder restructuring Oct 30, 2025

marioevz merged commit 533299f into ethereum:forks/osaka Oct 31, 2025
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(tests): General benchmark folder restructuring #1681

refactor(tests): General benchmark folder restructuring #1681

Uh oh!

LouisTsai-Csie commented Oct 25, 2025 •

edited

Loading

Uh oh!

LouisTsai-Csie commented Oct 28, 2025

Uh oh!

danceratopz left a comment

Uh oh!

Uh oh!

danceratopz left a comment

Uh oh!

LouisTsai-Csie commented Oct 30, 2025

Uh oh!

danceratopz left a comment

Uh oh!

marioevz commented Oct 30, 2025

Uh oh!

marioevz commented Oct 30, 2025

Uh oh!

fselmo left a comment

Uh oh!

codecov-commenter commented Oct 31, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

refactor(tests): General benchmark folder restructuring #1681

refactor(tests): General benchmark folder restructuring #1681

Uh oh!

Conversation

LouisTsai-Csie commented Oct 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🗒️ Description

test_worst_compute.py

test_worst_opcode.py

test_worst_memory.py

test_worst_blocks.py

test_worst_stateful_opcodes.py

test_worst_bytecode.py

🔗 Related Issues or PRs

✅ Checklist

Cute Animal Picture

Uh oh!

LouisTsai-Csie commented Oct 28, 2025

Uh oh!

danceratopz left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

danceratopz left a comment

Choose a reason for hiding this comment

Uh oh!

LouisTsai-Csie commented Oct 30, 2025

Uh oh!

danceratopz left a comment

Choose a reason for hiding this comment

Uh oh!

marioevz commented Oct 30, 2025

Uh oh!

marioevz commented Oct 30, 2025

Uh oh!

fselmo left a comment

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Oct 31, 2025

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

LouisTsai-Csie commented Oct 25, 2025 •

edited

Loading