refactor(tests): Use pytest collection to load JSON fixtures #1666

marioevz · 2025-10-23T18:31:01Z

🗒️ Description

This PR refactors the blockchain and state test infrastructure to leverage pytest's native collection mechanism via pytest_collect_file, eliminating redundant JSON file reads and improving test execution efficiency.

Key Improvements

Native pytest Collection

Implements pytest_collect_file hook to collect tests directly from JSON files during pytest's discovery phase
Each JSON file is now read exactly once during collection, rather than being read multiple times during parameterization and execution
Test fixtures are created as pytest Item objects (e.g., BlockchainTestFixture, StateTestFixture) that encapsulate all test data
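
A minimal sketch of the collection pattern described above, assuming pytest 7 or later. The names `JsonFile`, `JsonCase`, and `iter_cases` are hypothetical stand-ins, not the PR's `FixturesFile`, `BlockchainTestFixture`, or `StateTestFixture`, and the `runtest()` body is a placeholder check rather than real test execution.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Tuple

import pytest


def iter_cases(path: Path) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Read the JSON file once and yield (test name, test data) pairs."""
    with path.open() as f:
        data = json.load(f)
    yield from data.items()


class JsonCase(pytest.Item):
    """One collected test case; it keeps its own data for execution."""

    def __init__(self, *, case_data: Dict[str, Any], **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.case_data = case_data

    def runtest(self) -> None:
        # Placeholder assertion (assumption); real fixtures would run the
        # blockchain or state test against self.case_data here.
        assert "post" in self.case_data

    def reportinfo(self):
        return self.path, 0, f"json case: {self.name}"


class JsonFile(pytest.File):
    """Collector for a single JSON file."""

    def collect(self):
        # The file is opened only here, during collection.
        for name, case_data in iter_cases(self.path):
            yield JsonCase.from_parent(self, name=name, case_data=case_data)


def pytest_collect_file(file_path: Path, parent: pytest.Collector):
    """Hook called by pytest for every file it considers during discovery."""
    if file_path.suffix == ".json":
        return JsonFile.from_parent(parent, path=file_path)
    return None
```

Placed in a `conftest.py`, a hook like this would make pytest report one item per top-level key of each JSON file it discovers.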

Eliminated Redundant File I/O

Before: JSON files were read during test parameterization (fetch_blockchain_tests) and again during test execution (run_blockchain_st_test)
After: JSON files are read once in FixturesFile.collect(), and test data is stored in fixture objects for later execution
Removes the intermediate dictionaries of file paths that caused the same files to be read repeatedly

Cleaner Architecture

Introduces Fixture base class for shared fixture behavior
Test execution logic moved into runtest() methods of fixture classes
Test metadata (markers, fork info) configured during collection rather than parameterization
Eliminates the need for custom idfn functions - pytest handles naming automatically
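
The architecture above can be sketched roughly as follows. This is a simplified illustration using plain classes rather than pytest `Item` subclasses: the class names mirror those mentioned in this PR, but the constructor, the `is_format` helper, the `build_fixture` function, and the keys used to tell the two formats apart are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Type


class Fixture(ABC):
    """Shared behavior: hold the parsed test data and know how to run it."""

    def __init__(self, test_key: str, test_dict: Dict[str, Any]) -> None:
        self.test_key = test_key
        self.test_dict = test_dict

    @classmethod
    @abstractmethod
    def is_format(cls, test_dict: Dict[str, Any]) -> bool:
        """Return True if test_dict looks like this fixture type."""

    @abstractmethod
    def runtest(self) -> None:
        """Execute the test using the data stored at collection time."""


class BlockchainTestFixture(Fixture):
    @classmethod
    def is_format(cls, test_dict: Dict[str, Any]) -> bool:
        return "blocks" in test_dict  # assumed distinguishing key

    def runtest(self) -> None:
        pass  # placeholder for block processing


class StateTestFixture(Fixture):
    @classmethod
    def is_format(cls, test_dict: Dict[str, Any]) -> bool:
        return "transaction" in test_dict  # assumed distinguishing key

    def runtest(self) -> None:
        pass  # placeholder for state transition checks


ALL_FIXTURE_TYPES: List[Type[Fixture]] = [
    BlockchainTestFixture,
    StateTestFixture,
]


def build_fixture(test_key: str, test_dict: Dict[str, Any]) -> Fixture:
    """Pick the first fixture type that recognizes the test data."""
    for fixture_type in ALL_FIXTURE_TYPES:
        if fixture_type.is_format(test_dict):
            return fixture_type(test_key, test_dict)
    raise ValueError(f"unknown fixture format for {test_key!r}")
```

With this shape, a collector only needs to call `build_fixture` once per test entry; execution later goes through `runtest()` without touching the file again.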

Performance Impact

This refactoring significantly reduces I/O overhead for large test suites where the same JSON files contain multiple test cases across different forks.

Open Issues

Some tests are still failing and need to be investigated. For now, I'd like to start running this in CI to see how much it improves execution speed.

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx --with=tox-uv tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture

SamWilsn · 2025-10-23T19:58:30Z

tests/json_infra/conftest.py

+            # Remove any python files in the downloaded files to avoid
+            # importing them.
+            for python_file in glob(
+                os.path.join(fixture_path, "**/*.py"), recursive=True
+            ):
+                try:
+                    os.unlink(python_file)
+                except FileNotFoundError:
+                    # Not breaking error, another process deleted it first
+                    pass
+


This feels... strange? I can't quite put my finger on why.

Like, why do the fixtures contain python files at all? Is there another way we could accomplish the same thing (like excluding a directory)?

I dunno, this just triggers my spidey sense 🤣

This is the culprit: https://github.com/ethereum/legacytests/tree/1f581b8ccdc4c63acf5f2c5c1b155c690c32a8eb/src/LegacyTests/Cancun/GeneralStateTestsFiller/Pyspecs

Checking out ethereum/tests at this commit, when submodules are included, results in these python files being checked out too, and when collecting ./tests/json_infra/fixtures for JSON files, pytest tries to collect these files too.

Don't we exclude that directory on the command line?

I removed that because with this approach the files are collected directly by pytest, as opposed to doing a glob in the test itself.
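
One possible alternative to deleting the files, sketched here as an assumption and not something this PR does, is pytest's `pytest_ignore_collect` hook in a `conftest.py`. The helper name `is_downloaded_python_file` and the `FIXTURES_DIR` constant are hypothetical; the directory is the one named in this thread. Whether this fully avoids the import problem in this repository is untested.

```python
from pathlib import Path
from typing import Optional

# Directory mentioned in the thread; adjust to the real layout.
FIXTURES_DIR = Path("tests/json_infra/fixtures")


def is_downloaded_python_file(
    collection_path: Path, fixtures_dir: Path = FIXTURES_DIR
) -> bool:
    """Return True for .py files located anywhere under fixtures_dir."""
    if collection_path.suffix != ".py":
        return False
    try:
        collection_path.resolve().relative_to(fixtures_dir.resolve())
    except ValueError:
        # Not inside the fixtures directory.
        return False
    return True


def pytest_ignore_collect(collection_path: Path, config) -> Optional[bool]:
    """Skip downloaded Python files; defer to other plugins otherwise."""
    if is_downloaded_python_file(collection_path):
        return True
    return None
```

Returning `None` instead of `False` leaves the decision to pytest's defaults and to any other plugin implementing the same hook.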

SamWilsn · 2025-10-23T20:00:36Z

tests/json_infra/helpers/__init__.py

+ALL_FIXTURE_TYPES.append(BlockchainTestFixture)
+ALL_FIXTURE_TYPES.append(StateTestFixture)


Do these get executed when importing only, for example, .load_state_tests? From my limited knowledge of Python's import machinery, I would guess yes, but I'm just checking.

Yes, that's correct: it gets executed only when importing from .helpers. If we were to, for example, import directly from .helpers.fixtures, this logic would not be executed and ALL_FIXTURE_TYPES would be empty, so it is indeed a bit brittle, to be honest.

Oh really? I thought parent modules were implicitly imported. I'm glad I checked!
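
The question in this thread can be checked with a throwaway package. The sketch below builds a hypothetical `demo_helpers` package in a temporary directory whose `__init__.py` registers a class, then imports only the `demo_helpers.fixtures` submodule. In standard Python the import system imports a parent package before any of its submodules, so in this sketch the `__init__.py` body runs and the registry ends up populated. Whether the same holds for the layout in this PR depends on details not shown here, such as import cycles.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import List


def registry_after_submodule_import() -> List[str]:
    """Import only demo_helpers.fixtures and report the registry contents."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_helpers"  # hypothetical package name
        pkg.mkdir()
        (pkg / "fixtures.py").write_text(
            "ALL_FIXTURE_TYPES = []\n"
            "class BlockchainTestFixture:\n"
            "    pass\n"
        )
        (pkg / "__init__.py").write_text(
            "from .fixtures import ALL_FIXTURE_TYPES, BlockchainTestFixture\n"
            "ALL_FIXTURE_TYPES.append(BlockchainTestFixture)\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # Only the submodule is requested; the parent package is
            # imported first as part of resolving the dotted name.
            fixtures = importlib.import_module("demo_helpers.fixtures")
            return [t.__name__ for t in fixtures.ALL_FIXTURE_TYPES]
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            for name in list(sys.modules):
                if name.split(".")[0] == "demo_helpers":
                    del sys.modules[name]


registered = registry_after_submodule_import()
```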

SamWilsn · 2025-10-23T20:02:37Z

tests/json_infra/helpers/exceptional_test_patterns.py

    big_memory: Tuple[Pattern[str], ...]


+@lru_cache


How often is this called to require an lru_cache? O.o

Depending on when the cache is populated (in worker vs. in master), using lru_cache can explode memory: each worker has its own cache.

I removed it thinking it might reduce the memory footprint, and it did, by about half a GB. The run still consumes 30 GB or more, though, because all fixtures are held in memory while running.
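
To illustrate the memory concern, here is a small sketch (assuming CPython, with a hypothetical `ParsedFixture` class and `load_fixture` function) showing that `functools.lru_cache` keeps every cached return value alive until the entry is evicted or the cache is cleared. Each pytest-xdist worker is a separate process, so each would hold its own copy of such a cache.

```python
import gc
import weakref
from functools import lru_cache


class ParsedFixture:
    """Stand-in for a large parsed fixture (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name


@lru_cache(maxsize=None)  # unbounded: nothing is ever evicted
def load_fixture(name: str) -> ParsedFixture:
    return ParsedFixture(name)


fixture = load_fixture("big.json")
ref = weakref.ref(fixture)

# Drop our own reference; the cache still holds one.
del fixture
gc.collect()
alive_while_cached = ref() is not None

# Clearing the cache releases the last reference.
load_fixture.cache_clear()
gc.collect()
alive_after_clear = ref() is not None
```

Passing a small `maxsize` instead of `None` bounds how many results stay resident at once.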

SamWilsn · 2025-10-24T18:03:53Z

I was thinking briefly about this. I also know next to nothing about pytest, so this might not make any sense at all, but...

What if we use an LRU cache for the JSON files (one per worker), and loadgroup all the tests that come from the same file?

So you'd read once during collection, find all the tests and group them by file, then while running the tests you minimize the number of times you need to re-read the same file.
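
A rough sketch of that idea, with hypothetical names (`load_file`, `count_reads`, `demo`): a one-entry LRU cache per process avoids re-reading a file only if tests from the same file run back to back. With pytest-xdist, that ordering could be encouraged by marking items with `pytest.mark.xdist_group` and running with `--dist loadgroup`; the sketch below only simulates the ordering and does not involve xdist.

```python
import json
import tempfile
from collections import Counter
from functools import lru_cache
from pathlib import Path
from typing import Iterable, Tuple

READS: Counter = Counter()


@lru_cache(maxsize=1)  # keep only the most recently used file in memory
def load_file(path: str) -> dict:
    READS[path] += 1
    with open(path) as f:
        return json.load(f)


def count_reads(cases: Iterable[Tuple[str, str]]) -> int:
    """Run (path, test key) pairs in order and count actual file reads."""
    load_file.cache_clear()
    READS.clear()
    for path, key in cases:
        _ = load_file(path)[key]  # look up the test data
    return sum(READS.values())


def demo() -> Tuple[int, int]:
    with tempfile.TemporaryDirectory() as tmp:
        a = str(Path(tmp) / "a.json")
        b = str(Path(tmp) / "b.json")
        for path in (a, b):
            Path(path).write_text(json.dumps({"t1": {}, "t2": {}}))
        grouped = [(a, "t1"), (a, "t2"), (b, "t1"), (b, "t2")]
        interleaved = [(a, "t1"), (b, "t1"), (a, "t2"), (b, "t2")]
        return count_reads(grouped), count_reads(interleaved)


grouped_reads, interleaved_reads = demo()
```

In this simulation the grouped order reads each file once, while the interleaved order evicts and re-reads a file on every test.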

marioevz force-pushed the refactor-json-infra branch from d18197e to 8503878 Compare October 23, 2025 18:41

SamWilsn reviewed Oct 23, 2025

This was referenced Oct 24, 2025

Optimize the json_infra tests #1605

Closed

Investigate and optimize running filled tests #1020

Open

marioevz added 11 commits October 31, 2025 22:45

refactor(tests): Refactor json_infra using pytest_collect_file

a149167

fix(tests): json collecting

419a1c1

fix(tests): blockchain test execution

8fd5382

fix(tests): blockchain test execution

fcb7ace

refactor(tests): Refactor types in json_infra

e64c6f4

fix(tests): json_infra, imports, parse exceptions in some tests

82d71cc

refactor(tests): move some definitions

7abe78c

fix(tox.ini): Remove --ignore-glob

d9f495f

fix(tests): workaround for FileNotFoundError

0fb8a26

fix(tests): revamp cache

057fe10

fix(tests): Don't cache fixtures Try to implement cache Fix caching feat(tests): Manage cache during execution

fix(tox): Use --dist=loadfile

c6408c9

marioevz force-pushed the refactor-json-infra branch from 53e92c6 to c6408c9 Compare November 1, 2025 00:14
