Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add migration tests #342

Closed
wants to merge 3 commits into from
Closed

Add migration tests #342

wants to merge 3 commits into from

Conversation

litt3
Copy link
Collaborator

@litt3 litt3 commented Mar 19, 2025

Closes DAINT-376

  • Uses an atomic bool for v2 dispersal, so it can be modified in tests
  • Implements some simple migration tests

@litt3 litt3 self-assigned this Mar 19, 2025
@litt3 litt3 changed the base branch from main to config-cleanup March 19, 2025 18:21
@litt3 litt3 mentioned this pull request Mar 19, 2025
Copy link
Collaborator

@samlaf samlaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass on architecture. Haven't looked at tests yet. We can chat offline regarding:
image

Comment on lines +178 to 179
if svr.sm.IsV2Enabled() {
commitmentMeta.Version = commitments.CertV1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug: should this set commits.CertV2?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No- this is how the cert versions are defined

const (
	// EigenDA V1
	CertV0 EigenDACommitmentType = iota
	// EigenDA V2
	CertV1
)

Not crazy about this, I've just been carrying it forward

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh.... kek. Should we rename? CertV0 makes no sense. We can still keep the iota numbering starting from 0. But at least the name will make sense. @ethenotethan thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've no strong feelings for changing vs not. I've been generally trying (maybe failing) to keep diffs as small as possible, which is why I left it thus far

server/server.go Outdated
@@ -24,7 +24,6 @@ var (
)

type Server struct {
cfg config.ServerConfig
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this being removed here? was it just not used anymore? why?

Also right now we have like config/server_config under the config/ dir, but the metrics config is kept under metrics/. Seems a bit nonuniform.

I think I personally prefer keeping configs close to the structs they are used in. What was your motivation for moving configs to the high-level config/ dir?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving some configs to the config/ package was a consequence of how flags are being used

Since the flags for a given category are defined in the various packages, I was struggling with circular imports. It's possible there is a better package structure out there, I'm just not sure what it would be

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Prob just need to pop the config and flags in a subpackage of each respective package. Like server/config/config.go and server/flags/flags.go. I think that would resolve it. But also totally fine to leave as is.

Copy link
Collaborator Author

@litt3 litt3 Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave as-is for now, and keep in mind potential future proxy package structure improvements (maybe as part of merging proxy into core)

Copy link
Collaborator

@samlaf samlaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests review

Comment on lines +210 to +220
// TODO: @ethen can you help me formulate a description of what this sequence of calls is doing?
func exerciseGenericCommitments(
t *testing.T,
ot actions.StatefulTesting,
proxyTS testutils.TestSuite,
optimism *L2AltDA,
) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's try to refactor this function into smaller understandable parts. For example feels like we should be able to extract a "buildL1Block" function, but right now block1 and block2 are being built differently. Probably can refactor and extract out the FinalizationSignal in block2.

Also I personally think the require() should be moved out of this function and into tests. I prefer to see all the require statements when looking at a test. Helps see with a single look what exactly is being tested.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9bfbda9

Made a few tweaks, and left a note to improve further in the future

@litt3 litt3 requested a review from ethenotethan March 19, 2025 19:44
@litt3 litt3 marked this pull request as ready for review March 19, 2025 19:44
@litt3 litt3 requested a review from anupsv March 19, 2025 21:26
defer shutDown()

// turn off v2 dispersal, so that we exercise v1, i.e. "pre migration"
proxyTS.Server.SetDisperseV2(false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely out of scope for this PR but would be nice to remove this Boolean longterm in favor of some 'ProtocolVersion' enum - pseudo equivalent to how node software like geth maintains hard fork versioning

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldnt that just require us to set an arbitrary timestamp at which the setDisperseV2 should happen?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldnt that just require us to set an arbitrary timestamp at which the setDisperseV2 should happen?

hmm you could introduce scheduling but feel it'd be unnecessary since the rollup itself is the control plane. Maybe for migrating arbitrum we could expose some admin API that the nodes would call when an ArbSysConfig update happens. This might be over-engineering though but definitely something to consider in the future if we introduce versioning with actual breaking changes where proxy has to have stricter switching between version subsections (e.g, if we change how secondary storage keys are generated). This opens a greater question though if there should be some delineation between the DA byte version and the actual EigenDA protocol version.

Currently we're mapping da byte 1:1 w/ EigenDA protocol but we need to bump this byte in the future irrespective of an EigenDA upgrade!

// turn off v2 dispersal, so that we exercise v1, i.e. "pre migration"
proxyTS.Server.SetDisperseV2(false)

ot := actions.NewDefaultTesting(t)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we actually asserting that proxy is dual processing both V1 and V2 if we're only verifying dispersal/retrieval based on a generalized metrics vector. Would there be merit in extending the telemetry to also support a versioning field?

Copy link
Collaborator

@samlaf samlaf Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should def separate out the prometheus metrics regardless of whether its used in tests or not

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue for guaranteed migration test coverage its absolutely necessary

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the assertions here just aren't strong enough.

I think it makes sense to create a follow-up PR, I created an issue

store/manager.go Outdated
writeV2 bool // write blobs to EigenDAV2 backend
eigenda common.GeneratedKeyStore // v0 da commitment version
eigendaV2 common.GeneratedKeyStore // v1 da commitment version
disperseV2 *atomic.Bool // disperse blobs to EigenDAV2 backend
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - does this actually need to be atomic? Iiuc there's isn't a current test case which distributes client load in parallel when switching the version flag

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be atomic to prevent race condition from proxy reading it, while test writing to it. Prob innocuous race condition but still better to be race free

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it atomic also keeps the door open for exposing the flag to users, if there would ever be a use case. Maybe not likely, but the atomicity cost is negligible at the rates we'll see here.

@@ -215,7 +231,9 @@ func testOptimismGenericCommitment(t *testing.T, v2Enabled bool) {

optimism.sequencer.ActL2PipelineFull(ot)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this enact the entire derivation pipeline using the first L1 batch or does it just catch up from a previous batch state checkpoint?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I just utilized what was already there for this test. Sam and I tried to go through and understand what precisely was happening in this test, but I didn't feel like we really arrived at a deep level of understanding.

Comment on lines 274 to 279
optimism := NewL2AltDA(ot, proxyTS.Address(), true)
exerciseGenericCommitments(t, ot, optimism)

// turn on v2 dispersal
proxyTS.Server.SetDisperseV2(true)
exerciseGenericCommitments(t, ot, optimism)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this testing that a migration can be applied against OP which re-derives from an inbox that contains both V1 and V2 certs or is it just applying the migration to do single step update/derive?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, I'm not super confident in my understanding of what these tests are doing.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* fix: Standard output check

* fix: Standard output check - make

* fix: Standard output check - ignore version

* fix: Standard output check - update sed cmd for ubuntu compatibility

* fix: Standard output check - update sed cmd for ubuntu compatibility
@litt3 litt3 mentioned this pull request Mar 24, 2025
litt3 and others added 2 commits March 24, 2025 12:19
* Do some test cleanup

Signed-off-by: litt3 <[email protected]>

* Tweak more configurations

Signed-off-by: litt3 <[email protected]>

* Tweak log levels

Signed-off-by: litt3 <[email protected]>

* Add makefile doc

Signed-off-by: litt3 <[email protected]>

* Use better flag name

Signed-off-by: litt3 <[email protected]>

* Fix test type

Signed-off-by: litt3 <[email protected]>

* Add test matrix

Signed-off-by: litt3 <[email protected]>

* Tweak per pr yaml

Signed-off-by: litt3 <[email protected]>

* Remove irrelevant vars from settings.json

Signed-off-by: litt3 <[email protected]>

* Fix fuzz description

Signed-off-by: litt3 <[email protected]>

* Remove coverage from CI

Signed-off-by: litt3 <[email protected]>

* Fix test timeout

Signed-off-by: litt3 <[email protected]>

* Don't bother excluding fuzz tests from unit tests

Signed-off-by: litt3 <[email protected]>

* Decrease boilerplate

Signed-off-by: litt3 <[email protected]>

* Rename Environment to Backend

Signed-off-by: litt3 <[email protected]>

* Remove test matrix util

Signed-off-by: litt3 <[email protected]>

* Split benchmark test

Signed-off-by: litt3 <[email protected]>

* Try alternate test config strategy

Signed-off-by: litt3 <[email protected]>

* Fix failing tests

Signed-off-by: litt3 <[email protected]>

* test(fuzz): fix fuzz tests (#339)

Cleaned up the fuzz tests to actually do something
Reduced seed corpus to 3 meaningful inputs
Removed cluttering log outputs so we can actually see the fuzzer output as its finding new interesting cases

* Draft config cleanup

Signed-off-by: litt3 <[email protected]>

* Clean up

Signed-off-by: litt3 <[email protected]>

* Configure secrets correctly

Signed-off-by: litt3 <[email protected]>

* Iterate

Signed-off-by: litt3 <[email protected]>

* Iterate again

Signed-off-by: litt3 <[email protected]>

* Get relay registry address correctly

Signed-off-by: litt3 <[email protected]>

* Share code pathways between tests and production

Signed-off-by: litt3 <[email protected]>

* Clean up

Signed-off-by: litt3 <[email protected]>

* Fix docker

Signed-off-by: litt3 <[email protected]>

* Redact additional sensitive fields

Signed-off-by: litt3 <[email protected]>

* Disable verification for memstore mode

Signed-off-by: litt3 <[email protected]>

* Further simplify test configuration

Signed-off-by: litt3 <[email protected]>

* Fix test path

Signed-off-by: litt3 <[email protected]>

* Reenable test

Signed-off-by: litt3 <[email protected]>

* Fix more test paths

Signed-off-by: litt3 <[email protected]>

* Fix config tests

Signed-off-by: litt3 <[email protected]>

* Increase holesky e2e timeout

Signed-off-by: litt3 <[email protected]>

* Support preprod tests

Signed-off-by: litt3 <[email protected]>

* Fix yml

Signed-off-by: litt3 <[email protected]>

* Clean up

Signed-off-by: litt3 <[email protected]>

* Properly configure g2

Signed-off-by: litt3 <[email protected]>

* Checkout with lfs

Signed-off-by: litt3 <[email protected]>

* Try to get SRS Order right

Signed-off-by: litt3 <[email protected]>

* Revert SRS changes

Signed-off-by: litt3 <[email protected]>

* Temporarily disable dispersal verification

Signed-off-by: litt3 <[email protected]>

* Try to fix fuzz test

Signed-off-by: litt3 <[email protected]>

* Revert unnecessary changes

Signed-off-by: litt3 <[email protected]>

* Use experimental g2 point loading

Signed-off-by: litt3 <[email protected]>

* Revert change to G1

Signed-off-by: litt3 <[email protected]>

* Remove power of 2 g2 points

Signed-off-by: litt3 <[email protected]>

* Add deprecated flag

Signed-off-by: litt3 <[email protected]>

* Track deprecated flags

Signed-off-by: litt3 <[email protected]>

* Remove unnecessary edge case

Signed-off-by: litt3 <[email protected]>

* Fix broken V2 verify

Signed-off-by: litt3 <[email protected]>

* Reenable client side proving

Signed-off-by: litt3 <[email protected]>

* Declutter make tests

Signed-off-by: litt3 <[email protected]>

* Fix the test filter regex

Signed-off-by: litt3 <[email protected]>

* Update core version

Signed-off-by: litt3 <[email protected]>

* Try again to get the correct make command

Signed-off-by: litt3 <[email protected]>

* Prevent grep from buffering

Signed-off-by: litt3 <[email protected]>

* Use pipefail

Signed-off-by: litt3 <[email protected]>

* Give up on filtering

Signed-off-by: litt3 <[email protected]>

* Fix error message

Signed-off-by: litt3 <[email protected]>

* Add clarifying comment to v2 config struct

Signed-off-by: litt3 <[email protected]>

* Rename to MetricsConfig

Signed-off-by: litt3 <[email protected]>

* Fix terrible misnaming

Signed-off-by: litt3 <[email protected]>

* Simplify steps for overriding test flag configs

Signed-off-by: litt3 <[email protected]>

* Remove unused member var

Signed-off-by: litt3 <[email protected]>

* Fix small typos

Signed-off-by: litt3 <[email protected]>

* Make backend error message better

Signed-off-by: litt3 <[email protected]>

* Improve explanation of kzg behavior

Signed-off-by: litt3 <[email protected]>

* Improve v2 terminology and consistency

Signed-off-by: litt3 <[email protected]>

* Improve config ordering

Signed-off-by: litt3 <[email protected]>

* Revert name change

Signed-off-by: litt3 <[email protected]>

* Improve const names

Signed-off-by: litt3 <[email protected]>

* Change flag deprecation strategy

Signed-off-by: litt3 <[email protected]>

* First pass

Signed-off-by: litt3 <[email protected]>

* Move utility method

Signed-off-by: litt3 <[email protected]>

* Revert formatting changes

Signed-off-by: litt3 <[email protected]>

* Add Enabled to memstore config marshal

Signed-off-by: litt3 <[email protected]>

* Revert benchmark changes

Signed-off-by: litt3 <[email protected]>

* More reversions

Signed-off-by: litt3 <[email protected]>

* Move method in Setup

Signed-off-by: litt3 <[email protected]>

* Revert flags changes

Signed-off-by: litt3 <[email protected]>

* Newline change

Signed-off-by: litt3 <[email protected]>

* Revert formatting changes

Signed-off-by: litt3 <[email protected]>

* Revert secondary source changes

Signed-off-by: litt3 <[email protected]>

* Revert newline

Signed-off-by: litt3 <[email protected]>

* Fix tests

Signed-off-by: litt3 <[email protected]>

* Fix kzg config init

Signed-off-by: litt3 <[email protected]>

* Revert grep addition

Signed-off-by: litt3 <[email protected]>

* Fix test

Signed-off-by: litt3 <[email protected]>

* Remove accidental grep

Signed-off-by: litt3 <[email protected]>

* Try to get flags right again

Signed-off-by: litt3 <[email protected]>

* Split out test suite

Signed-off-by: litt3 <[email protected]>

* Split out verifier and kzg flags

Signed-off-by: litt3 <[email protected]>

* Shuffle around enable flag

Signed-off-by: litt3 <[email protected]>

* Update help_out

Signed-off-by: litt3 <[email protected]>

* Remove version

Signed-off-by: litt3 <[email protected]>

* Add missed config value

Signed-off-by: litt3 <[email protected]>

* Fix lint

Signed-off-by: litt3 <[email protected]>

* Update flag categories

Signed-off-by: litt3 <[email protected]>

* Lengthen e2e local timeout

Signed-off-by: litt3 <[email protected]>

---------

Signed-off-by: litt3 <[email protected]>
Co-authored-by: Samuel Laferriere <[email protected]>
Signed-off-by: litt3 <[email protected]>
@litt3 litt3 force-pushed the migration-tests branch from d8bb63f to 55d6964 Compare March 24, 2025 16:22
@litt3 litt3 closed this Mar 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants