Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sui system] implementing SIP-39 #19836

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sblackshear
Copy link
Collaborator

@sblackshear sblackshear commented Oct 12, 2024

Implement the new validator admission scheme described in SIP-39 that uses voting power thresholds rather than stake thresholds. Conceptual details are in SIP-39, but here's what changes at the code level:

  • Deprecate the min_validator_joining_stake, validator_low_stake_threshold, validator_very_low_stake_threshold constants
  • Use voting power-based module constants defined in validator_set::get_voting_power_thresholds asn the new source of truth for (min, low, very low) voting power
  • Change low stake departure logic to use these voting power thresholds instead of stake thresholds
  • Add tests for each of the relevant scenarios: admission with appropriate voting power, failure to admit without appropriate voting power, dismissal if very low stake threshold is hit, eventual dismissal or recovery if low stake threshold is hit

Copy link

vercel bot commented Oct 12, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 29, 2025 9:04pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Jan 29, 2025 9:04pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Jan 29, 2025 9:04pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Jan 29, 2025 9:04pm

}

/// return (min, low, very low voting power) thresholds
fun get_voting_power_thresholds(ctx: &TxContext): (u64, u64, u64) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lxfind, @emmazzz told me it would be ok/preferable to use module constants for these instead of SuiSystemStateV2. So the way I handled this is marking the old stake-based constants in SuiSystemV2 as deprecated, and using this as the source of truth for the new voting-power based constants. Let me know if that doesn't make sense, and if there's anything should change in the Rust version of SuiSystemV2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using module constants is fine however I don't see a way for these to be different across different networks.
Since you are not adding any new parameters, I guess we could technically rename the existing parameter fields and repurpose them.
But in order to update them differently for different networks, we might have to introduce a separate private entry function called update_parameters, which are then called at epoch change based on the protocol configs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need different values for these in different networks?

@@ -313,90 +300,171 @@ module sui_system::validator_set_tests {
}

#[test]
fun test_low_stake_departure() {
#[expected_failure(abort_code = validator_set::ENotAValidator)]
fun test_request_add_then_pull_stake() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emmazzz I added a test for this (didn't see one previously), and wasn't sure what we would expect the outcome to be. A failure in request_withdraw_stake (what happens) seems like a reasonable answer, but was wondering if we should use a more descriptive error code than ENotAValidator. The specific problem here is (I think) that the validator is pending addition, and the stake of a pending validator cannot be withdrawn until the next epoch (when the validator will have been added)


/// return (min, low, very low voting power) thresholds
fun get_voting_power_thresholds(ctx: &TxContext): (u64, u64, u64) {
// TODO: adjust as needed depending on when this will ship. and how will we deal with the devnet/testnet/mainnet differences?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably best to use protocol version for this maybe?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this, but I think it actually doesn't work unless we assume that upgrades happen every 2 weeks and every upgrade is a new protocol version. And since we'd have to hardcode the start protocol version anyway, we might as well encode the start epoch and keep it simple.

The good thing is that start_epoch can be set as conservatively as we want--e.g., if we anticipated this framework upgrade shipping on 10/30, there'd be no problem with setting the start_epoch as 11/30

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I more meant that once this code lands and has been upgraded on a network we can immediately start using it the next epoch. Hard coding an epoch number likely will not work since we have different networks at different epochs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really unclear how this is ever going to work across networks if not with a native function that exposes the concept somehow. Can someone share the idea here?
Also I am not clear in which networks this code matters, all of them including local ones?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the nice Manos suggestion below will fix the X-network issue

@dariorussi
Copy link
Contributor

please let's make sure that @damirka and/or @manolisliolios have reviewed and accepted this before it goes in


/// return (min, low, very low voting power) thresholds
fun get_voting_power_thresholds(ctx: &TxContext): (u64, u64, u64) {
// TODO: adjust as needed depending on when this will ship. and how will we deal with the devnet/testnet/mainnet differences?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am really unclear how this is ever going to work across networks if not with a native function that exposes the concept somehow. Can someone share the idea here?
Also I am not clear in which networks this code matters, all of them including local ones?

@@ -404,12 +412,7 @@ module sui_system::sui_system_state_inner {
self: &mut SuiSystemStateInnerV2,
ctx: &TxContext,
) {
assert!(
Copy link
Contributor

@manolisliolios manolisliolios Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about the vector size growing more than we are ok with? Either on object size limits, or execution limits?

Copy link
Collaborator Author

@sblackshear sblackshear Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, yes. To be defensive, perhaps what we should do is add a new threshold based on max object size rather than validator count?

Copy link
Collaborator Author

@sblackshear sblackshear Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this a bit:

  • If adding a validator would hit object size limits, the transaction will abort with that error. That's a bit less nice than aborting with ELimitExceeded, but the effect will be the same, and it's hard to calculate the exact limit on validator count that the object size limits imply (since it depends on lots of variables like metadata size)
  • On execution limits: advance_epoch runs with metering off (I double-checked this by adding an infinite loop inside it and seeing tests that advance epochs diverge). So if the validator set is very big, advance epoch may take longer than expected, but it won't run into gas limits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sblackshear

  1. Yup I can imagine it's not trivial to calculate this and also the effects would be bad regardless if we were catching it on advance_epoch. We would need to enforce it before adding a validator to the pending set to be safe.
  2. Ah, that's great! That was my major concern on the increased size of the vector.

Comment on lines 511 to 514
// prior to this line, `self.total_stake` is the previous epoch's total stake. we want to update it to the current total stake
// before processing low stake departures, otherwise the `total_stake` denominator in our voting power calculations will be incorrect.
// note: if a validator is kicked out in the loop below, `total_stake` will be decreased by `process_validator_departure`. otherwise,
// `total_stake` will be unchanged by the end of the function
Copy link
Contributor

@manolisliolios manolisliolios Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that in the previous implementation, we first triggered the processing of departures, then set the total_stake based on the "remaining" active set of validators (from scratch).

Here, we calculate it and set it before we process the departing validators, then allow the departure fn to adjust it.

Are we sure this cannot cause any side-effect issues in any way? I do see that on the codepath, we do deduct the total_stake when we actually process a validator departure, however I thought I'd triple-check / validate my assumptins!

Copy link
Collaborator Author

@sblackshear sblackshear Dec 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read through this and convinced myself that this way is indeed correct. But this made me realize that the statefulness of total_stake makes this code very hard to reason about indeed... I did some light refactoring to only write to total_stake in one location, and make sure that all logic in validator_set and voting_power is using the same value (e.g., there were separate functions in each module for computing total stake before). PTAL and let me know if you also feel more confident with the new setup!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sam. I think the change indeed makes it easier to follow! The consistency we get on voting_power is a great deal.

manolisliolios
manolisliolios previously approved these changes Dec 24, 2024
@sblackshear sblackshear temporarily deployed to sui-typescript-aws-kms-test-env December 25, 2024 01:47 — with GitHub Actions Inactive
@@ -177,7 +177,9 @@ module sui_system::staking_pool {
pool.pending_pool_token_withdraw = pool.pending_pool_token_withdraw + pool_token_withdraw_amount;

// If the pool is inactive, we immediately process the withdrawal.
if (is_inactive(pool)) process_pending_stake_withdraw(pool);
if (is_inactive(pool) || pool.is_preactive()) {
Copy link
Contributor

@manolisliolios manolisliolios Dec 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we touch the line 😅

Suggested change
if (is_inactive(pool) || pool.is_preactive()) {
if (pool.is_inactive() || pool.is_preactive()) {

I'll defer to @emmazzz to validate the correctness of this. It feels OK, but this indeed could open side-effects that are not obvious here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is okay, as long as we have added a test for it. When we deposit stake into a preactive pool, we process the deposit immediately so I'm surprised that we don't do this already for stake withdraws.

@@ -105,6 +105,10 @@ module sui_system::validator_set {
is_voluntary: bool,
}

/// Key for the `extra_fields` bag to store the start epoch of allowing admission
/// of new validators based on a minimum voting power rather than a minimum stake.
public struct VotingPowerAdmissionStartEpochKey has copy, drop, store()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for positional structs we usually prefer

Suggested change
public struct VotingPowerAdmissionStartEpochKey has copy, drop, store()
public struct VotingPowerAdmissionStartEpochKey() has copy, drop, store;

vector[hint],
vector[hint],
vector[hint],
copy name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think copy is needed here. 😕

@@ -1287,11 +1352,21 @@ module sui_system::validator_set {
self.validator_candidates.contains(addr)
}

/// Returns true if `addr` is an active validator
public fun is_active_validator(self: &ValidatorSet, addr: address): bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this function should not be public.

Implement the new validator admission scheme described in SIP-39 that uses voting power thresholds rather than stake thresholds. Conceptual details are in SIP-39, but here's what changes at the code level:

- Deprecate the min_validator_joining_stake, validator_low_stake_threshold, validator_very_low_stake_threshold, and max_validator_count constants
- Use voting power-based module constants defined in `validator_set::get_voting_power_thresholds` asn the new source of truth for (min, low, very low) voting power
- Change low stake departure logic to use these voting power thresholds instead of stake thresholds
- Add tests for each of the relevant scenarios: admission with appropriate voting power, failure to admit without appropriate voting power, dismissal if very low stake threshold is hit, eventual dismissal or recovery if low stake threshold is hit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants