fix(epoxy): make quorum sizes match paper #4234

Open
MasterPtato wants to merge 1 commit into 02-13-fix_rivetkit_stall_stop_handler_until_start_completes from 02-19-fix_epoxy_make_quorum_sizes_match_paper
Conversation

@MasterPtato
Contributor

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

railway-app bot commented Feb 19, 2026

🚅 Deployed to the rivet-pr-4234 environment in rivet-frontend

| Service | Status | Updated (UTC) |
|---|---|---|
| website | 😴 Sleeping | Feb 23, 2026 at 8:48 pm |
| ladle | ❌ Build Failed | Feb 21, 2026 at 1:41 am |
| frontend-inspector | ❌ Build Failed | Feb 19, 2026 at 8:48 pm |
| mcp-hub | ✅ Success | Feb 19, 2026 at 8:48 pm |
| frontend-cloud | ❌ Build Failed | Feb 19, 2026 at 8:48 pm |

MasterPtato commented Feb 19, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

claude bot commented Feb 19, 2026

Code Review: fix(epoxy): make quorum sizes match paper

The core idea is solid — separating the "sender-excluded" fanout quorum calculation out of the HTTP client and into a dedicated calculate_fanout_quorum function is a clear improvement in separation of concerns. The old code's blanket quorum_size - 1 subtraction in the Fast | Slow arm was both ad-hoc and only correct for n >= 2.

Issues

utils.rs still has a glob import (CLAUDE.md violation)

The PR correctly replaces use anyhow::*; in http_client.rs, but utils.rs still has it at line 1:

// utils.rs line 1 — unchanged
use anyhow::*;

This should be updated to explicit imports (e.g., use anyhow::{bail, Result};) to match project style.


Missing unit tests for the quorum calculation logic

The PR fixes a correctness bug in consensus quorum sizing — this is exactly the kind of logic that should have explicit unit tests. Without them, it's very hard to verify the edge-case behavior or catch regressions. At a minimum, the following should be tested:

  • calculate_quorum for n = 0, 1, 2, 3, 5 for each QuorumType
  • calculate_fanout_quorum for the same inputs
  • That calculate_fanout_quorum(n, q) == calculate_quorum(n, q) - 1 holds for n >= 3 (at least for non-Any types)

The relationship between calculate_quorum and calculate_fanout_quorum is also currently implicit. A test explicitly asserting fanout_quorum == quorum - 1 for the general case would make the intent self-documenting.
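As a rough illustration of the invariant check suggested above, here is a minimal sketch. The enum and both function bodies are stand-ins written for this example, not the code in utils.rs: the sizing rules (f = (n - 1) / 2, Slow = f + 1, Fast = f + (f + 1) / 2, Any = 1) and the helper fanout_invariant_violations are assumptions, so the real tests should call the actual functions instead.

```rust
// Stand-in definitions for illustration only. The real functions live in
// utils.rs and may compute different values.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum QuorumType {
    Fast,
    Slow,
    Any,
}

/// Assumed sizing: f = (n - 1) / 2 tolerated failures, Slow = f + 1,
/// Fast = f + (f + 1) / 2, Any = 1. Never returns less than 1 for n >= 1.
pub fn calculate_quorum(n: usize, q: QuorumType) -> usize {
    if n == 0 {
        return 0;
    }
    let f = (n - 1) / 2;
    let size = match q {
        QuorumType::Fast => f + (f + 1) / 2,
        QuorumType::Slow => f + 1,
        QuorumType::Any => 1,
    };
    size.max(1)
}

/// Assumed shape of the fanout calculation: small clusters are special-cased,
/// everything else is the quorum minus the sender.
pub fn calculate_fanout_quorum(n: usize, q: QuorumType) -> usize {
    match n {
        0 | 1 => 0,
        2 => 1,
        _ => calculate_quorum(n, q) - 1,
    }
}

/// Hypothetical helper: returns every (n, q) with 3 <= n <= max_n where
/// fanout + 1 != quorum, for the non-Any quorum types.
pub fn fanout_invariant_violations(max_n: usize) -> Vec<(usize, QuorumType)> {
    let mut violations = Vec::new();
    for n in 3..=max_n {
        for q in [QuorumType::Fast, QuorumType::Slow] {
            if calculate_fanout_quorum(n, q) + 1 != calculate_quorum(n, q) {
                violations.push((n, q));
            }
        }
    }
    violations
}
```

A test in the crate could then assert that fanout_invariant_violations returns an empty list for a range of cluster sizes, which keeps the relationship between the two functions explicit.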


calculate_fanout_quorum n=2 ignores QuorumType

The n=2 arm hardcodes to 1 regardless of q:

2 => 1,

For QuorumType::Any with n=2, calculate_quorum(2, Any) = 1, so fanout = 0 (sender alone satisfies quorum). Returning 1 instead means we'll wait for the other node unnecessarily. Whether this matters in practice depends on whether Any is ever used with n=2, but a comment explaining why q is ignored for this case would help.
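One possible shape for a q-aware n=2 arm is sketched below. The function name fanout_quorum_two_nodes and the enum are hypothetical stand-ins for this example, and whether Any should really return 0 here is exactly the open question, so treat this as an option rather than the intended behavior.

```rust
// Stand-in enum for illustration; the real QuorumType may have other variants.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum QuorumType {
    Fast,
    Slow,
    Any,
}

/// Hypothetical fanout target for a two-node cluster only.
pub fn fanout_quorum_two_nodes(q: QuorumType) -> usize {
    match q {
        // Assumes the sender's own vote already satisfies an Any quorum of 1,
        // so no peer response is required.
        QuorumType::Any => 0,
        // Mirrors the hardcoded `2 => 1` arm: wait for the one other node.
        QuorumType::Fast | QuorumType::Slow => 1,
    }
}
```

If the hardcoded value is intentional for every quorum type, a one-line comment on the existing arm would serve the same purpose with less code.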


warn! for zero fanout target will fire on every operation in single-node clusters

if target_responses == 0 {
    tracing::warn!("no fanout, target is 0");
    return Ok(Vec::new());
}

calculate_fanout_quorum(1, _) always returns 0, so any single-node cluster will emit a warning on every fanout call. If single-node deployments are a supported configuration, this will be very noisy. Consider tracing::debug! for this path or, better yet, prevent callers from invoking fanout at all when n=1 and document that as a precondition.
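The precondition idea could look roughly like the sketch below, which makes "nothing to wait for" an explicit value the caller handles before any fanout starts. FanoutPlan and plan_fanout are hypothetical names introduced for this example and do not exist in the PR.

```rust
/// Hypothetical result of deciding whether a fanout is needed at all.
#[derive(Debug, PartialEq)]
pub enum FanoutPlan {
    /// No peer responses are required; the caller can skip the fanout
    /// without logging at warn level.
    Skip,
    /// Wait for this many peer responses.
    Wait(usize),
}

/// Maps a fanout target to a plan. A target of 0 (for example, a
/// single-node cluster) becomes an explicit Skip instead of a warning.
pub fn plan_fanout(target_responses: usize) -> FanoutPlan {
    if target_responses == 0 {
        FanoutPlan::Skip
    } else {
        FanoutPlan::Wait(target_responses)
    }
}
```

With this shape the zero-target case is an ordinary branch at the call site, so the choice of log level (or no log at all) is made once by the caller rather than inside the fanout function.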


Quorum formula for even N — worth a comment or test

The comment explains that even N gives no additional fault tolerance (N=4 acts like N=3), but the resulting quorum sizes are noticeably smaller than a traditional majority. For example:

| N | f | Slow quorum (code) | Traditional majority |
|---|---|---|---|
| 4 | 1 | 2 | 3 |
| 6 | 2 | 3 | 4 |

If EPaxos section 4.3 justifies this for even N clusters, a brief inline comment with that citation (beyond just "See EPaxos 4.3") would make the reasoning more accessible. A table-driven test would also make the expected values auditable.
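To make the expected values auditable, a table-driven check could start from something like the sketch below. It assumes f = floor((N - 1) / 2) and a slow quorum of f + 1, which reproduces the two rows above but is not confirmed to be the exact formula in the PR. All function names are hypothetical.

```rust
/// Failures tolerated by n replicas, assuming f = floor((n - 1) / 2).
pub fn tolerated_failures(n: usize) -> usize {
    n.saturating_sub(1) / 2
}

/// Slow-path quorum under the assumed rule f + 1 (meaningful for n >= 1).
pub fn slow_quorum(n: usize) -> usize {
    tolerated_failures(n) + 1
}

/// Traditional strict majority: floor(n / 2) + 1.
pub fn majority(n: usize) -> usize {
    n / 2 + 1
}

/// One row per cluster size: (n, f, slow quorum, traditional majority).
pub fn quorum_rows(sizes: &[usize]) -> Vec<(usize, usize, usize, usize)> {
    sizes
        .iter()
        .map(|&n| (n, tolerated_failures(n), slow_quorum(n), majority(n)))
        .collect()
}
```

Under these assumptions the two columns agree for odd N (for example N=3 and N=5) and differ by one for even N, which is the gap the table highlights.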

Minor

  • The new debug log tracing::debug!(?target_responses, len=?responses.len(), "fanout target") drops quorum_type from the call site. It's still captured in the parent span's fields so this isn't a loss in practice, but the spacing around len=? doesn't match the project's tracing conventions elsewhere (compare to len = ?responses.len() in the old code).

Summary

The directional change is correct and the logic refactoring is an improvement. Main asks before merging:

  1. Fix the glob import in utils.rs
  2. Add unit tests for both quorum functions covering edge cases
  3. Address the warn! log level for single-node fanout

pkg-pr-new bot commented Feb 19, 2026

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4234

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4234

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4234

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4234

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4234

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4234

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4234

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4234

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4234

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4234

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4234

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4234

commit: 5973cf6

MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 788e9db to e7553c7 on February 19, 2026 21:14
MasterPtato marked this pull request as ready for review on February 19, 2026 21:14
MasterPtato force-pushed the 02-17-fix_gas_gracefully_handle_corrupt_wf branch from c7242fc to e418d91 on February 19, 2026 22:41
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from e7553c7 to 710de07 on February 19, 2026 22:41
MasterPtato force-pushed the 02-17-fix_gas_gracefully_handle_corrupt_wf branch from e418d91 to 0e4f8dc on February 19, 2026 22:51
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 710de07 to fbd667c on February 19, 2026 22:53
MasterPtato force-pushed the 02-17-fix_gas_gracefully_handle_corrupt_wf branch from 0e4f8dc to 233582b on February 19, 2026 22:53
graphite-app bot changed the base branch from 02-17-fix_gas_gracefully_handle_corrupt_wf to graphite-base/4234 on February 19, 2026 22:55
graphite-app bot force-pushed the graphite-base/4234 branch from 233582b to 682f215 on February 19, 2026 22:56
graphite-app bot force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from fbd667c to 6b4b34d on February 19, 2026 22:56
graphite-app bot changed the base branch from graphite-base/4234 to main on February 19, 2026 22:57
graphite-app bot force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 6b4b34d to 5973cf6 on February 19, 2026 22:57
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 5973cf6 to e86f64c on February 20, 2026 00:04
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from e86f64c to a4fe6de on February 20, 2026 00:16
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from a4fe6de to 754e84e on February 21, 2026 01:40
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 754e84e to 4e3fd8f on February 26, 2026 01:12
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from 4e3fd8f to e144ca2 on February 26, 2026 02:14
MasterPtato changed the base branch from main to graphite-base/4234 on February 26, 2026 19:42
MasterPtato force-pushed the 02-19-fix_epoxy_make_quorum_sizes_match_paper branch from e144ca2 to f01fcb4 on February 26, 2026 19:42
MasterPtato changed the base branch from graphite-base/4234 to 02-13-fix_rivetkit_stall_stop_handler_until_start_completes on February 26, 2026 20:05