@AndreasHolt
Contributor

What changed?

  • switch assignEphemeralShard to call a new pickLeastLoadedExecutor helper that sums smoothed shard load per executor (falling back to shard count) and log the selected target
  • cover the helper and the new load-based behavior in handler_test.go to make sure we pick the least-loaded executor and handle empty states
  • add AggregateLoad and AssignedCount tags so handler logs can show load totals when placing ephemeral shards
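The selection rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `executorLoad` type, the `pickLeastLoaded` name, and the final tie-break on executor ID (added only to make the result deterministic) are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// executorLoad is a hypothetical per-executor summary used for this sketch.
type executorLoad struct {
	AggregateLoad float64 // sum of smoothed per-shard load
	AssignedCount int     // number of shards currently assigned
}

// better reports whether candidate a should be preferred over b:
// lower aggregate load first, then fewer assigned shards, then the
// smaller ID (an assumption, so that map iteration order does not matter).
func better(aID string, a executorLoad, bID string, b executorLoad) bool {
	if a.AggregateLoad != b.AggregateLoad {
		return a.AggregateLoad < b.AggregateLoad
	}
	if a.AssignedCount != b.AssignedCount {
		return a.AssignedCount < b.AssignedCount
	}
	return aID < bID
}

// pickLeastLoaded returns the ID of the preferred executor, or an error
// when there are no candidates at all.
func pickLeastLoaded(executors map[string]executorLoad) (string, error) {
	bestID := ""
	found := false
	for id, load := range executors {
		if !found || better(id, load, bestID, executors[bestID]) {
			bestID = id
			found = true
		}
	}
	if !found {
		return "", errors.New("no executors available")
	}
	return bestID, nil
}

func main() {
	id, err := pickLeastLoaded(map[string]executorLoad{
		"exec-1": {AggregateLoad: 3.0, AssignedCount: 1},
		"exec-2": {AggregateLoad: 1.0, AssignedCount: 5},
	})
	fmt.Println(id, err)
}
```

In this sketch `exec-2` wins despite holding more shards, because its aggregate load is lower; shard count only matters when the loads are equal.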

Why?

  • Initial placement previously picked the executor with the fewest assigned shards. Using the smoothed per-shard load lets us balance based on actual work.
  • Logging aggregated load and assignment count for every placement call gives us observability when verifying decisions.

How did you test it?

  • Added unit tests TestPickLeastLoadedExecutor and ShardNotFound_Ephemeral_LoadBased in handler_test.go to verify the selection logic.
  • Verified that when loads are equal (or zero), the logic correctly falls back to the fewest assigned shards.

Potential risks
If shard stats are missing or stale, the aggregated load is computed as zero. In that case the logic degrades to the previous behavior (selecting by shard count), which limits the risk of bad placement.
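To illustrate why missing stats degrade gracefully, here is a small sketch of the aggregation step. The `sumSmoothedLoad` function and the map-based stats representation are assumptions for the example, not the data structures used in this PR; the point is only that a shard without a stats entry contributes zero to the total.

```go
package main

import "fmt"

// sumSmoothedLoad adds up the smoothed load of every assigned shard.
// A shard with no entry in the stats map contributes zero, so an
// executor whose shards have no stats ends up with an aggregate of 0.
func sumSmoothedLoad(assigned []string, smoothed map[string]float64) float64 {
	total := 0.0
	for _, shardID := range assigned {
		total += smoothed[shardID] // a missing key yields the zero value
	}
	return total
}

func main() {
	stats := map[string]float64{"shard-a": 1.5, "shard-b": 0.5}

	// shard-c has no stats and is counted as zero load.
	fmt.Println(sumSmoothedLoad([]string{"shard-a", "shard-b", "shard-c"}, stats))

	// With no stats at all, every executor aggregates to zero and the
	// shard-count comparison decides placement.
	fmt.Println(sumSmoothedLoad([]string{"shard-c"}, nil))
}
```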

Release notes
Ephemeral shard placement now favors the executor with the lowest smoothed load (using shard count as the tie-breaker) and logs the inputs for each decision.

Documentation Changes

AndreasHolt and others added 30 commits October 20, 2025 14:05
… is being reassigned in AssignShard

Signed-off-by: Andreas Holt <[email protected]>
…to not overload etcd's 128 max ops per txn

Signed-off-by: Andreas Holt <[email protected]>
…s txn and retry monotonically

Signed-off-by: Andreas Holt <[email protected]>
…shard metrics, move out to staging to separate function

Signed-off-by: Andreas Holt <[email protected]>
… And more idiomatic naming of collection vs singular type

Signed-off-by: Andreas Holt <[email protected]>
…ook more like executor key tests

Signed-off-by: Andreas Holt <[email protected]>
…ey in BuildShardKey, as we don't use it

Signed-off-by: Andreas Holt <[email protected]>
…e with new load based selection

Signed-off-by: Andreas Holt <[email protected]>
AndreasHolt and others added 15 commits November 11, 2025 15:57
…eartbeat TTL

Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
…o ewma)

Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
…t heartbeat

Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
…rdStatistics

Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
Signed-off-by: Andreas Holt <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
Signed-off-by: Theis Randeris Mathiassen <[email protected]>
…adencefork into heartbeat-shard-statistics

Signed-off-by: Andreas Holt <[email protected]>
AndreasHolt and others added 5 commits November 19, 2025 12:21
…dd+modify tests

Modified pickLeastLoadedExecutor to skip executors with a non-ACTIVE status when selecting an executor for shard assignment. Updated the error message to reflect the case where no active executors are available. Added tests to cover this behavior. Adjusted the previous tests to also mock executors so that executor status is available.

Signed-off-by: Andreas Holt <[email protected]>