Conversation

@rescrv
Contributor

@rescrv rescrv commented Oct 6, 2025

Description of changes

Unscientifically, minio can push 1k triggers per second with batching
that keeps the latency under one second. No need for sciencing this
one.

Test plan

Local + CI

Migration plan

N/A

Observability plan

N/A

Documentation Changes

N/A

@github-actions

github-actions bot commented Oct 6, 2025

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use cases in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have these cases been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of unexpectedly high quality (Readability, Modularity, Intuitiveness)?

@propel-code-bot
Contributor

propel-code-bot bot commented Oct 6, 2025

Add Benchmark for s3heap and Explicit Benchmark Dependency

This PR introduces a new benchmarking tool for the s3heap Rust module, designed to measure throughput and latency of S3-backed heap operations under high load. It adds the file rust/s3heap/examples/s3heap-benchmark.rs, which launches a synthetic, highly parallelized workload that stresses the system with configurable throughput and scheduling parameters. There are also supporting changes: guacamole is added as a development dependency in rust/s3heap/Cargo.toml, and the workspace's Cargo.lock is updated to reflect this. The tool utilizes large Tokio channel buffers and batching for stress testing; code and benchmark design choices were discussed in the review and are acknowledged by the author.

Key Changes

• Added new benchmark example file rust/s3heap/examples/s3heap-benchmark.rs to provide high-throughput load simulation for s3heap.
• Introduced benchmark options for runtime, target_throughput, and max_tokio_tasks to control stress test parameters.
• Added guacamole = { version = "0.11", default-features = false } to [dev-dependencies] in rust/s3heap/Cargo.toml.
• Updated Cargo.lock to include guacamole and synchronize crates with new dependency.

Affected Areas

rust/s3heap/examples/s3heap-benchmark.rs (new benchmark file)
rust/s3heap/Cargo.toml ([dev-dependencies] section)
Cargo.lock (dependency graph)

This summary was automatically generated by @propel-code-bot

@rescrv rescrv requested a review from tanujnay112 October 6, 2025 20:45
Comment on lines +17 to +32
```rust
#[derive(Clone, Eq, PartialEq)]
pub struct Options {
    pub runtime: usize,
    pub target_throughput: usize,
    pub max_tokio_tasks: usize,
}

impl Default for Options {
    fn default() -> Self {
        Options {
            runtime: 60,
            target_throughput: 100_000,
            max_tokio_tasks: 10_000_000,
        }
    }
}
```
Contributor

[BestPractice]

The design of this benchmark could be refined to provide more realistic and clearer results.

1. **Misleading Configuration and High Memory Usage**: The `max_tokio_tasks` field in `Options` is used to define a very large channel capacity (over 10 million elements). This could lead to high memory consumption (potentially >600MB) if the consumer task falls behind, which might mask performance bottlenecks by simply buffering them.
2. **Ineffective Task Limit Check**: The check `if tasks_alive > options.max_tokio_tasks` is unlikely to ever trigger since only one long-lived task is spawned.

Consider simplifying this by removing `max_tokio_tasks` and sizing the channel relative to the throughput. Using `tx.send(...).await` instead of `try_send()` would also introduce back-pressure, giving a better signal of the sustainable throughput of the system under test.

For example:

```rust
// In the Options struct, remove max_tokio_tasks
pub struct Options {
    pub runtime: usize,
    pub target_throughput: usize,
}

// In main(), size the channel relative to throughput
let (tx, mut rx) =
    tokio::sync::mpsc::channel::<Schedule>(options.target_throughput * 2); // e.g. a 2s buffer

// In the producer loop, use an awaitable send and remove the task check
// ...
if tx.send(Schedule { ... }).await.is_err() {
    // The receiver has been dropped, so we can stop.
    break;
}
// ...
```

3. **I/O in Hot Loop**: The `eprintln!` in the consumer's hot loop can introduce I/O overhead and affect measurements. It would be better to aggregate statistics and print a summary only at the end of the run.

File: rust/s3heap/examples/s3heap-benchmark.rs
Line: 32

@blacksmith-sh blacksmith-sh bot deleted a comment from rescrv Oct 6, 2025
```rust
        name: "demo".to_string(),
    },
    nonce,
    next_scheduled: Utc::now()
```
Contributor

We could make a configuration of this benchmark where next_scheduled is spread out a bit more to see how we do when we don't bucket that well.

Contributor Author

Good idea. We can add that.

Contributor

@tanujnay112 tanujnay112 left a comment

Left one comment

@rescrv rescrv force-pushed the rescrv/heap-scheduler branch from 5e54c6d to 52da7a6 Compare October 7, 2025 16:19
@rescrv rescrv force-pushed the rescrv/s3heap-benchmark branch from 74bfed0 to d5312b4 Compare October 7, 2025 16:26
```rust
);
let (tx, mut rx) =
    tokio::sync::mpsc::channel::<Schedule>(options.target_throughput + options.max_tokio_tasks);
let count = Arc::new(AtomicU64::new(0));
```
Contributor

[PerformanceOptimization]

The channel capacity is set to options.target_throughput + options.max_tokio_tasks, which is over 10 million with the default options. This will result in a channel buffer that consumes a large amount of memory (~650MB) and can lead to very large batches being processed by the consumer. This might be intentional for stress-testing, but if the goal is to simulate more frequent, smaller batches, you might consider reducing this capacity. Using just options.target_throughput would still allow for significant batching while using less memory.

File: rust/s3heap/examples/s3heap-benchmark.rs
Line: 48

@blacksmith-sh blacksmith-sh bot deleted a comment from rescrv Oct 7, 2025
```rust
let mut next = Duration::ZERO;
loop {
    let gap = interarrival_duration(options.target_throughput as f64)(&mut guac);
    let future = interarrival_duration(1.0 / 60.0)(&mut guac);
```
Contributor

[BestPractice]

The variable name future is a bit confusing in an async context, as it can be mistaken for a Rust Future type. Since it represents a std::time::Duration, consider renaming it to something more descriptive like schedule_delay to improve clarity. You'll also need to update its usage on line 96.


File: rust/s3heap/examples/s3heap-benchmark.rs
Line: 78

@rescrv rescrv force-pushed the rescrv/heap-scheduler branch from 52da7a6 to 43edec4 Compare October 7, 2025 17:42
@rescrv rescrv force-pushed the rescrv/s3heap-benchmark branch from d4ceee4 to b18c541 Compare October 7, 2025 17:44
@rescrv rescrv force-pushed the rescrv/heap-scheduler branch 2 times, most recently from 4007de1 to 08647ab Compare October 8, 2025 15:37
@rescrv rescrv force-pushed the rescrv/heap-scheduler branch 2 times, most recently from 75ff5ab to 1de60cc Compare October 14, 2025 23:09
@rescrv rescrv force-pushed the rescrv/s3heap-benchmark branch from b18c541 to 0c2442a Compare October 15, 2025 15:36
@rescrv rescrv changed the base branch from rescrv/heap-scheduler to main October 15, 2025 16:03
```rust
        {
            break;
        }
        eprintln!("HEAP::PUSH {}", buffer.len());
```
Contributor

[BestPractice]

For a more accurate performance measurement, it's generally better to avoid I/O operations like eprintln! inside the hot loop of a benchmark. Printing to stderr can introduce latency and skew the results. The summary statistics printed at the end of the run are sufficient for reporting.


File: rust/s3heap/examples/s3heap-benchmark.rs
Line: 66

@blacksmith-sh blacksmith-sh bot deleted a comment from rescrv Oct 15, 2025
@rescrv rescrv merged commit 2bcd28f into main Oct 15, 2025
59 checks passed
@rescrv rescrv deleted the rescrv/s3heap-benchmark branch October 15, 2025 16:47