bench_tools: add absolute limits to benchmark runs #9861

AvivYossef-starkware · 2025-10-30T09:23:22Z

No description provided.

AvivYossef-starkware · 2025-10-30T09:23:42Z

bench_tools: add absolute limits to benchmark runs #9861 : 3 dependent PRs (#9526 , #9874 , #9876 )
bench_tools: dont capture output when running a benchmark #9860
ci: use benchtools to benchmark the committer #9707
bench_tools: fix committer benchmark config #9807
bench_tools: run and compare benchmark #9699
bench_tools: support deserialization of criterion change #9698
bench_tools: save specific benchmark result #9742
bench_tools: add criterion benchmark names to benchmark config #9697
bench_tools: add bench tools to allowed scopes #9696
ci: run benchmark command #9636
infra: copy dir rec #9802
main-v0.14.1

This stack of pull requests is managed by Graphite.

avi-starkware

@avi-starkware reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @AvivYossef-starkware)

crates/bench_tools/src/comparison.rs line 77 at r1 (raw file):

/// If any benchmark exceeds the regression limit or absolute time threshold, returns an error with
/// detailed results. Panics if change file is not found for any benchmark.
pub fn check_regressions(

Add a test where the absolute time threshold is reached

Code quote:

pub fn check_regressions(

crates/bench_tools/src/utils.rs line 71 at r1 (raw file):

/// # Panics
/// Panics if any limit value cannot be parsed as f64.
pub fn parse_absolute_time_limits(args: Vec<String>) -> HashMap<String, f64> {

Add tests
Why not make it pub(crate)?

Code quote:

pub fn parse_absolute_time_limits(args: Vec<String>) -> HashMap<String, f64> {

crates/bench_tools/src/utils.rs line 74 at r1 (raw file):

    let mut limits = HashMap::new();
    for chunk in args.chunks(2) {
        if chunk.len() == 2 {

If the input has an odd number of elements, then the last element will be silently ignored. I think there should be at least a warning in this case.

Code quote:

    for chunk in args.chunks(2) {
        if chunk.len() == 2 {

crates/bench_tools/src/runner.rs line 149 at r1 (raw file):

                    if result.exceeds_regression_limit {
                        println!(
                            "❌ {}: {:+.2}% (EXCEEDS {:.1}% limit)",

To make the format consistent

Suggestion:

                            " ❌ {}: {:+.2}% (EXCEEDS {:.1}% limit)",

crates/bench_tools/src/runner.rs line 161 at r1 (raw file):

                        }
                    }
                    println!();

Suggestion:

                    if result.exceeds_regression_limit {
                        println!(
                            "❌ {}: {:+.2}% (EXCEEDS {:.1}% limit)",
                            result.name, result.change_percentage, regression_limit
                        );
                    } else if result.exceeds_absolute_limit {
                        if let Some(&limit) = absolute_time_ns_limits.get(&result.name) {
                            println!(
                                " ❌ {}: {:.2}ns (EXCEEDS {:.0}ns limit)",
                                result.name, result.absolute_time_ns, limit
                            );
                        }
                    }

AvivYossef-starkware

Reviewable status: 3 of 6 files reviewed, 4 unresolved discussions (waiting on @avi-starkware)

crates/bench_tools/src/comparison.rs line 77 at r1 (raw file):

Previously, avi-starkware (Avi Cohen) wrote…

Add a test where the absolute time threshold is reached

I tested it manually with #9526.
Do you want me to add unit tests for it?

crates/bench_tools/src/utils.rs line 71 at r1 (raw file):

Previously, avi-starkware (Avi Cohen) wrote…

Add tests

Why not make it pub(crate)?

Added.
I call it from the binary's main, which cannot access pub(crate) items in the lib's utils.

crates/bench_tools/src/utils.rs line 74 at r1 (raw file):

Previously, avi-starkware (Avi Cohen) wrote…

If the input has an odd number of elements, then the last element will be silently ignored. I think there should be at least a warning in this case.

You are right, I added an assert
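
A minimal sketch of how such an assert could look. The signature matches the one quoted above; the argument layout (alternating benchmark name and limit) and the panic messages are assumptions, since the updated code is not shown in this thread.

```rust
use std::collections::HashMap;

/// Parses a flat list of `<bench_name> <limit>` pairs into a map.
/// The pair layout is an assumption inferred from the `chunks(2)` loop quoted above.
///
/// # Panics
/// Panics if the list has an odd length or a limit cannot be parsed as f64.
pub fn parse_absolute_time_limits(args: Vec<String>) -> HashMap<String, f64> {
    // Reject a dangling name without a limit instead of silently dropping it.
    assert!(
        args.len() % 2 == 0,
        "expected <bench_name> <limit> pairs, got {} arguments",
        args.len()
    );

    let mut limits = HashMap::new();
    // `chunks_exact(2)` only yields full pairs, so no length check is needed inside the loop.
    for pair in args.chunks_exact(2) {
        let limit: f64 = pair[1]
            .parse()
            .unwrap_or_else(|_| panic!("invalid limit '{}' for benchmark '{}'", pair[1], pair[0]));
        limits.insert(pair[0].clone(), limit);
    }
    limits
}
```

With the assert in place, an input such as `["bench_a"]` panics up front rather than producing an empty map.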

crates/bench_tools/src/runner.rs line 161 at r1 (raw file):

                        }
                    }
                    println!();

Why?
I want to print both messages if both limits are exceeded.

avi-starkware

@avi-starkware reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @AvivYossef-starkware)

crates/bench_tools/src/comparison.rs line 77 at r1 (raw file):

Previously, AvivYossef-starkware wrote…

I tested it manually with #9526.
Do you want me to add unit tests for it?

I think we should unit test both limits (can be in another PR).
No need to actually run benchmarks for these tests, just generate the mock estimates and run the function check_regressions.
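
A rough sketch of what such a test could look like. The real signature of check_regressions is not shown in this thread, so `MockEstimate`, `LimitCheck`, and `check_limits` below are hypothetical stand-ins; only the field names (`change_percentage`, `absolute_time_ns`, `exceeds_regression_limit`, `exceeds_absolute_limit`) come from the quoted snippets.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one benchmark's mocked estimates.
pub struct MockEstimate {
    pub name: String,
    pub change_percentage: f64,
    pub absolute_time_ns: f64,
}

/// Hypothetical result type; field names follow the snippets quoted in this review.
pub struct LimitCheck {
    pub name: String,
    pub exceeds_regression_limit: bool,
    pub exceeds_absolute_limit: bool,
}

/// Simplified stand-in for `check_regressions`: evaluates each limit independently.
pub fn check_limits(
    estimates: &[MockEstimate],
    regression_limit: f64,
    absolute_time_ns_limits: &HashMap<String, f64>,
) -> Vec<LimitCheck> {
    estimates
        .iter()
        .map(|e| LimitCheck {
            name: e.name.clone(),
            exceeds_regression_limit: e.change_percentage > regression_limit,
            // A benchmark without a configured absolute limit never exceeds it.
            exceeds_absolute_limit: absolute_time_ns_limits
                .get(&e.name)
                .is_some_and(|&limit| e.absolute_time_ns > limit),
        })
        .collect()
}

#[test]
fn flags_each_limit_independently() {
    let limits = HashMap::from([("slow".to_string(), 1000.0)]);
    let estimates = vec![
        MockEstimate { name: "slow".into(), change_percentage: 1.0, absolute_time_ns: 1500.0 },
        MockEstimate { name: "regressed".into(), change_percentage: 12.0, absolute_time_ns: 10.0 },
    ];
    let checks = check_limits(&estimates, 5.0, &limits);
    // "slow" only breaks the absolute limit; "regressed" only breaks the regression limit.
    assert!(checks[0].exceeds_absolute_limit && !checks[0].exceeds_regression_limit);
    assert!(checks[1].exceeds_regression_limit && !checks[1].exceeds_absolute_limit);
}
```

No benchmark is executed here; the test only feeds hand-written numbers through the limit checks.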

crates/bench_tools/src/runner.rs line 161 at r1 (raw file):

Previously, AvivYossef-starkware wrote…

Why?
I want to print both messages if both limits are exceeded.

Oh okay... I missed that flow...
So I think it would be better to put the happy flow as the first case, and then there will be no nested ifs.
Additionally, I think you can remove the println!(); at the end.

Non-blocking
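
One possible reading of this suggestion, as a sketch only. `BenchmarkResult`, `report_lines`, and the success-line format are hypothetical; the two failure formats are copied from the snippets quoted above. The happy flow returns early, and the two limit checks stay as independent `if`s so that both lines appear when both limits are exceeded.

```rust
/// Hypothetical subset of the comparison result; field names follow the quoted snippets.
pub struct BenchmarkResult {
    pub name: String,
    pub change_percentage: f64,
    pub absolute_time_ns: f64,
    pub exceeds_regression_limit: bool,
    pub exceeds_absolute_limit: bool,
}

/// Builds the report lines for one benchmark; the caller is expected to print them.
pub fn report_lines(
    result: &BenchmarkResult,
    regression_limit: f64,
    absolute_limit_ns: Option<f64>,
) -> Vec<String> {
    // Happy flow first: nothing exceeded, so return a single success line.
    // The success format is an assumption; it is not shown in the review.
    if !result.exceeds_regression_limit && !result.exceeds_absolute_limit {
        return vec![format!("✅ {}: {:+.2}%", result.name, result.change_percentage)];
    }

    let mut lines = Vec::new();
    // Independent checks (not `else if`), so both messages are emitted when both limits are hit.
    if result.exceeds_regression_limit {
        lines.push(format!(
            "❌ {}: {:+.2}% (EXCEEDS {:.1}% limit)",
            result.name, result.change_percentage, regression_limit
        ));
    }
    // Matching on a tuple flattens the nested `if` + `if let` into one condition.
    if let (true, Some(limit)) = (result.exceeds_absolute_limit, absolute_limit_ns) {
        lines.push(format!(
            "❌ {}: {:.2}ns (EXCEEDS {:.0}ns limit)",
            result.name, result.absolute_time_ns, limit
        ));
    }
    lines
}
```

Returning the lines instead of printing them directly is a choice made for this sketch so the output can be asserted in a test.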

avi-starkware

@avi-starkware reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @AvivYossef-starkware)

AvivYossef-starkware

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @avi-starkware)

crates/bench_tools/src/comparison.rs line 77 at r1 (raw file):

Previously, avi-starkware (Avi Cohen) wrote…

I think we should unit test both limits (can be in another PR).
No need to actually run benchmarks for these tests, just generate the mock estimates and run the function check_regressions.

I'll do it in a different PR

crates/bench_tools/src/runner.rs line 161 at r1 (raw file):

Previously, avi-starkware (Avi Cohen) wrote…

Oh okay... I missed that flow...
So I think it would be better to put the happy flow as the first case, and then there will be no nested ifs.
Additionally, I think you can remove the println!(); at the end.

Non-blocking

I don't understand how putting the happy flow first would help avoid the nested if.

This was referenced Oct 30, 2025

infra: copy dir rec #9802

Merged

bench_tools: fix committer benchmark config #9807

Merged

bench_tools: dont capture output when running a benchmark #9860

Merged

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from f0f11db to f08bafc Compare October 30, 2025 10:50

AvivYossef-starkware force-pushed the aviv/dont_capture_output_when_rinnung_benchmark branch from 69b8d38 to e4891f5 Compare October 30, 2025 10:50

AvivYossef-starkware requested a review from avi-starkware October 30, 2025 12:50

AvivYossef-starkware marked this pull request as ready for review October 30, 2025 12:50

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from f08bafc to d505305 Compare October 30, 2025 12:59

AvivYossef-starkware force-pushed the aviv/dont_capture_output_when_rinnung_benchmark branch from e4891f5 to b246fe1 Compare October 30, 2025 12:59

AvivYossef-starkware changed the base branch from aviv/dont_capture_output_when_rinnung_benchmark to graphite-base/9861 October 30, 2025 13:03

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from d505305 to 154ad3f Compare October 30, 2025 13:43

AvivYossef-starkware force-pushed the graphite-base/9861 branch from b246fe1 to 095380f Compare October 30, 2025 13:44

AvivYossef-starkware changed the base branch from graphite-base/9861 to aviv/dont_capture_output_when_rinnung_benchmark October 30, 2025 13:44

AvivYossef-starkware mentioned this pull request Oct 30, 2025

bench_tools: add readme #9874

Merged

AvivYossef-starkware changed the base branch from aviv/dont_capture_output_when_rinnung_benchmark to main-v0.14.1 October 30, 2025 13:53

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch 2 times, most recently from 0ec0ba0 to 1a0388f Compare October 30, 2025 14:08

AvivYossef-starkware mentioned this pull request Oct 30, 2025

ci: limit committer benchmark absolute time #9876

Merged

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from 1a0388f to eba6a02 Compare November 2, 2025 09:04

avi-starkware requested changes Nov 2, 2025

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from eba6a02 to 2207e68 Compare November 4, 2025 07:36

AvivYossef-starkware commented Nov 4, 2025

avi-starkware approved these changes Nov 4, 2025

bench_tools: add absolute limits to benchmark runs

c28ef50

AvivYossef-starkware force-pushed the aviv/add_absolute_limits_to_benchmark_run branch from 2207e68 to c28ef50 Compare November 4, 2025 13:49

avi-starkware approved these changes Nov 4, 2025

AvivYossef-starkware commented Nov 5, 2025

AvivYossef-starkware added this pull request to the merge queue Nov 5, 2025

Merged via the queue into main-v0.14.1 with commit d2daf94 Nov 5, 2025
14 checks passed

github-actions bot locked and limited conversation to collaborators Nov 6, 2025
