@andygrove commented Jan 2, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

The existing conditional-expression benchmarks were minimal, covering only two cases.

What changes are included in this PR?

Improvements Made

  1. Improved Test Data (prepareTestTable)
  • c1: Random long values (full range) - original data
  • c2: Values 0-99 (ABS(value % 100)) - ensures even distribution across branches for multi-branch tests
  • c3: Secondary long column (value * 2) - for non-literal result expressions
  • c4: String column - for potential string result tests
  2. New Benchmark Cases (8 total, up from 2)

Benchmark                                  Description
---------------------------------------------------------------------------------------------
caseWhenLiteralBenchmark                   3-branch CASE WHEN with string literal results
caseWhenManyBranchesLiteralBenchmark       10-branch CASE WHEN with string literal results
caseWhenColumnResultBenchmark              3-branch CASE WHEN with column/expression results
caseWhenManyBranchesColumnResultBenchmark  10-branch CASE WHEN with column/expression results
ifLiteralBenchmark                         Simple IF with string literal results
ifColumnResultBenchmark                    Simple IF with column/expression results
nestedIfBenchmark                          Nested IF (4 outcomes) with string literal results
nestedIfColumnResultBenchmark              Nested IF (4 outcomes) with column/expression results

  3. Key Test Dimensions
  • Branch count: 3 vs 10 branches (tests scalability)
  • Result type: Literals vs column expressions (tests materialization overhead)
  • Expression style: CASE WHEN vs IF vs nested IF
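
The column definitions in item 1 can be checked with a small plain-Python sketch. It is only an illustration: the helper names `derive_columns` and `sample_c2_histogram` are hypothetical, the real benchmark builds these columns in Spark SQL, and c4 is left out because the description does not say what the string column contains. Note that `abs(value) % 100` is used because Python's `%` differs from SQL/Java remainder for negative inputs; the two forms agree on the result of `ABS(value % 100)`.

```python
import random
from collections import Counter


def derive_columns(value: int) -> dict:
    """Derive c1..c3 from one random long, following the PR description.

    In SQL/Java the remainder takes the sign of the dividend, so
    ABS(value % 100) equals abs(value) % 100 in Python.
    """
    return {
        "c1": value,             # original random long
        "c2": abs(value) % 100,  # always in 0..99
        "c3": value * 2,         # Python ints never overflow; a 64-bit engine can
    }


def sample_c2_histogram(rows: int, seed: int = 42) -> Counter:
    """Count how often each c2 value occurs for `rows` random longs."""
    rng = random.Random(seed)
    lo, hi = -(2**63), 2**63 - 1  # full signed 64-bit range
    return Counter(
        derive_columns(rng.randint(lo, hi))["c2"] for _ in range(rows)
    )
```

Calling `sample_c2_histogram(100_000)` should give roughly 1,000 rows per value of c2, which is what lets a multi-branch predicate on c2 send a similar number of rows down each branch.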

The benchmark now covers CASE WHEN, IF, and nested IF expressions with both literal and column results, which is much closer to real-world conditional expression patterns.
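
To make the query shapes concrete, here is a hedged sketch that generates SQL of the kind described. Everything specific in it is an assumption for illustration: the function names, the table name `tbl`, the thresholds on c2, the result expressions, and the choice to count the ELSE arm as one of the branches are not taken from the PR.

```python
def case_when_sql(branches: int, column_results: bool, table: str = "tbl") -> str:
    """Build an N-outcome CASE WHEN over c2 (0-99), splitting the range evenly.

    The last outcome is the ELSE arm, so N outcomes need N - 1 WHEN clauses.
    """
    if branches < 2:
        raise ValueError("need at least two branches")
    step = 100 // branches
    arms = []
    for i in range(1, branches):
        # Literal results are string constants; column results reference c1.
        result = f"c1 + {i}" if column_results else f"'branch_{i}'"
        arms.append(f"WHEN c2 < {i * step} THEN {result}")
    default = "c3" if column_results else f"'branch_{branches}'"
    return f"SELECT CASE {' '.join(arms)} ELSE {default} END FROM {table}"


def nested_if_sql(column_results: bool, table: str = "tbl") -> str:
    """Build a two-level nested IF with four possible outcomes."""
    if column_results:
        r = ["c1", "c3", "c1 + c3", "c1 - c3"]
    else:
        r = ["'a'", "'b'", "'c'", "'d'"]
    return (
        f"SELECT IF(c2 < 50, IF(c2 < 25, {r[0]}, {r[1]}), "
        f"IF(c2 < 75, {r[2]}, {r[3]})) FROM {table}"
    )
```

For example, `case_when_sql(3, False)` produces a query with two WHEN clauses plus an ELSE, each returning a string literal, while `case_when_sql(10, True)` produces nine WHEN clauses whose results are expressions over c1 and c3.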

How are these changes tested?

@andygrove

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Case When Literal (3 branches):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                48             52           2         21.9          45.7       1.0X
Comet (Scan)                                         46             51           3         22.7          44.1       1.0X
Comet (Scan + Exec)                                  57             60           2         18.4          54.3       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Case When Literal (10 branches):          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                48             51           1         21.6          46.2       1.0X
Comet (Scan)                                         48             49           1         21.8          45.8       1.0X
Comet (Scan + Exec)                                  67             69           1         15.8          63.4       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Case When Column Result (3 branches):     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                37             40           1         28.0          35.7       1.0X
Comet (Scan)                                         39             42           4         26.9          37.1       1.0X
Comet (Scan + Exec)                                  45             47           3         23.6          42.4       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Case When Column Result (10 branches):    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                55             56           1         19.2          52.0       1.0X
Comet (Scan)                                         47             50           2         22.3          44.8       1.2X
Comet (Scan + Exec)                                  73             76           1         14.3          69.9       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
If Literal:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                40             43           4         25.9          38.6       1.0X
Comet (Scan)                                         40             42           2         26.1          38.2       1.0X
Comet (Scan + Exec)                                  45             46           1         23.3          42.9       0.9X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
If Column Result:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                33             35           1         31.8          31.4       1.0X
Comet (Scan)                                         33             35           1         31.7          31.5       1.0X
Comet (Scan + Exec)                                  39             41           2         26.8          37.3       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Nested If Literal (4 outcomes):           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                40             43           2         26.0          38.5       1.0X
Comet (Scan)                                         41             42           1         25.8          38.8       1.0X
Comet (Scan + Exec)                                  52             55           1         20.0          49.9       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Nested If Column Result (4 outcomes):     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                42             45           2         24.9          40.2       1.0X
Comet (Scan)                                         42             44           2         24.8          40.4       1.0X
Comet (Scan + Exec)                                  50             52           2         21.1          47.5       0.8X
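
For reading the tables above, the derived columns appear to follow two simple relationships: Rate (M/s) is 1000 divided by Per Row (ns), and Relative is the baseline (Spark) per-row time divided by the row's per-row time. The sketch below recomputes both from the printed values. The function names are hypothetical, and because the printed figures are already rounded, a recomputed value can occasionally differ in the last digit.

```python
def rate_m_per_s(per_row_ns: float) -> float:
    """Millions of rows per second implied by a per-row time in nanoseconds."""
    return round(1000.0 / per_row_ns, 1)


def relative(baseline_per_row_ns: float, case_per_row_ns: float) -> str:
    """Speed relative to the baseline; below 1.0X means slower than Spark."""
    return f"{baseline_per_row_ns / case_per_row_ns:.1f}X"


# Values copied from the "Case When Literal (3 branches)" table.
spark_ns, scan_ns, exec_ns = 45.7, 44.1, 54.3
print(rate_m_per_s(spark_ns), relative(spark_ns, spark_ns))
print(rate_m_per_s(scan_ns), relative(spark_ns, scan_ns))
print(rate_m_per_s(exec_ns), relative(spark_ns, exec_ns))
```

Using the per-row column rather than the rounded Best Time column keeps the recomputed ratios closer to what the benchmark printed.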

codecov-commenter commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.55%. Comparing base (f09f8af) to head (d53ff5c).
⚠️ Report is 811 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3024      +/-   ##
============================================
+ Coverage     56.12%   59.55%   +3.43%     
- Complexity      976     1379     +403     
============================================
  Files           119      167      +48     
  Lines         11743    15496    +3753     
  Branches       2251     2569     +318     
============================================
+ Hits           6591     9229    +2638     
- Misses         4012     4970     +958     
- Partials       1140     1297     +157     

@mbutrovich left a comment

Comet being slower with Exec enabled is alarming, but gotta start somewhere by measuring it. Thanks @andygrove!
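
To put a number on that observation, the per-row times posted earlier in this thread can be turned into a slowdown percentage for Comet (Scan + Exec) against Spark. This is a rough sketch: the dictionary and function names are hypothetical, the inputs are the rounded Per Row (ns) values copied from the tables, and a single run on one machine says nothing about variance.

```python
# (Spark per-row ns, Comet Scan + Exec per-row ns), copied from the posted tables.
PER_ROW_NS = {
    "Case When Literal (3 branches)": (45.7, 54.3),
    "Case When Literal (10 branches)": (46.2, 63.4),
    "Case When Column Result (3 branches)": (35.7, 42.4),
    "Case When Column Result (10 branches)": (52.0, 69.9),
    "If Literal": (38.6, 42.9),
    "If Column Result": (31.4, 37.3),
    "Nested If Literal (4 outcomes)": (38.5, 49.9),
    "Nested If Column Result (4 outcomes)": (40.2, 47.5),
}


def slowdown_pct(spark_ns: float, comet_ns: float) -> float:
    """Extra per-row time of Comet relative to Spark, as a percentage."""
    return round((comet_ns / spark_ns - 1.0) * 100.0, 1)


def slowdowns() -> dict:
    """Slowdown percentage for every benchmark in PER_ROW_NS."""
    return {name: slowdown_pct(s, c) for name, (s, c) in PER_ROW_NS.items()}


if __name__ == "__main__":
    # Print the benchmarks from largest to smallest slowdown.
    for name, pct in sorted(slowdowns().items(), key=lambda kv: -kv[1]):
        print(f"{name}: +{pct}%")
```

On these figures every case is slower with Exec enabled, and the two 10-branch CASE WHEN benchmarks show the largest gap.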

@andygrove andygrove merged commit e041a1e into apache:main Jan 6, 2026
1 check passed