
Conversation

@marshallpierce

The current benchmark (as you've seen in the comments on the blog post) is quite flawed: it's mostly a test of the compiler's ability to remove bounds checks and boxing.

This adds another benchmark that avoids these issues:

  • It does fast but nontrivial work for each array element (computing the hashCode() of a byte[])
  • Doesn't use volatile on fields
  • Uses primitive streams where applicable

This is more representative of real-world workloads, I would claim. It also shows results much more in line with what makes sense, namely that all variants are about the same, except for the parallel stream, which gets a good speedup.
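The measured work can be sketched roughly as follows. This is not the PR's actual benchmark code (class and method names here are illustrative, and the JMH harness is omitted): it just shows the core operation — taking the max of Arrays.hashCode() over a set of byte arrays — in both a plain loop and a primitive-stream (IntStream) form, which avoids boxing entirely.

```java
import java.util.Arrays;
import java.util.Random;

public class MaxHashSketch {
    // Plain for-loop variant: max of Arrays.hashCode over each byte[].
    static int forLoopMax(byte[][] data) {
        int max = Integer.MIN_VALUE;
        for (byte[] row : data) {
            max = Math.max(max, Arrays.hashCode(row));
        }
        return max;
    }

    // Primitive-stream variant: mapToInt keeps everything as int,
    // so no Integer boxing is involved.
    static int streamMax(byte[][] data) {
        return Arrays.stream(data)
                .mapToInt(Arrays::hashCode)
                .max()
                .orElse(Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed: reproducible data
        byte[][] data = new byte[100][1024];
        for (byte[] row : data) {
            random.nextBytes(row);
        }
        // Both variants compute the same result; only the iteration
        // mechanism under benchmark differs.
        System.out.println(forLoopMax(data) == streamMax(data));
    }
}
```

Because Arrays.hashCode() must traverse the whole array, the JIT cannot optimize the per-element work away, which is what makes this a fairer comparison of the iteration styles themselves.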

ByteArrayLoopBenchmark.forEachLambdaMax            avgt   10  10.668 ± 0.161  ms/op
ByteArrayLoopBenchmark.forEachLoopMax              avgt   10  10.513 ± 0.008  ms/op
ByteArrayLoopBenchmark.forMax                      avgt   10  10.578 ± 0.073  ms/op
ByteArrayLoopBenchmark.iteratorMax                 avgt   10  10.507 ± 0.094  ms/op
ByteArrayLoopBenchmark.parallelStreamMax           avgt   10   1.919 ± 0.124  ms/op
ByteArrayLoopBenchmark.streamMax                   avgt   10  10.601 ± 0.153  ms/op
ByteArrayLoopBenchmark.streamReduceMax             avgt   10  10.695 ± 0.295  ms/op
ByteArrayLoopBenchmark.streamReduceMaxWithInitial  avgt   10  10.505 ± 0.149  ms/op

As further color on the volatile issue, your original benchmark produces this:

LoopBenchmarkMain.forEachLambdaMaxInteger          avgt   10   0.513 ± 0.034  ms/op
LoopBenchmarkMain.forEachLoopMaxInteger            avgt   10   0.123 ± 0.006  ms/op
LoopBenchmarkMain.forMaxInteger                    avgt   10   0.221 ± 0.043  ms/op
LoopBenchmarkMain.iteratorMaxInteger               avgt   10   0.104 ± 0.003  ms/op
LoopBenchmarkMain.lambdaMaxInteger                 avgt   10   0.468 ± 0.018  ms/op
LoopBenchmarkMain.parallelStreamMaxInteger         avgt   10   0.208 ± 0.028  ms/op
LoopBenchmarkMain.streamMaxInteger                 avgt   10   0.588 ± 0.020  ms/op

Just by removing the volatile modifier, forMaxInteger becomes the fastest, as expected:

LoopBenchmarkMain.forEachLambdaMaxInteger   avgt   10  0.510 ± 0.018  ms/op
LoopBenchmarkMain.forEachLoopMaxInteger     avgt   10  0.123 ± 0.003  ms/op
LoopBenchmarkMain.forMaxInteger             avgt   10  0.099 ± 0.002  ms/op
LoopBenchmarkMain.iteratorMaxInteger        avgt   10  0.112 ± 0.015  ms/op
LoopBenchmarkMain.lambdaMaxInteger          avgt   10  0.466 ± 0.013  ms/op
LoopBenchmarkMain.parallelStreamMaxInteger  avgt   10  0.207 ± 0.014  ms/op
LoopBenchmarkMain.streamMaxInteger          avgt   10  0.579 ± 0.012  ms/op
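The volatile effect those numbers illustrate can be sketched as follows. The field and class names here are hypothetical, not the PR's actual code: the point is that a plain field load can be hoisted out of a loop and the loop then unrolled or vectorized, while a volatile field must be re-read (with its memory-ordering constraints) on every iteration.

```java
public class VolatileFieldSketch {
    // The only difference between these two fields is the volatile modifier.
    volatile int volatileLimit; // every read is a real load with ordering constraints
    int plainLimit;             // the JIT may load this once and keep it in a register

    VolatileFieldSketch(int limit) {
        this.volatileLimit = limit;
        this.plainLimit = limit;
    }

    // The loop bound is re-read from memory each iteration; the JIT cannot
    // hoist the load, which inhibits unrolling and bounds-check elimination.
    int sumToVolatileLimit(int[] data) {
        int sum = 0;
        for (int i = 0; i < volatileLimit; i++) {
            sum += data[i];
        }
        return sum;
    }

    // The loop bound is loop-invariant as far as the JIT is concerned,
    // so the loop body is free to be optimized aggressively.
    int sumToPlainLimit(int[] data) {
        int sum = 0;
        for (int i = 0; i < plainLimit; i++) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        VolatileFieldSketch s = new VolatileFieldSketch(data.length);
        // Same result either way; only the optimizer's freedom differs.
        System.out.println(s.sumToVolatileLimit(data) == s.sumToPlainLimit(data));
    }
}
```

So a benchmark whose state fields are volatile is partly measuring the cost of those constrained loads rather than the iteration style it claims to compare.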

…en iteration approaches: less prone to confusing results due to the compiler's ability (or inability) to remove boxing

Use a seed. Otherwise your test is flaky as hell :)

Author


The original test didn't provide a seed, so I didn't want to change that without a good reason. That said, if you can provide a good motivation, I'm game... What effect do you expect providing a seed to have? Either way, Arrays.hashCode() has to traverse the full contents of every byte[], and the math in Math.max() will be unpredictable no matter what. If your point is simply "it could be predictable, but it's not right now", I agree with that -- I just don't see it actually having much of an effect, especially given that this setup runs once per iteration, not once per trial, so from the CPU's perspective it's going to run for a few billion cycles on the exact same set of data each time.


Compare "max" algorithm for this data sets :)

1 2 3 4 5 6 7 8 9 10
1 1 1 1 1 1 1 1 1 1
10 9 8 7 6 5 4 3 2 1
3 2 5 1 2 6 3 8 7 6 5

Now imagine that each benchmark case gets its own random sequence, and that the sequence for one case could be much simpler than the sequence for another.

Author


Fair enough. I still think that's pretty unlikely to ever occur, but I also don't ever want to worry about it. :) After all, "pretty unlikely" isn't good enough for thread safety!
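The fix being agreed on can be sketched as seeding the generator in the setup so every benchmark case (and every run) sees byte-for-byte identical data. The names here are illustrative, not the PR's actual code; only the fixed-seed idea is the point.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededSetupSketch {
    // Stand-in for a JMH @Setup method: a fixed seed makes the
    // generated data identical on every invocation.
    static byte[][] makeData(long seed) {
        Random random = new Random(seed);
        byte[][] data = new byte[100][1024];
        for (byte[] row : data) {
            random.nextBytes(row);
        }
        return data;
    }

    public static void main(String[] args) {
        // Two independent "setups" with the same seed produce identical
        // arrays, so no benchmark case can get a luckier (e.g. more
        // branch-predictable) random sequence than another.
        byte[][] first = makeData(42L);
        byte[][] second = makeData(42L);
        System.out.println(Arrays.deepEquals(first, second)); // true
    }
}
```

java.util.Random with an explicit seed is fully deterministic, which is exactly the property that removes the flakiness the review is worried about.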

