
Conversation

archiecobbs
Contributor

@archiecobbs archiecobbs commented Oct 6, 2025

The refactoring in JDK-8348611 caused a regression in compiler performance. A couple of simple optimizations were missed in that refactoring; when put in place, they seem to (mostly) address the performance issue.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8369039: JDK-8348611 caused regression in Javac-Hot-Generate (Bug - P2)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27651/head:pull/27651
$ git checkout pull/27651

Update a local copy of the PR:
$ git checkout pull/27651
$ git pull https://git.openjdk.org/jdk.git pull/27651/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 27651

View PR using the GUI difftool:
$ git pr show -t 27651

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27651.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 6, 2025

👋 Welcome back acobbs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 6, 2025

@archiecobbs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8369039: JDK-8348611 caused regression in Javac-Hot-Generate

Co-authored-by: Claes Redestad <[email protected]>
Reviewed-by: jlahoda, redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 139 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk

openjdk bot commented Oct 6, 2025

@archiecobbs The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr (Pull request is ready for review) label Oct 6, 2025
@mlbridge

mlbridge bot commented Oct 6, 2025

Webrevs

@cl4es
Member

cl4es commented Oct 7, 2025

I haven't been able to run this through our wider performance lab yet due to some issues on my end, but in local testing I'm not seeing that much of an improvement in my diagnostic tests (the tests are a bit noisy).

Running the benchmark with -prof gc shows that the regression correlates with a 17% increase in allocation pressure. This PR only reduces allocation pressure by about 1% from the regressed state:

26-b11: 15.06 GB/op
26-b12: 17.61 GB/op
pr/27651: 17.46 GB/op

Looking at hot methods, I see one of the lambdas in LintMapper$FileInfo (allocated in the constructor) being relatively hot, and I think the abundant use of capturing lambdas in JDK-8348611 might be the cause of some unexpected allocation overhead.

For example desugaring the stream in the FileInfo constructor (which was the only thing I saw on jfr view hot-methods that seemed related to JDK-8348611):

            tree.defs.stream()
              .filter(this::isTopLevelDecl)
              .map(decl -> new Span(decl, tree.endPositions)) 
              .forEach(unmappedDecls::add);

to this:

            for (JCTree decl : tree.defs) {
                if (isTopLevelDecl(decl)) {
                    unmappedDecls.add(new Span(decl, tree.endPositions));
                }
            }

... reduces allocation to 16.68 GB/op (reducing the relative increase in allocations by about 1/3 compared to pr/27651).

@archiecobbs
Contributor Author

Hi @cl4es,

This PR only reduces allocation pressure by about 1% from the regressed state:

Now you're moving the goalposts :) I was looking at the performance benchmark, not allocation pressure...

FWIW this is what I saw on my laptop:

JDK 25

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  14.566 ± 0.868   s/op

JDK 26

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  16.403 ± 0.214   s/op

JDK 26 + this patch

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  14.807 ± 0.381   s/op

So that patch did seem to help resolve most of the benchmark difference. Are you not seeing the same thing?

But also that patch differs from the one currently committed, in that it includes the ArrayList to LinkedList changes that were later taken out. Without those changes, things get slower again:

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  15.832 ± 0.135   s/op

Surprisingly, they do have an effect, so I've put them back into the PR.

For example desugaring the stream in the FileInfo constructor ... reduces allocation

<side-note>

Argh, this is frustrating. Are we supposed to use the Stream API to help improve code clarity, or avoid it to prevent slowdowns? I've seen it argued both ways...

</side-note>

Anyway, unstreaming the loop you suggested, plus another one, plus the other stuff, yields this on my laptop:

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  14.932 ± 0.370   s/op

That is with this patch. Let me know what you see on the allocation pressure front... thanks.

@cl4es
Member

cl4es commented Oct 7, 2025

FYI I'm not a core maintainer/reviewer of javac but specialize in isolating and figuring out the causes of performance regressions; you're free to keep the streams in favor of perceived code clarity if you can make the case that the impact on performance is negligible.

Transient allocation pressure can be a non-issue on one system but cause a lot of GC pauses and slowdown on another, so it's one of those things I like to keep tabs on when diagnosing a regression. Ideally we shouldn't increase allocation pressure too much. On one system I was looking at, the Score was way worse after 26-b12 and didn't improve that much with this PR. On your system it seems the allocation pressure isn't a large contributor to Score. YMMV.

I'll look at the numbers on your latest version.

(FWIW I ran an experiment baselined on an earlier state of this PR and posted it here archiecobbs#1 -- that showed that desugaring all the streams in LintMapper got allocation pressure down below 26-b11 levels)

@archiecobbs
Contributor Author

I ran an experiment baselined on an earlier state of this PR

Thanks, I'll include that as well. See c7d5d30.

/contributor add cl4es

@openjdk

openjdk bot commented Oct 7, 2025

@archiecobbs cl4es was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <[email protected]>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

@archiecobbs
Contributor Author

/contributor add @cl4es

@openjdk

openjdk bot commented Oct 7, 2025

@archiecobbs
Contributor Claes Redestad <[email protected]> successfully added.

@lahodaj
Contributor

lahodaj commented Oct 7, 2025

I did a quick pass through the code, and it looks sensible so far, assuming it helps with resolving the problem. I'll go through the change in more detail tomorrow.

@cl4es
Member

cl4es commented Oct 8, 2025

The latest version looks good on both score and allocation pressure, e.g., on my M1 MacBook:

Name       Cnt   Base   Error    Test   Error Unit  Change
compileHot  10 15.915 ± 0.094  14.463 ± 0.235 s/op   1.10x (p = 0.000*)
  * = significant

Allocations: 16.9 GB/op -> 13.95 GB/op

26-b11 for reference on the same system: 14.285 ± 0.379 s/op
Allocations: 14.2 GB/op

So the score has more or less recovered, with slightly less allocation pressure to boot 👍


final LintRange rootRange; // the root LintRange (covering the entire source file)
final List<Span> unmappedDecls = new ArrayList<>(); // unmapped top-level declarations awaiting attribution
final List<Span> unmappedDecls = new LinkedList<>(); // unmapped top-level declarations awaiting attribution
Member

LinkedList has a ton of nodes, which translate to extra object headers, forward/backward pointers, and worse cache locality. To add N items, it requires O(N) allocations. In contrast, ArrayList requires O(log(N)) allocations (resizing) and is almost always better.
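
As a rough sketch of the allocation shapes being described (assuming the standard JDK ArrayList/LinkedList implementations; this is an illustration, not javac code, and the ArrayList growth accounting is only an approximation of its ~1.5x resizing policy):

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Rough illustration only: growth counts are approximated, not measured with a profiler.
public class ListAllocationSketch {
    public static void main(String[] args) {
        int n = 1_000;

        // LinkedList allocates one Node object (header + item/prev/next fields) per add.
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++)
            linked.add(i);
        System.out.println("LinkedList node allocations: " + n);

        // ArrayList only reallocates its backing Object[] when it outgrows the current
        // capacity, growing by ~50% each time, i.e. roughly log(n) reallocations total.
        List<Integer> array = new ArrayList<>();
        int capacity = 10, growths = 0;          // default capacity after the first add
        for (int i = 0; i < n; i++) {
            array.add(i);
            if (array.size() > capacity) {       // approximate where a growth happens
                capacity += capacity >> 1;
                growths++;
            }
        }
        System.out.println("ArrayList backing-array growths (approx.): " + growths);
    }
}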

Member

@archiecobbs measured a win. I haven't seen it in profiles, but it could be that the code in afterAttr contributes to the case for LinkedList, as it is set up to remove a matching item from the list, likely at an early index. That is one of the few things a LinkedList actually does better than an ArrayList.

Contributor

While I agree ArrayLists are often better, @archiecobbs measured that the LinkedLists perform better here, as @cl4es says.

I am not quite clear why linked lists are faster, but it might be that many of the lists are either empty (all the children of the leaf LintRange instances will be empty lists, I think, and an empty LinkedList is, I think, cheaper than an empty ArrayList) or have only a few entries (like the unmappedDecls list here: AFAIK, this has one entry for each top-level declaration, and hence is highly unlikely to have more than 2 entries - one for the package clause, and one for the top-level class). If ArrayLists with a substantial number of entries are only a small minority, that might explain why the use of LinkedLists leads to better results.

Member

I think the right way to fix this is not to use LinkedList, but to update afterAttr to use List.removeIf - this was added back in JDK 8 to avoid the overhead of shifting from multiple removals.
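
A minimal sketch of that alternative, with a made-up Span record and matching predicate standing in for the real javac types (the point is the removal pattern, not the actual LintMapper/afterAttr logic):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative only: "Span" here is a stand-in record, not javac's LintMapper.Span,
// and the matching predicate is invented.
record Span(int start, int end) {
    boolean contains(int pos) {
        return pos >= start && pos < end;
    }
}

class RemoveIfSketch {

    // Removing matches one at a time shifts the ArrayList's tail on every hit.
    static void removeMatchesWithIterator(List<Span> unmappedDecls, int pos) {
        for (Iterator<Span> it = unmappedDecls.iterator(); it.hasNext(); ) {
            if (it.next().contains(pos))
                it.remove();                     // O(n) shift per removal on ArrayList
        }
    }

    // removeIf (since Java 8) compacts the list in a single pass,
    // so several removals cost only one round of shifting overall.
    static void removeMatchesWithRemoveIf(List<Span> unmappedDecls, int pos) {
        unmappedDecls.removeIf(span -> span.contains(pos));
    }

    public static void main(String[] args) {
        List<Span> decls = new ArrayList<>(List.of(new Span(0, 10), new Span(10, 20)));
        removeMatchesWithRemoveIf(decls, 5);
        System.out.println(decls);               // prints [Span[start=10, end=20]]
    }
}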

Member

On second look, my last analysis was wrong. This removes one element, which would probably be better done as an operation on a TreeSet, and that is okay. However, I don't see why we would use a LinkedList for LintRange.

Contributor Author

AFAIK, this has one entry for each top-level declaration, and hence is highly unlikely to have more than 2 entries - one for the package clause, and one for the top-level class

Actually it will most likely only have one entry - package declarations are not included because they can never contain @SuppressWarnings annotations. This may explain why LinkedList is faster on average.

I don't find why we would use a LinkedList for LintRange.

The same answer may apply here, but honestly I'm just guessing at this. I was also surprised that LinkedList was faster than ArrayList, but the numbers seem to be saying that.

In fact, as you point out, unmappedDecls could be a Set instead of a List. But whether that would actually help is not clear.

Contributor Author

Actually it will most likely only have one entry - package declarations are not included because they can never contain @SuppressWarnings annotations. This may explain why LinkedList is faster on average.

Argh, ignore that, I was looking at something else. So "2" is the right number on average.

Contributor Author

Argh, ignore that, I was looking at something else. So "2" is the right number on average.

But there's no reason for that! 99.99% of package declarations have no annotations and therefore do not need to wait for attribution. So we can get this number down to "1".

I was curious if that would make any difference (using this additional patch). When I try it on my laptop I get a slight improvement:

Benchmark                        (stopStage)  Mode  Cnt   Score   Error  Units
SingleJavacBenchmark.compileHot     Generate    ss   10  14.841 ± 0.618   s/op
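
For reference, the idea might look roughly like the following tweak to the FileInfo constructor loop shown earlier in the thread; this is only a sketch of the intent, not the actual "additional patch" linked above:

            for (JCTree decl : tree.defs) {
                // Hypothetical check: an unannotated package clause can never carry
                // @SuppressWarnings, so it never needs to wait for attribution.
                if (decl instanceof JCPackageDecl pkg && pkg.annotations.isEmpty())
                    continue;
                if (isTopLevelDecl(decl))
                    unmappedDecls.add(new Span(decl, tree.endPositions));
            }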

Member

I was curious if that would make any difference (using this additional patch). When I try it on my laptop I get a slight improvement:

Can't really establish any statistically significant difference either way here, on either score or allocations.

So just to understand this a bit better, I instrumented the code to see how common that condition is, and it seems that when tree.getTag() == Tree.PACKAGEDEF then tree.annotations is never empty. At least in this benchmark. Perhaps the case where there are no annotations is skipped earlier?

Contributor Author

So just to understand this a bit better, I instrumented the code to see how common that condition is, and it seems that when tree.getTag() == Tree.PACKAGEDEF then tree.annotations is never empty. At least in this benchmark. Perhaps the case where there are no annotations is skipped earlier?

Really? I'm not seeing that. When I apply this patch, I see size=0 unless there's actually an annotation there:

--- a/src/jdk.compiler/share/classes/com/sun/tools/javac/code/LintMapper.java
+++ b/src/jdk.compiler/share/classes/com/sun/tools/javac/code/LintMapper.java
@@ -179,6 +179,7 @@ private static class FileInfo {
         FileInfo(Lint rootLint, JCCompilationUnit tree) {
             rootRange = new LintRange(rootLint);
             for (JCTree decl : tree.defs) {
+if (decl instanceof JCPackageDecl p) System.out.println("package "+p.pid+": annotations.size="+p.annotations.size());
                 if (isTopLevelDecl(decl))
                     unmappedDecls.add(new Span(decl, tree.endPositions));
             }
         }

Contributor

@lahodaj lahodaj left a comment

Looks reasonable to me. (I haven't run tests so far, though.) Thanks!

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Oct 8, 2025
@archiecobbs
Contributor Author

@lahodaj thanks for the review!

Since the current version seems to address the performance issue as reported in the bug, I'll plan to integrate later this evening unless there are objections.

Member

@cl4es cl4es left a comment

This appears to resolve most or all of the regression - thanks!

@archiecobbs
Contributor Author

/integrate

@openjdk

openjdk bot commented Oct 9, 2025

Going to push as commit 5873c4b.
Since your change was applied there have been 139 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Oct 9, 2025
@openjdk openjdk bot closed this Oct 9, 2025
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Oct 9, 2025
@openjdk

openjdk bot commented Oct 9, 2025

@archiecobbs Pushed as commit 5873c4b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
