Skip to content

Conversation

@asotona
Copy link
Member

@asotona asotona commented Nov 12, 2025

This PR include following changes:

  • op-building methods delegate to a synthetic inner class
  • fixed boxing in OpBuilder
  • op-building methods are generated by BytecodeGenerator and support wide range of ops
  • CodeModelTranslator is deleted
  • synthetic op-building method overrides significantly reduce overhead (by 60% on TestBytecode)
  • refactored OpBuilder to build ModuleOp instead of individual FuncCallOps
  • fixed BytecodeGenerator to support ModuleOp and FuncCallOp

Progress

  • Change must not contain extraneous whitespace

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/679/head:pull/679
$ git checkout pull/679

Update a local copy of the PR:
$ git checkout pull/679
$ git pull https://git.openjdk.org/babylon.git pull/679/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 679

View PR using the GUI difftool:
$ git pr show -t 679

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/679.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 12, 2025

👋 Welcome back asotona! A progress list of the required criteria for merging this PR into code-reflection will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 12, 2025

@asotona This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

OpBuilder optimizations - part 1.

Reviewed-by: mabbay

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the code-reflection branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the code-reflection branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 12, 2025
@mlbridge
Copy link

mlbridge bot commented Nov 12, 2025

Copy link
Member

@PaulSandoz PaulSandoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks promising, and we get to leverage more of what we have built, so we can further test and improve it. We need to get reflectable lambdas working and it passing all tests before its ready.

}

OpBuilder(Function<Block.Builder, Value> dialectFactoryF) {
public static List<FuncOp> createSupportFunctions(JavaType currentClass) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we could declare a reflectable methods and let the compiler build the models for us, not sure that is possible at this stage. Placing the corresponding Java code in a comment would be the next best thing.

return new MethodSymbol(PRIVATE | STATIC | SYNTHETIC, methodName, mt, synthClassSym);
}

private Type synthClassDecl(String className, List<CoreOp.FuncOp> funcs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, you cleverly side-step the restriction we could not work out how to overcome, by generating a "synthetic" nested static class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, generating the .class file here is not great. Basically, javac compilation happens in stages, and the classfile generation is the last stage. Typically if there's errors, or if the user has selected specific compilation policies, we might stop at a certain phase w/o generating bytecode -- but since here we're generating bytecode in the middle of a "lowering" step, we end up violating these constraints.

The right way to do things here would be to generate a class AST in ReflectMethods (e.g. like Lower does in some cases -- e.g. for private constructors). I suspect that, to do that, you might need to resurrect the logic that goes from the func ops down to an AST (which this PR deletes).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, too clever perhaps :-) I suspect it is easy to enhance CodeModelTranslator to support the generation of multiple AST methods from func ops, and on a case by case basis enhanced to support the translation of additional ops we need to use (e.g., conditional expression op to the ? : operator node).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I found CodeModelTranslator only supports very limited set of ops, no nested bodies and no more than a single entry block per method. I expect getting it at least to the shape of BytecodeGenerator would require non-trivial amount of work.

Copy link
Member

@PaulSandoz PaulSandoz Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that were the goal then I agree it would certainly be a non-trivial amount of work. We have a more modest goal - ensuring that a code model for a reasonably large reflectable method/lambda can be encoded using our current mechanism without breaking class file limits. We need to be good enough, and this unlikely to be the final solution as to how we store the code model.

Looking at the OpBuilder changes in createSupportFunctions I suspect that the enhancements required to CodeModelTranslator are manageable e.g., support for multiple methods, and the ? : operator. @mabbay wrote CodeModelTranslator and he may have a sense of how easy it would be to enhance with support for additional ops.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an option.
I just don't see much of added value when we manually hard-code proprietary code models and then translate them with proprietary translator to ASTs. BytecodeGenerator here revealed bugs in the models, and on the other side it revealed blind spots in the BytecodeGenerator itself.
It would make more sense to hard-code it in the ASTs then.
For example the tableswitch instruction I would like to look at it in the BytecodeGenerator have the same problem in the CodeModelTranslator.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option we discussed offline, in case we want to keep things more fluid, would be to make this trick more official. That is, during ReflectMethods, we generate some ops, we save them on the AST, then, when the time comes to generate bytecode (JavaCompiler::generate) we pick these up again, and we run them through the BytecodeGenerator. This will at least respect the order in which things should occur.

One thing that is related is what is the role of OpBuilder going forward, because it's looking now like a very specialized piece of code that probably belongs more near to ReflectMethods than as a standalone API?

And, if that's true, perhaps once the op method format settles, we can do a pass on the code and just generate the AST we want directly, w/o going through code models...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can represent the building of the model purely as a model it gives us flexibility to translate and test e.g., testing using the interpreter or bytecode generator.

The modest set of models we build by hand, and fit into the larger model, could in theory eventually be built by the compiler itself. Let's try not to short circuit this and keep pushing. The key abstraction is the translation of a body with one block in some context, we already support it for one func op in the context of a JCMethodDecl that has one block node , we can extend for more than one func op in a module op, and then extend to support ConditionalExpressionOp that has three bodies, in the context of JCConditional that has three expression nodes.

I am glad as we have explored this area we have found limitations, that's valuable and we should find solutions for those regardless. So even if cannot use BytecodeGenerator in this case it will almost certain be used in other cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option we discussed offline, in case we want to keep things more fluid, would be to make this trick more official. That is, during ReflectMethods, we generate some ops, we save them on the AST, then, when the time comes to generate bytecode (JavaCompiler::generate) we pick these up again, and we run them through the BytecodeGenerator. This will at least respect the order in which things should occur.

I would prefer this approach if feasible. That would allow us cleanly to exercise all the layers.

One thing that is related is what is the role of OpBuilder going forward, because it's looking now like a very specialized piece of code that probably belongs more near to ReflectMethods than as a standalone API?

It's in the same package as ReflectMethods (at one point it might have been external, don't recall exactly).

And, if that's true, perhaps once the op method format settles, we can do a pass on the code and just generate the AST we want directly, w/o going through code models...

Yes, that's a possibility to collapse the layers, although maybe a little harder to test independently.

* @modules jdk.incubator.code
* @modules jdk.incubator.code/jdk.incubator.code.internal
* @run junit TestCodeBuilder
* @ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a temporary restriction? Until you get to part 2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because I didn't pack all the FuncOps into a ModuleOp, so they could call each other.
Yes, it is expected as a part 2.

try {
// @@@ Use method handle with full power mode
opMethod = method.getDeclaringClass().getDeclaredMethod(opMethodName);
var cls = Stream.of(method.getDeclaringClass().getDeclaredClasses()).filter(c -> c.getName().endsWith("$$CM")).findFirst();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize you are trying to avoid generating any code using the tree API but i think it still may be best to keep generating the synthetic method that produces the code model. That synthetic method hides the details without having to change the runtime code allowing us to experiment with various encodings e.g., we could decide to generate one nested class per reflectable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generates the synthetic methods, just into an internal class.
BytecodeGenerator produces full class (with constant pool) and it does not mix match with ASTs.
I've tried also an experiment to inject Code attribute into the synth method, however complexity of interoperation between existing bytecode and ASTs is horrible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All i am saying here is the details about the nested class can be embedded in the synthetic method that invokes the static method of the same name on the nested class. Thereby we hide those details from the runtime.

@mcimadamore
Copy link
Collaborator

The overall approach seems solid. You generate a separate class that contains some helper methods to build lists/maps and ops. Presumably, these methods have a very high chance to be reused across different synthetic op methods.

I wonder if these "helper" functions could be turned into bootstrap methods. That way, we can implement them once and for all in the incubator module. And we could even add more, such as the ability to turn a Java type string into a type element w/o building the tree structurally.

Re. helper method names, perhaps would be better to start with $ instead of ::, as that is a more common symbol to indicate "synthetic method" in javac-land.

@asotona
Copy link
Member Author

asotona commented Nov 14, 2025

Providing fixed public set of bootstrap methods forms an API, which (as I understand) we are trying to avoid.
Beside that the planned second phase (an index-based types builder) must be specific per-use case (technically per class holding one or more code models).

@asotona
Copy link
Member Author

asotona commented Nov 14, 2025

Based on the discussions above (and offline) I propose to refactor this solution a bit:

  • Model-building FuncOps + helpers FuncOps will be organized under self-contained ModuleOp (forming the full content of the synthetic class). Helper methods will be called as FuncCallOp, instead of InvokeOp. This configuration allows interpretation of the of model-building FuncOps and so better testability (re-enabling of the TestCodeBuilder). It also eliminates some issues with lookup of the actual class.
  • Model-building methods will return to the parent class, however they will just delegate to the same-named methods in the synthetic class.

@asotona
Copy link
Member Author

asotona commented Nov 14, 2025

Related to the BytecodeGenerator postpone to the rigth javac phase - I'm open to the implementation proposals.
Maybe we can solve it in a follow-up PR.

@mcimadamore
Copy link
Collaborator

Related to the BytecodeGenerator postpone to the rigth javac phase - I'm open to the implementation proposals. Maybe we can solve it in a follow-up PR.

I think it's better to defer to a different PR. We need to find the correct way to communicate the ops to the compiler pipeline. Probably the best way to do this would be the creation of "shim" AST ClassDef nodes, which have nested shim MethodDef, as needed. These method defs will not have a body, but will have an "op" instead. Then Gen will recognize these, and skip the visit. And ClassWriter can probably use BytecodeGenerator to emit the bytecode for these methods in the context of the enclosing class that it will generate.

@PaulSandoz
Copy link
Member

Don't forget lambda expressions. IMO if you can address that and Model-building methods in this PR i think it will be good enough and we can follow up in subsequent PRs for the other tasks.

Based on the discussions above (and offline) I propose to refactor this solution a bit:

* Model-building `FuncOp`s + helpers `FuncOp`s will be organized under self-contained `ModuleOp` (forming the full content of the synthetic class). Helper methods will be called as `FuncCallOp`, instead of `InvokeOp`.  This configuration allows interpretation of the of model-building `FuncOp`s and so better testability (re-enabling of the `TestCodeBuilder`). It also eliminates some issues with lookup of the actual class.

* Model-building methods will return to the parent class, however they will just delegate to the same-named methods in the synthetic class.

@PaulSandoz
Copy link
Member

PaulSandoz commented Nov 14, 2025

Probably the best way to do this would be the creation of "shim" AST ClassDef nodes, which have nested shim MethodDef, as needed.

That's a neat idea. Maybe we only need one specialized JCTree node holding the module op and assuming we can limit the interaction with the other parts of tree processing. Perhaps an instance of a type extending from JCSkip?

@mcimadamore
Copy link
Collaborator

Probably the best way to do this would be the creation of "shim" AST ClassDef nodes, which have nested shim MethodDef, as needed.

That's a neat idea. Maybe we only need one specialized JCTree node holding the module op and assuming we can limit the interaction with the other parts of tree processing. Perhaps an instance of a type extending from JCSkip?

Something like that -- but it's intricate because of the way the pipeline is put together. E.g. the main entry point for code generation in JavaCompiler is:

JavaFileObject genCode(Env<AttrContext> env, JCClassDecl cdef) throws IOException { .. }

So, I think it will be hard if we don't at least create a JCClassDecl.

@PaulSandoz
Copy link
Member

PaulSandoz commented Nov 14, 2025

Something like that -- but it's intricate because of the way the pipeline is put together. E.g. the main entry point for code generation in JavaCompiler is:

JavaFileObject genCode(Env<AttrContext> env, JCClassDecl cdef) throws IOException { .. }

So, I think it will be hard if we don't at least create a JCClassDecl.

Ok. I see that all the classes (nested or otherwise) are independently placed on some queue to be processed, after AFAICT the nested ones are extracted from their parent. So adding a synthetic clas decl node to that queue holding a module op should work.

Naively, i managed to get a JCSkip node, appended to JCClassDecl.defs for the class where there are reflectable methods/lambdas, flowing though to gen, which can then be removed in a pre-processing step before normalization.

@mcimadamore
Copy link
Collaborator

Naively, i managed to get a JCSkip node, appended to JCClassDecl.defs for the class where there are reflectable methods/lambdas, flowing though to gen, which can then be removed in a pre-processing step before normalization.

There's also other things to take into account: Lower, which is a step that can also generate extra classes, has an entry point that can return List<JCTree> -- but the steps before it (including ReflectMethods) are typically one tree in, one tree out.

So, ReflectMethods can, in principle, just append a synthetic node to defs of the translated class. But at some point Lower will need to turn that into a proper class def, so that it can be returned to JavaCompiler, and codegen can happen.

So, I think that the easiest way is for ReflectMethods to create the inner class AST node, save the op there somehow. Then teach Lower to mostly ignore these nested classes (but still return them to JavaCompiler). Then code generation can take it from there...

@openjdk
Copy link

openjdk bot commented Nov 18, 2025

@asotona this pull request can not be integrated into code-reflection due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout opbuilder-optimizations
git fetch https://git.openjdk.org/babylon.git code-reflection
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge code-reflection"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Nov 18, 2025
@mcimadamore
Copy link
Collaborator

To summarize the strategy that I think would be the path of least resistance to integrate with javac more properly (doesn't have to be done in this PR):

  • ReflectMethods should create "shim" inner JCClassDef nodes for the synthetic classes and append them to the defs field of the toplevel class being visited;
  • Those classes should contain no nested "defs" AST nodes, but should instead have a moduleOp field (this is a new field we can add to JCClassDef);
  • We need to teach Lower to leave these synthetic inner classes alone -- e.g. Lower will (correctly) translate them as toplevel classes (which we want), but we don't want any other treatment -- perhaps this already is the case if the class is nested static and has no members, but we should make sure;
  • Then, when in JavaCompiler we generate code, we can look at the JCClassDef: if it has no moduleOp, we send it through Gen/ClassWriter. If it does have a moduleOp, we use BytecodeGenerator instead.

This should allow us to retain flexibility to update the definitions of the synthetic classes w/o touching javac, while at the same time making sure that bytecode is generated in the right javac phase.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Nov 18, 2025
@asotona
Copy link
Member Author

asotona commented Nov 18, 2025

I suggest to merge current PR, so @mabbay can follow up with part 2. of the optimization and I can independently follow up with the strategy proposed by @mcimadamore in the comment above.

@mabbay
Copy link
Member

mabbay commented Nov 19, 2025

I suggest a better name for ReflectMethods.classOps. Maybe opMethodDecls or codeModelsMethodDecls ?

@asotona
Copy link
Member Author

asotona commented Nov 19, 2025

Thank you for the reviews.
/integrate

@openjdk
Copy link

openjdk bot commented Nov 19, 2025

Going to push as commit 590b29e.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 19, 2025
@openjdk openjdk bot closed this Nov 19, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 19, 2025
@openjdk
Copy link

openjdk bot commented Nov 19, 2025

@asotona Pushed as commit 590b29e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

4 participants