8357551: RISC-V: support CMoveF/D #25341

Hamlin-Li · 2025-05-21T06:10:09Z

This patch enables vectorization of statements of the form fd_1 bop fd_2 ? res_1 : res_2 inside a loop.

On other platforms, fd_1 bop fd_2 ? res_1 : res_2 in a loop is currently vectorized only when fd and res have the same size. That constraint does not seem necessary, at least not on riscv, so I relax it there. It may be possible to relax it on other platforms too, but for now I have only made it work on riscv.
In addition, I relax the constraint on transforming Op_CMoveI/L to Op_VectorBlend on riscv, which brings extra benefit when res is not a float or double.
Both relaxations improve performance through vectorization.
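
For illustration, the loop shape described above might look like the following Java sketch. The class name and method bodies are hypothetical (the method names only echo the benchmark names in the table below), and whether C2 actually emits CMove nodes or vectorizes these loops depends on the platform and on flags such as -XX:+UseVectorCmov and -XX:+UseCMoveUnconditionally. The point is only that the compared operands and the selected result have different element sizes.

```java
import java.util.Arrays;

// Hypothetical sketch of "fd_1 bop fd_2 ? res_1 : res_2" in a loop.
final class CMoveLoopSketch {

    // Compares 4-byte floats, selects an 8-byte long: operand and result sizes differ.
    static void lessFloatResLong(float[] a, float[] b, long[] res, long r1, long r2) {
        for (int i = 0; i < res.length; i++) {
            res[i] = (a[i] < b[i]) ? r1 : r2;
        }
    }

    // Compares 8-byte doubles, selects a 4-byte float: sizes differ the other way.
    static void greaterDoubleResFloat(double[] a, double[] b, float[] res, float r1, float r2) {
        for (int i = 0; i < res.length; i++) {
            res[i] = (a[i] > b[i]) ? r1 : r2;
        }
    }

    public static void main(String[] args) {
        float[] a = {1.0f, 5.0f, Float.NaN, 2.0f};
        float[] b = {2.0f, 3.0f, 1.0f, 2.0f};
        long[] res = new long[a.length];
        lessFloatResLong(a, b, res, 10L, 20L);
        // Any comparison involving NaN is false, so the third element selects r2.
        System.out.println(Arrays.toString(res)); // prints [10, 20, 20, 20]
    }
}
```

Note that the NaN case must still select the second result after vectorization, which is one reason such loops are worth covering in tests.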

Compared with the other runs (master, master with -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally, and the patch without these flags), the average improvement from the patch with -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally is more than 2.1 times, and in some cases more than 4 times.
With -XX:-UseVectorCmov -XX:-UseCMoveUnconditionally (both flags turned off), there is no regression on average.

Test

Performance

Test: o.o.b.j.l.FPComparison

Column name meanings:

p: with patch
p+v: with patch, -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally turned on
m: without patch (master)
m+v: without patch (master), -XX:+UseVectorCmov -XX:+UseCMoveUnconditionally turned on

Average improvement

Opt (m/p)	Opt (m+v/p+v)	Opt (p/p+v)	Opt (m/p+v)
1.022782609	2.198717391	2.162673913	2.199

Improvement

Benchmark	p	m	p+v	m+v	Opt (m/p)	Opt (m+v/p+v)	Opt (p/p+v)	Opt (m/p+v)
equalDouble	7256.157	7183.714	3377.111	7459.347	0.99	2.209	2.149	2.127
equalDoubleResDouble	7877.737	8646.54	6077.6	8691.099	1.098	1.43	1.296	1.423
equalDoubleResFloat	7181.564	8194.786	3409.252	8123.738	1.141	2.383	2.106	2.404
equalDoubleResLong	7806.422	8010.545	3335.97	7922.735	1.026	2.375	2.34	2.401
equalFloat	6802.995	6901.461	1789.033	7012.751	1.014	3.92	3.803	3.858
equalFloatResDouble	8371.707	8265.009	3431.889	8275.083	0.987	2.411	2.439	2.408
equalFloatResFloat	7148.96	8156.945	3233.043	8098.961	1.141	2.505	2.211	2.523
equalFloatResLong	7853.929	8003.017	3401.985	8097.994	1.019	2.38	2.309	2.352
greaterDouble	6941.015	6894.978	3416.193	6934.395	0.993	2.03	2.032	2.018
greaterDoubleResDouble	7882.554	7821.291	6124.731	7812.596	0.992	1.276	1.287	1.277
greaterDoubleResFloat	7358.43	7375.28	3411.382	7355.785	1.002	2.156	2.157	2.162
greaterDoubleResLong	7225.83	7165.23	3331.277	7373.934	0.992	2.214	2.169	2.151
greaterEqualDouble	6767.552	6737.533	3414.404	6720.414	0.996	1.968	1.982	1.973
greaterEqualDoubleResDouble	7255.272	8050.17	6074.58	8014.26	1.11	1.319	1.194	1.325
greaterEqualDoubleResFloat	6810.635	7588.857	3412.366	7724.462	1.114	2.264	1.996	2.224
greaterEqualDoubleResLong	7356.979	7273.975	3405.726	7202.324	0.989	2.115	2.16	2.136
greaterEqualFloat	6301.524	6250.825	1725.419	6190.227	0.992	3.588	3.652	3.623
greaterEqualFloatResDouble	7770.324	7619.463	3515.615	7652.038	0.981	2.177	2.21	2.167
greaterEqualFloatResFloat	6539.097	7433.364	3237.981	7459.479	1.137	2.304	2.019	2.296
greaterEqualFloatResLong	7282.165	7285.625	3408.542	7272.183	1	2.134	2.136	2.137
greaterFloat	6741.444	6775.978	1777.942	6609.607	1.005	3.718	3.792	3.811
greaterFloatResDouble	7376.615	7386.81	3451.468	7413.341	1.001	2.148	2.137	2.14
greaterFloatResFloat	7260.812	7227.177	3233.878	7194.408	0.995	2.225	2.245	2.235
greaterFloatResLong	7156.073	7218.269	3483.248	7395.894	1.009	2.123	2.054	2.072
isFiniteDouble	8383.339	8486.119	8520.461	8805.231	1.012	1.033	0.984	0.996
isFiniteFloat	8327.357	8469.08	8438.468	8458.09	1.017	1.002	0.987	1.004
isInfiniteDouble	8731.787	8403.307	8797.517	8559.53	0.962	0.973	0.993	0.955
isInfiniteFloat	8402.357	8311.963	8408.47	8445.983	0.989	1.004	0.999	0.989
isNanDouble	5603.906	6339.909	2708.193	5619.242	1.131	2.075	2.069	2.341
isNanFloat	6149.923	5421.851	1412.968	5415.815	0.882	3.833	4.352	3.837
lessDouble	6879.061	6891.171	3380.181	6881.82	1.002	2.036	2.035	2.039
lessDoubleResDouble	7809.712	7799.506	6116.715	7802.105	0.999	1.276	1.277	1.275
lessDoubleResFloat	7350.426	7379.593	3371.683	7349.37	1.004	2.18	2.18	2.189
lessDoubleResLong	7220.939	7160.987	3395.771	7572.061	0.992	2.23	2.126	2.109
lessEqualDouble	6782.899	6728.732	3431.742	6755.882	0.992	1.969	1.977	1.961
lessEqualDoubleResDouble	7147.814	8055.307	6075.177	7989.099	1.127	1.315	1.177	1.326
lessEqualDoubleResFloat	6915.612	7589.454	3412.457	7671.782	1.097	2.248	2.027	2.224
lessEqualDoubleResLong	7266.967	7214.049	3391.35	7222.03	0.993	2.13	2.143	2.127
lessEqualFloat	6240.432	6291.458	1768.777	6216.421	1.008	3.515	3.528	3.557
lessEqualFloatResDouble	7706.662	7725.626	3498.608	7677.536	1.002	2.194	2.203	2.208
lessEqualFloatResFloat	6592.504	7497.226	3214.976	7420.118	1.137	2.308	2.051	2.332
lessEqualFloatResLong	7256.94	7218.381	3393.99	7228.696	0.995	2.13	2.138	2.127
lessFloat	6766.048	6725.079	1733.222	6621.539	0.994	3.82	3.904	3.88
lessFloatResDouble	7397.894	7400.036	3402.64	7363.842	1	2.164	2.174	2.175
lessFloatResFloat	7242.137	7191.374	3240.398	7259.417	0.993	2.24	2.235	2.219
lessFloatResLong	7202.009	7172.072	3514.138	7357.007	0.996	2.094	2.049	2.041
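
As a reading aid, the Opt columns appear to be plain ratios of the raw scores, with the baseline in the numerator. The sketch below recomputes them for the equalFloat row. It assumes that the p, m, p+v and m+v columns are scores where lower is better (the unit is not stated here) and that the table rounds to three decimals; the class and method names are hypothetical.

```java
// Hypothetical helper that recomputes the "Opt" ratios for one table row.
final class SpeedupSketch {

    // baseline / candidate, rounded to three decimals like the table.
    static double ratio(double baseline, double candidate) {
        return Math.round(baseline / candidate * 1000.0) / 1000.0;
    }

    public static void main(String[] args) {
        // Raw scores copied from the equalFloat row.
        double p = 6802.995, m = 6901.461, pv = 1789.033, mv = 7012.751;

        System.out.println("Opt (m/p)     = " + ratio(m, p));   // 1.014
        System.out.println("Opt (m+v/p+v) = " + ratio(mv, pv)); // 3.92
        System.out.println("Opt (p/p+v)   = " + ratio(p, pv));  // 3.803
        System.out.println("Opt (m/p+v)   = " + ratio(m, pv));  // 3.858
    }
}
```

A ratio above 1 means the candidate configuration scored lower than the baseline, so m/p close to 1 indicates that the patch alone, without the flags, changes little.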

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Warnings

⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/BorderLayout-1.gif)
⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/FlowLayout-1.gif)
⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/GridBagLayout-1.gif)
⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/GridBagLayout-2.gif)
⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/GridLayout-1.gif)
⚠️ Patch contains a binary file (src/java.desktop/share/classes/java/awt/doc-files/GridLayout-2.gif)

Issues

JDK-8357551: RISC-V: support CMoveF/D (Enhancement - P4)
JDK-8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25341/head:pull/25341
$ git checkout pull/25341

Update a local copy of the PR:
$ git checkout pull/25341
$ git pull https://git.openjdk.org/jdk.git pull/25341/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25341

View PR using the GUI difftool:
$ git pr show -t 25341

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25341.diff

bridgekeeper · 2025-05-21T06:10:53Z

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-05-21T06:11:50Z

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk · 2025-05-21T06:12:21Z

@Hamlin-Li The following labels will be automatically applied to this pull request:

core-libs
hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

Hamlin-Li · 2025-05-22T10:14:59Z

/solves JDK-8357554

openjdk · 2025-05-22T10:16:01Z

@Hamlin-Li
Adding additional issue to solves list: 8357554: Enable vectorization of Bool -> CMove with different type size (on riscv).

Hamlin-Li · 2025-05-22T11:09:11Z

Hi @eme64, would you mind taking a look at the patch? Thanks!
To make this work I need to change some shared code in superword and vectornode (tracked by JDK-8357554 but addressed in this PR; I can also separate it from this PR if you prefer). I also noticed that you have an umbrella bug (https://bugs.openjdk.org/browse/JDK-8317424) tracking the related changes.

eme64 · 2025-05-22T11:11:41Z

@Hamlin-Li The table in the PR description is a little hard to read, can you find a way to improve the formatting?

eme64

@Hamlin-Li That looks like exciting work!

I hope to come back to CMove soon myself, there are a few things to improve there!

I left some initial comments below.

Generally, splitting is nice, especially if the patch is so large.
But we probably also don't want to just add backend instructions that are not yet used. Can the VectorAPI use those CMove instructions you are about to add?

eme64 · 2025-05-22T11:16:51Z

src/hotspot/share/opto/superword.cpp

-  return type2aelembytes(use_bt) == type2aelembytes(def_bt);
+  return (type2aelembytes(use_bt) == type2aelembytes(def_bt)) ||
+         (support_vectorize_cmovefd_bool_unconditionally() && use->is_CMove() && def->is_Bool());


This will get us a merge conflict with #23413.

Also: our general approach is to ask VectorNode::implemented and the like. Could we have some sort of check like that?

See what @jaskarth is doing in #23413:

Is that at all an option?

I think it also works. I'll change it.

eme64 · 2025-05-22T11:19:26Z

src/hotspot/share/opto/vectornode.cpp

+  case Op_CMoveI:
+    return ((SuperWord::support_vectorize_cmovefd_bool_unconditionally() && bt == T_INT) ? Op_VectorBlend : 0);
+  case Op_CMoveL:
+    return ((SuperWord::support_vectorize_cmovefd_bool_unconditionally() && bt == T_LONG) ? Op_VectorBlend : 0);


Why not always return Op_VectorBlend? And then check elsewhere if that is actually implemented for the expected types?

Let me check, sounds better if we can do so, as I don't like the current way either. : )

Hamlin-Li · 2025-05-22T12:34:36Z

> @Hamlin-Li The table in the PR description is a little hard to read, can you find a way to improve the formatting?

Let me check, it's scrollable in preview mode.

Edit: I modified the data a bit, so it looks better now.

Hamlin-Li · 2025-05-22T13:35:33Z

> @Hamlin-Li That looks like exciting work!
>
> I hope to come back to CMove soon myself, there are a few things to improve there!

Great!

> I left some initial comments below.
>
> Generally, splitting is nice, especially if the patch is so large.

I can do it.

> But we probably also don't want to just add backend instructions that are not yet used.

These instructions can be used by an ordinary a op b ? r1 : r2 statement, and TestVectorConditionalMove.java can be used to test them before the loop is vectorized. This also means that a CPU without vector instructions, e.g. a riscv machine without rvv support, can use these CMoveF/D instructions.

Edit: I might have missed the unsigned version and maybe P/N too. I'll check it later.

> Can the VectorAPI use those CMove instructions you are about to add?

For this part, I'm not sure, but I'll have a look later.

Hamlin-Li · 2025-05-22T20:13:45Z

@eme64 I've split the shared code changes out into #25336.

Hamlin-Li · 2025-05-23T14:03:05Z

> Edit: I might have missed the unsigned version and maybe P/N too. I'll check it later.

I think the unsigned ones are also used by something like Integer.compareUnsigned or Long.compareUnsigned.
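
A small Java sketch of the kind of source that could lead to an unsigned compare feeding a conditional select. Integer.compareUnsigned and Long.compareUnsigned are standard library methods; the class and method names are hypothetical, and whether the JIT turns this into an unsigned conditional move on a given platform is an assumption, not something this example verifies.

```java
// Hypothetical sketch: an unsigned comparison selecting one of two results.
final class UnsignedSelectSketch {

    // Selects r1 when a is below b under unsigned 32-bit ordering, otherwise r2.
    static int selectUnsignedLess(int a, int b, int r1, int r2) {
        return Integer.compareUnsigned(a, b) < 0 ? r1 : r2;
    }

    // Same shape for 64-bit operands.
    static long selectUnsignedLess(long a, long b, long r1, long r2) {
        return Long.compareUnsigned(a, b) < 0 ? r1 : r2;
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF when read as unsigned, so it is not below 1.
        System.out.println(selectUnsignedLess(-1, 1, 100, 200)); // prints 200
        // The signed comparison gives the opposite answer.
        System.out.println(-1 < 1 ? 100 : 200);                  // prints 100
    }
}
```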

openjdk · 2025-06-03T14:14:22Z

@Hamlin-Li this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout cmove-fd
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

hamlin added 6 commits April 24, 2025 17:04

Initial commit

3a870c9

Merge branch 'master' into cmove-fd

42a7ec6

tests

3dddfb0

enable CMove+Bool as type_compatible_use_def

b618f54

support transfer cmove i/l to vector blend

6e859d1

disable CMoveI/L => VectorBlend

9025547

openjdk bot added the hotspot and core-libs labels May 21, 2025

hamlin added 3 commits May 21, 2025 13:13

enable

de2cae9

Merge branch 'master' into cmove-fd

bdf0e3a

enable only on riscv

d1f94dd

Hamlin-Li changed the title from "Cmove fd" to "8357551: RISC-V: support CMoveF/D" May 22, 2025

Hamlin-Li mentioned this pull request May 22, 2025

8357554: Enable vectorization of Bool -> CMove with different type size (on riscv) #25336

Open

3 tasks

eme64 reviewed May 22, 2025

hamlin added 2 commits May 22, 2025 13:46

refactor is_velt_basic_type_compatible_use_def

9e8a7df

refactoring

dafeb5e

tests

1bec004

just for backup

60839da

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jun 3, 2025
