-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OpenMP] Enable simd in non-reduction composite constructs #146097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs. Signed-off-by: Kajetan Puchalski <[email protected]>
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-llvm Author: Kajetan Puchalski (mrkajetanp) ChangesDespite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs. Full diff: https://github.com/llvm/llvm-project/pull/146097.diff 4 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 23140f22555a5..de51ca49649f9 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2847,11 +2847,8 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
auto simdOp = cast<omp::SimdOp>(opInst);
- // TODO: Replace this with proper composite translation support.
- // Currently, simd information on composite constructs is ignored, so e.g.
- // 'do/for simd' will be treated the same as a standalone 'do/for'. This is
- // allowed by the spec, since it's equivalent to using a SIMD length of 1.
- if (simdOp.isComposite()) {
+ // TODO: Replace this once simd + reduction is properly supported
+ if (simdOp.isComposite() && simdOp.getReductionByref().has_value()) {
if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
return failure();
diff --git a/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..9ea0ff590144f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
@@ -0,0 +1,30 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+llvm.func @test_parallel_do_simd() {
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1000 : i32) : i32
+ %3 = llvm.mlir.constant(1 : i32) : i32
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ omp.parallel {
+ %5 = llvm.mlir.constant(1 : i64) : i64
+ %6 = llvm.alloca %5 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+ %7 = llvm.mlir.constant(1 : i64) : i64
+ omp.wsloop {
+ omp.simd {
+ omp.loop_nest (%arg0) : i32 = (%3) to (%2) inclusive step (%3) {
+ llvm.store %arg0, %6 : i32, !llvm.ptr
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ }
+ llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..108a99cf7c40f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
@@ -0,0 +1,33 @@
+// RUN: mlir-translate --mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_teams_distribute_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+llvm.func @test_teams_distribute_parallel_do_simd() {
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1000 : i32) : i32
+ %3 = llvm.mlir.constant(1 : i32) : i32
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ omp.teams {
+ omp.parallel {
+ omp.distribute {
+ omp.wsloop {
+ omp.simd private(@_QFEi_private_i32 %1 -> %arg0 : !llvm.ptr) {
+ omp.loop_nest (%arg1) : i32 = (%3) to (%2) inclusive step (%3) {
+ llvm.store %arg1, %arg0 : i32, !llvm.ptr
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ } {omp.composite}
+ omp.terminator
+ }
+ llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 97608ca3b4df1..c515e6e70b264 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -26,20 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) {
// -----
-llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
- omp.wsloop {
- // expected-warning@below {{simd information on composite construct discarded}}
- omp.simd {
- omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
- omp.yield
- }
- } {omp.composite}
- } {omp.composite}
- llvm.return
-}
-
-// -----
-
llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
// expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
// expected-error@below {{LLVM Translation failed for operation: omp.distribute}}
|
@llvm/pr-subscribers-flang-openmp Author: Kajetan Puchalski (mrkajetanp) ChangesDespite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs. Full diff: https://github.com/llvm/llvm-project/pull/146097.diff 4 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 23140f22555a5..de51ca49649f9 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2847,11 +2847,8 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
auto simdOp = cast<omp::SimdOp>(opInst);
- // TODO: Replace this with proper composite translation support.
- // Currently, simd information on composite constructs is ignored, so e.g.
- // 'do/for simd' will be treated the same as a standalone 'do/for'. This is
- // allowed by the spec, since it's equivalent to using a SIMD length of 1.
- if (simdOp.isComposite()) {
+ // TODO: Replace this once simd + reduction is properly supported
+ if (simdOp.isComposite() && simdOp.getReductionByref().has_value()) {
if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
return failure();
diff --git a/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..9ea0ff590144f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
@@ -0,0 +1,30 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+llvm.func @test_parallel_do_simd() {
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1000 : i32) : i32
+ %3 = llvm.mlir.constant(1 : i32) : i32
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ omp.parallel {
+ %5 = llvm.mlir.constant(1 : i64) : i64
+ %6 = llvm.alloca %5 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+ %7 = llvm.mlir.constant(1 : i64) : i64
+ omp.wsloop {
+ omp.simd {
+ omp.loop_nest (%arg0) : i32 = (%3) to (%2) inclusive step (%3) {
+ llvm.store %arg0, %6 : i32, !llvm.ptr
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ }
+ llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..108a99cf7c40f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
@@ -0,0 +1,33 @@
+// RUN: mlir-translate --mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_teams_distribute_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+llvm.func @test_teams_distribute_parallel_do_simd() {
+ %0 = llvm.mlir.constant(1 : i64) : i64
+ %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+ %2 = llvm.mlir.constant(1000 : i32) : i32
+ %3 = llvm.mlir.constant(1 : i32) : i32
+ %4 = llvm.mlir.constant(1 : i64) : i64
+ omp.teams {
+ omp.parallel {
+ omp.distribute {
+ omp.wsloop {
+ omp.simd private(@_QFEi_private_i32 %1 -> %arg0 : !llvm.ptr) {
+ omp.loop_nest (%arg1) : i32 = (%3) to (%2) inclusive step (%3) {
+ llvm.store %arg1, %arg0 : i32, !llvm.ptr
+ omp.yield
+ }
+ } {omp.composite}
+ } {omp.composite}
+ } {omp.composite}
+ omp.terminator
+ } {omp.composite}
+ omp.terminator
+ }
+ llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 97608ca3b4df1..c515e6e70b264 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -26,20 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) {
// -----
-llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
- omp.wsloop {
- // expected-warning@below {{simd information on composite construct discarded}}
- omp.simd {
- omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
- omp.yield
- }
- } {omp.composite}
- } {omp.composite}
- llvm.return
-}
-
-// -----
-
llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
// expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
// expected-error@below {{LLVM Translation failed for operation: omp.distribute}}
|
Question for reviewers who have worked on this before - the current behaviour is fairly inconsistent, that is to say, using "simd reduction" on its own results in a crash because it's not implemented, whereas using "simd reduction" as part of a composite construct simply ignores the simd. Is it desirable to keep this behaviour? For instance, the following program will crash:
While this one will only emit a warning.
There are several tests that rely on this, which is why I kept the current warning for composite constructs with reduction. Additionally, if someone knows of any scenarios which I've overlooked that this might break for, please point them out. I was expecting something else to require fixing in order to do this, but I've not found anything that breaks so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please could you elaborate on the testing you've done to be sure this is correctly supported.
I ran both gfortran and Fujitsu test suites (alongside the LLVM one) to make sure this does not introduce regressions in those. For a number of test Fortran programs using composite constructs with simd, I manually inspected the IR and the final binary to make sure that with this change, vectorisable instructions inside these constructs turn into vectorised instructions even with |
Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.