Commit b3a6144

tiny display fixes [ci skip]
1 parent 92fe327 commit b3a6144

1 file changed: +57 -45 lines changed

docs/modules/kokkos/pages/advanced-concepts/advanced-reductions.adoc

Lines changed: 57 additions & 45 deletions
@@ -3,7 +3,7 @@
 == Introduction
 
 [.text-justify]
-In Kokkos C++, a reduction is a parallel operation that combines the results of individual calculations into a single final value. [1][2] This mechanism, primarily implemented through the Kokkos::parallel_reduce function, offers a powerful paradigm for consolidating data distributed across different processing units. The concept of a "Reducer" in Kokkos encapsulates the logic of combining intermediate values, defining not only the merging operation but also the initialization of thread-private variables and the localization of the final result.
+In Kokkos C++, a reduction is a parallel operation that combines the results of individual calculations into a single final value. [1][2] This mechanism, primarily implemented through the `Kokkos::parallel_reduce` function, offers a powerful paradigm for consolidating data distributed across different processing units. The concept of a "Reducer" in Kokkos encapsulates the logic of combining intermediate values, defining not only the merging operation but also the initialization of thread-private variables and the localization of the final result.
 
 [.text-justify]
 Kokkos allows for multiple reductions to be performed within a single kernel, which can significantly reduce kernel launch overhead and improve overall performance. It also offers the ability to use Views as reduction targets, enabling asynchronous reduction operations. This capability is particularly valuable in scenarios where the reduction result is needed for further computation or when overlapping computation and communication.
@@ -24,74 +24,86 @@ Kokkos offers various built-in reducers for common operations:
 ** `Kokkos::Prod` for product
 ** `Kokkos::Min` and `Kokkos::Max` for minimum and maximum
 
-Example:
-
+.Sum with Kokkos
+[%collapsible.proof]
+====
 [source,c++]
 ----
-double result;
-Kokkos::parallel_reduce("Sum", policy,
-KOKKOS_LAMBDA (const int i, double& lsum) {
-lsum += data[i];
-}, Kokkos::Sum<double>(result));
+double result;
+Kokkos::parallel_reduce("Sum", policy,
+KOKKOS_LAMBDA (const int i, double& lsum) {
+lsum += data[i];
+}, Kokkos::Sum<double>(result));
 ----
-
+====
 ** Multiple Reductions in One Kernel
 
 Kokkos allows performing multiple reductions simultaneously:
 
+.Multiple Reductions
+[%collapsible.proof]
+====
 [source,c++]
 ----
-struct MultipleResults {
-double sum;
-int max;
-};
-
-MultipleResults results;
-Kokkos::parallel_reduce("MultiReduce", policy,
-KOKKOS_LAMBDA (const int i, MultipleResults& lresults) {
-lresults.sum += data[i];
-if (data[i] > lresults.max) lresults.max = data[i];
-},
-Kokkos::Sum<MultipleResults>(results));
+struct MultipleResults {
+double sum;
+int max;
+};
+
+MultipleResults results;
+Kokkos::parallel_reduce("MultiReduce", policy,
+KOKKOS_LAMBDA (const int i, MultipleResults& lresults) {
+lresults.sum += data[i];
+if (data[i] > lresults.max) lresults.max = data[i];
+},
+Kokkos::Sum<MultipleResults>(results));
 ----
+====
 
-** Using Kokkos::View as Result for Asynchronicity
+** Using `Kokkos::View` as Result for Asynchronicity
 
-For asynchronous operations, you can use `Kokkos::View` as the reduction target:
+For asynchronous operations, you can use xref:basic-concepts/views.adoc[Views] as the reduction target:
 
+.Async Reduction
+[%collapsible.proof]
+====
 [source,c++]
 ----
-Kokkos::View<double*> result("Result", 1);
-Kokkos::parallel_reduce("AsyncReduce", policy,
-KOKKOS_LAMBDA (const int i, double& lsum) {
-lsum += data[i];
-}, Kokkos::Sum<double>(result(0)));
+Kokkos::View<double*> result("Result", 1);
+Kokkos::parallel_reduce("AsyncReduce", policy,
+KOKKOS_LAMBDA (const int i, double& lsum) {
+lsum += data[i];
+}, Kokkos::Sum<double>(result(0)));
 ----
+====
 
 This allows the reduction to be performed asynchronously, with the result available in the view.
 
-** Custom Reductions
-
+** Custom Reductions:
 Kokkos supports custom reduction operations:
 
+.Custom Reduction
+[%collapsible.proof]
+====
 [source,c++]
 ----
-struct CustomReducer {
-typedef double value_type;
-KOKKOS_INLINE_FUNCTION void join(value_type& dest, const value_type& src) const {
-dest = (dest > src) ? dest : src; // Custom max operation
-}
-KOKKOS_INLINE_FUNCTION void init(value_type& val) const {
-val = std::numeric_limits<double>::lowest();
-}
-};
-
-double result;
-Kokkos::parallel_reduce("CustomReduce", policy,
-KOKKOS_LAMBDA (const int i, double& lval) {
-lval = (lval > data[i]) ? lval : data[i];
-}, CustomReducer());
+struct CustomReducer {
+typedef double value_type;
+KOKKOS_INLINE_FUNCTION void join(value_type& dest, const value_type& src) const {
+dest = (dest > src) ? dest : src; // Custom max operation
+}
+KOKKOS_INLINE_FUNCTION void init(value_type& val) const {
+val = std::numeric_limits<double>::lowest();
+}
+};
+
+double result;
+Kokkos::parallel_reduce("CustomReduce", policy,
+KOKKOS_LAMBDA (const int i, double& lval) {
+lval = (lval > data[i]) ? lval : data[i];
+}, CustomReducer());
 ----
+====
 
 
 == References
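
The `MultipleResults` and `CustomReducer` snippets in this diff both lean on the init/join contract described in the introduction: each thread-private partial starts from an identity value, is updated per element, and the partials are then merged. The following is a minimal plain C++ sketch of that contract, deliberately written without Kokkos so it compiles on its own. `SumMax`, `SumMaxReducer`, `chunked_reduce`, and `sum_and_max` are hypothetical names introduced here, and the chunk loop is only a serial stand-in for parallel execution, not a description of how Kokkos schedules work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical value type: a sum and a max carried through one reduction.
struct SumMax {
  double sum;
  double max;
};

// Hypothetical reducer with the same init/join shape as CustomReducer in the diff.
struct SumMaxReducer {
  using value_type = SumMax;

  // Identity element: every private partial starts from here.
  void init(value_type& val) const {
    val.sum = 0.0;
    val.max = std::numeric_limits<double>::lowest();
  }

  // Merge one partial into another: sums add, maxima compare.
  void join(value_type& dest, const value_type& src) const {
    dest.sum += src.sum;
    dest.max = std::max(dest.max, src.max);
  }
};

// Serial stand-in for a parallel reduction (assumption: not Kokkos scheduling).
// Indices [0, n) are split into num_chunks ranges, each with its own partial.
template <class Reducer, class Body>
typename Reducer::value_type chunked_reduce(std::size_t n, std::size_t num_chunks,
                                            const Reducer& reducer, const Body& body) {
  assert(num_chunks > 0);
  const std::size_t chunk = (n + num_chunks - 1) / num_chunks;

  typename Reducer::value_type result;
  reducer.init(result);
  for (std::size_t c = 0; c < num_chunks; ++c) {
    typename Reducer::value_type partial;
    reducer.init(partial);  // fresh identity for each chunk
    const std::size_t begin = std::min(c * chunk, n);
    const std::size_t end = std::min(begin + chunk, n);
    for (std::size_t i = begin; i < end; ++i) body(i, partial);
    reducer.join(result, partial);  // merge the chunk's partial into the total
  }
  return result;
}

// Example use: sum and max of a vector in a single pass.
inline SumMax sum_and_max(const std::vector<double>& data, std::size_t num_chunks) {
  return chunked_reduce(data.size(), num_chunks, SumMaxReducer{},
                        [&data](std::size_t i, SumMax& local) {
                          local.sum += data[i];
                          local.max = std::max(local.max, data[i]);
                        });
}
```

Calling `sum_and_max` with different chunk counts should give the same answer, which is the property a reducer's `join` has to guarantee. Whether `Kokkos::Sum<MultipleResults>` in the diff produces the intended maximum depends on operators for `MultipleResults` that the diff does not show.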
