-
Notifications
You must be signed in to change notification settings - Fork 116
[oneTBB] Add a function to create a set of NUMA bound task arenas #650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Isaev, Ilya <[email protected]>
93dbd8a to
f819fc1
Compare
Signed-off-by: Isaev, Ilya <[email protected]>
|
@aleksei-fedotov @akukanov Please take a look. |
| .. cpp:function:: std::vector<oneapi::tbb::task_arena> create_numa_task_arenas(oneapi::tbb::task_arena::constraints other_constraints = {}, unsigned reserved_slots = 0) | ||
|
|
||
| Returns a ``std::vector`` of ``task_arena`` objects, each bound to a separate NUMA node. | ||
| The number of created ``task_arena`` is equal to the number of NUMA nodes available on the system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall it be less strict and, thus, more accurate?
| The number of created ``task_arena`` is equal to the number of NUMA nodes available on the system. | |
| The number of created ``task_arena`` is equal to the number of NUMA nodes detected on the system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we require that this is the same set of nodes that is returned by tbb::info::numa_nodes()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fair to require that as well, since the intent of this API is to provide shortcut to a common usage pattern of NUMA-constrained arenas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the description
Co-authored-by: Aleksei Fedotov <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Show resolved
Hide resolved
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
Co-authored-by: Alexey Kukanov <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>
Co-authored-by: Alexey Kukanov <[email protected]>
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
| the ``task_arena`` objects. The ``numa_id`` value in ``constraints`` is automatically | ||
| replaced with a separate value for each created arena. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second sentence is not 100% clear for me.
As far as I understand, the numa_id value in constraints is ignored and the corresponding numa_id` from tbb::info::numa_nodes`` is used instead.
May be something like this will be easier to understand?
| the ``task_arena`` objects. The ``numa_id`` value in ``constraints`` is automatically | |
| replaced with a separate value for each created arena. | |
| the ``task_arena`` objects. The ``numa_id`` value specified in ``constraints`` is disregarded, and instead, | |
| the corresponding ``numa_id`` from ``tbb::info::numa_nodes()`` is used during construction | |
| of each particular ``task_arena``. |
Also, should we specify this somehow if the error occurs during the system parsing as part of create_numa_task_arenas()? Will the numa_id be ignored in this case?
I understand that if the error occurs in create_numa_task_arenas, then it is hard to imagine that there is a valid numa_id in constrains, but may be it still make sense to specify the behavior explicitly somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using another value instead of an ignored parameter and replacing that parameter with the other value is essentially the same thing. Either wording does not mandate a specific implementation. The proposed possible implementation (https://github.com/uxlfoundation/oneTBB/blob/master/rfcs/proposed/numa_support/create-numa-arenas.md#possible-implementation) uses the replacement, I would say.
I wanted to avoid words like "ignored" - especially without saying what is used instead. But I can see how this might still be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to say that the numa_id parameter for constraints will be automatically set to a certain value:
| the ``task_arena`` objects. The ``numa_id`` value in ``constraints`` is automatically | |
| replaced with a separate value for each created arena. | |
| the ``task_arena`` objects. For each created arena, the ``numa_id`` value in ``constraints`` | |
| is automatically set to the corresponding NUMA node ID from ``tbb::info::numa_nodes()``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied suggestions from here and in the comment above from the different discussion.
Co-authored-by: Konstantin Boyarinov <[email protected]> Co-authored-by: Alexey Kukanov <[email protected]>
…scovery Signed-off-by: Isaev, Ilya <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
| } | ||
| for (int i = 0; i < numa_nodes.size(); i++) { | ||
| for (int i = 0; i < arenas.size(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern demonstrates the usage, but it would be better to execute in one task_arena after enqueuing in the others. Since none of the arenas reserve a slot for a master, the main thread might wait on a fully subscribed arena, and so cannot actively participate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember, it's the specification. The examples here are intended to illustrate usage. In the developer guide, we have a better example: https://uxlfoundation.github.io/oneTBB/main/tbb_userguide/Guiding_Task_Scheduler_Execution.html#set-numa-node
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not wise to include an anti-pattern anywhere. It's easy to make this mistake and then lose some parallelism as a result. But I won't insist if others think the simplicity of the example is more important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I never viewed that as an anti-pattern.
On the other hand, the example is not only for the NUMA function but a task_arena in general, so doing execute directly into one of the arenas makes some sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't say that this is an anti-pattern either primarily because the description of execute (I am referring to execute due to wait_for inheriting its semantics) doesn't state specifically, whether joining thread can or cannot push some worker out of arena and execute the work on its behalf. It only specifies its behavior in cases when it is possible to join or not.
But perhaps for more task_arena API coverage, we could rewrite the example to something like this:
#include "oneapi/tbb/task_group.h"
#include "oneapi/tbb/task_arena.h"
#include <vector>
int main() {
std::vector<oneapi::tbb::task_arena> arenas = oneapi::tbb::create_numa_task_arenas();
std::vector<oneapi::tbb::task_group> task_groups(arenas.size());
for (int i = 1; i < arenas.size(); i++) {
arenas[i].enqueue([]{
/* executed by a thread pinned to the specified NUMA node */
}, task_groups[i]);
}
arenas[0].execute([&task_groups] {
task_groups[0].run_and_wait([] { /* parallel work */});
});
for (int i = 1; i < arenas.size(); i++) {
arenas[i].wait_for(task_groups[i]);
}
return 0;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't actually need a task_group at all for the execute, since you will call run_and_wait. Just do the parallel work directly in the body of the execute for arena[0], right? And since no work is in arena[0] it can't be fully populated already since there was no work to attract workers until the call to execute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. I just didn't want to break that 1:1 match of task_groups to task_arenas. Otherwise the example becomes the following:
#include "oneapi/tbb/task_group.h"
#include "oneapi/tbb/task_arena.h"
#include <vector>
int main() {
std::vector<oneapi::tbb::task_arena> arenas = oneapi::tbb::create_numa_task_arenas();
std::vector<oneapi::tbb::task_group> task_groups(arenas.size()-1);
for (int i = 1; i < arenas.size(); i++) {
arenas[i].enqueue([]{
/* executed by a thread pinned to the specified NUMA node */
}, task_groups[i-1]);
}
arenas[0].execute([] {
/* parallel work */
});
for (int i = 1; i < arenas.size(); i++) {
arenas[i].wait_for(task_groups[i-1]);
}
return 0;
}Or simply still create equal number of task_group and task_arena but never use first one in example:
#include "oneapi/tbb/task_group.h"
#include "oneapi/tbb/task_arena.h"
#include <vector>
int main() {
std::vector<oneapi::tbb::task_arena> arenas = oneapi::tbb::create_numa_task_arenas();
std::vector<oneapi::tbb::task_group> task_groups(arenas.size());
for (int i = 1; i < arenas.size(); i++) {
arenas[i].enqueue([]{
/* executed by a thread pinned to the specified NUMA node */
}, task_groups[i]);
}
arenas[0].execute([] {
/* parallel work */
});
for (int i = 1; i < arenas.size(); i++) {
arenas[i].wait_for(task_groups[i]);
}
return 0;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either seems good enough to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose there's no particular reason to assume that arena[0] is the one that should be executed in. So you could create 1 less arena, enqueue into the first size()-1 and then execute in the last arena. Perhaps then the first option looks slightly better not needing to mismatch the indexes when calling enqueue. But I'd also be ok with either of the suggested examples as they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Content looks good to me. Although, I would also tweak a little representational aspects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, you can't make indentation when using .. cpp::function with line breaks, so at best it will end up like this:
std::vector<task_arena> create_numa_task_arenas(
task_arena::constraints constraints = {},
int reserved_slots = 0)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reduce the clutter by using shorter names. For example, I find shortening of constraints function parameter to just c attractive, because:
- The
task_arena::constraintstype conveys the semantics. - The description from where the parameter is referred to mentions that these are constraints.
- The description of the function is short enough and scoped, so
ccould act similarly to constructs like loop counters. Meaning, that readers are unlikely to forget what thatcis about.
Similarly, the a_priority can be reduced to just p without loosing much of the readability.
This should not be overused though. For example, I would not contract the reserved_slots because its type does not convey the meaning, so the name is essential here.
source/elements/oneTBB/source/task_scheduler/task_arena/task_arena_cls.rst
Outdated
Show resolved
Hide resolved
Signed-off-by: Isaev, Ilya <[email protected]>
Signed-off-by: Isaev, Ilya <[email protected]>



Add a page that describes helper APIs to simplify constrained arena creations. Currently it is a single function
oneapi::tbb::create_numa_task_arenasfrom corresponding RFC.