Features
Port the algorithms in Conduit Blueprint Mesh (as many as make sense). Start with the conduit_blueprint_mesh transforms/algorithms: the to_*** methods for coordsets and topologies are a good place to start, followed by the generate_*** methods.
Infrastructure to control where we run at a high level, so that convert, transform, etc. give users a uniform way to ask to run on the device.
Make sure CUDA and HIP are on par with one another (CUDA is likely behind)
Parallel acceleration of data accessor/array bulk set/fill using foralls
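As a rough illustration of the "uniform way to ask to run on device" and "bulk fill using foralls" items above, the following is a minimal sketch and not Conduit's actual API. The names `serial_policy`, `device_policy`, `forall`, `bulk_fill`, `ExecTarget`, and `bulk_fill_on` are all hypothetical, and the device path is a host-side placeholder so the sketch compiles without CUDA or HIP.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy tags; these are NOT Conduit's real execution policies.
struct serial_policy {};
struct device_policy {};  // stand-in: a real build would launch a CUDA/HIP kernel

// Serial forall: invoke body(i) for every i in [0, n).
template <typename Body>
void forall(serial_policy, std::size_t n, Body body)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        body(i);
    }
}

// Placeholder device forall. It falls back to the host loop so that this
// sketch builds anywhere; it does not actually run on a device.
template <typename Body>
void forall(device_policy, std::size_t n, Body body)
{
    forall(serial_policy{}, n, body);
}

// Bulk fill written once in terms of forall, so the policy decides where it runs.
template <typename Policy, typename T>
void bulk_fill(Policy policy, T *data, std::size_t n, T value)
{
    assert(data != nullptr || n == 0);
    forall(policy, n, [=](std::size_t i) { data[i] = value; });
}

// Hypothetical runtime switch: one entry point, with the caller asking for
// host or device execution.
enum class ExecTarget { Host, Device };

template <typename T>
void bulk_fill_on(ExecTarget target, T *data, std::size_t n, T value)
{
    if (target == ExecTarget::Device)
    {
        bulk_fill(device_policy{}, data, n, value);
    }
    else
    {
        bulk_fill(serial_policy{}, data, n, value);
    }
}
```

A caller would write something like `bulk_fill_on(ExecTarget::Device, ptr, n, 0.0)`. The point of the design is that the algorithm body is written once and only the policy changes.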
Testing
Performance tests for ported algorithms. Tests for these algorithms will be oriented more towards performance, since we already have many correctness tests. We want to test the full matrix of options for running on host/device, with source and destination data each starting on host or device.
Test with both CUDA and HIP
Add node-backed tests for data accessor and data array
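One way to read "the full matrix of options" above is as every combination of source location, destination location, and execution location. The sketch below enumerates that matrix under this assumption. `Location`, `PerfCase`, `build_perf_matrix`, and `label` are hypothetical names for illustration, not part of Conduit's test suite.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of where data lives or where code runs.
enum class Location { Host, Device };

// One performance-test configuration.
struct PerfCase
{
    Location src;   // where the source data starts
    Location dst;   // where the destination data starts
    Location exec;  // where the algorithm is asked to run
};

inline const char *name(Location loc)
{
    return loc == Location::Host ? "host" : "device";
}

// Enumerate all src x dst x exec combinations.
std::vector<PerfCase> build_perf_matrix()
{
    const Location all[] = {Location::Host, Location::Device};
    std::vector<PerfCase> cases;
    for (Location src : all)
    {
        for (Location dst : all)
        {
            for (Location exec : all)
            {
                cases.push_back({src, dst, exec});
            }
        }
    }
    assert(cases.size() == 8);  // 2 x 2 x 2
    return cases;
}

// Human-readable label, e.g. for naming a timing region per case.
std::string label(const PerfCase &c)
{
    return std::string("src=") + name(c.src) +
           ",dst=" + name(c.dst) +
           ",exec=" + name(c.exec);
}
```

Each `PerfCase` could then drive one timed run of a ported algorithm, with the label used to tell the runs apart when comparing results.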
Performance
Incorporate Caliper for measuring performance of algorithms
Add Hatchet/Thicket support so that we can compare runtimes
Discover how to profile compilation
Documentation
Write documentation for our Device Support paradigm
Include strawman examples in the documentation to demonstrate use
For any objects that are mixed host/device, our docs should make clear what is host/device and what is host only.
Document how to ask algorithms to execute on the device
Investigations
Investigate whether RDC (relocatable device code) will allow us to leave implementations in source files rather than moving them to headers, as we have done in Conduit Device Support #1358 (DataArray, DataAccessor, DataType)
It is possible to create a "parallel" exec policy for which is_parallel_policy() returns false. We should examine in the future whether we want to enforce consistency in the policies.
init_device_memory_handlers() should be a static function that people only have to call once. We should have a conduit utility that does this: it should check whether the handlers are installed, install them if not, and otherwise do nothing.
Someday we want the allocator to make sense for nodes when we are done with them (maybe @cyrush can explain what this means)
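The "install the handlers once, otherwise do nothing" behavior described for init_device_memory_handlers() could be sketched roughly as follows. This is only an illustration of the idempotent-initialization pattern: `install_handlers_stub`, `ensure_device_memory_handlers`, and `g_install_count` are hypothetical, and the stub stands in for whatever the real handler installation does.

```cpp
#include <atomic>

namespace sketch
{

// Counts how often the stand-in installer actually ran (for illustration only).
std::atomic<int> g_install_count{0};

// Stand-in for the real handler installation; the real work is not shown here.
void install_handlers_stub()
{
    ++g_install_count;
}

// Hypothetical utility: safe to call from anywhere, any number of times.
// A function-local static is initialized exactly once, and that initialization
// is thread-safe since C++11, so the installer runs only on the first call.
void ensure_device_memory_handlers()
{
    static const bool installed = (install_handlers_stub(), true);
    (void)installed;
}

}  // namespace sketch
```

With a utility like this, algorithms could call `sketch::ensure_device_memory_handlers()` at their entry points without users having to remember to do the setup themselves.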