Conduit Device Support Ongoing Development #1614

@JustinPrivitera

High-Level Goals

  • Develop a device execution strategy for Conduit
  • Port Conduit's algorithms to be able to run on device
    • Our focus will be Blueprint transforms to start
    • Performance analysis will be crucial
  • Develop tests
    • For our execution model
    • For performance of ported algorithms
  • Write comprehensive documentation for Device Support
    • Including code comments where they make sense.

Features

  • Implement the strawman described in "use cases + strawman interface to raja based host device exec interface" #1151 (implemented in Conduit Device Support #1358)
  • Port the algorithms in Conduit Blueprint Mesh (as many as make sense). Start with the conduit_blueprint_mesh transforms/algorithms: the to_*** methods for coordsets and topos are a good place to start, then the generate_*** methods.
  • Infrastructure to control where we run at a high level, so that convert, transform, etc. give users a uniform way to ask to run on device.
  • Make sure CUDA and HIP are on par with one another (CUDA is likely behind)
  • Parallel acceleration of data accessor/data array bulk set/fill using foralls
  • Implement device sort using RAJA
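
To make the bulk set/fill item concrete, here is a minimal host-only sketch of a strided fill written as a forall over element indices. Everything named here (`SeqExec`, `forall`, `fill_strided`) is hypothetical and is not Conduit's or RAJA's API; the sketch assumes the real version would dispatch through the execution interface from #1358 and swap the policy tag for a device policy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical policy tag. The assumption is that a real build would map
// tags like this onto actual host/device execution policies.
struct SeqExec {};

// Hypothetical stand-in for a forall: applies `body` to every index in
// [begin, end). This host-only version simply loops sequentially.
template <typename Body>
void forall(SeqExec, std::size_t begin, std::size_t end, Body &&body)
{
    for (std::size_t i = begin; i < end; ++i)
    {
        body(i);
    }
}

// Bulk fill of a strided buffer. Each index writes one element, so the
// iterations are independent and could run in parallel under another policy.
// `stride` is measured in elements of T, not bytes (an assumption made to
// keep the sketch short).
template <typename ExecPolicy, typename T>
void fill_strided(ExecPolicy policy,
                  T *data,
                  std::size_t num_elements,
                  std::size_t stride,
                  T value)
{
    forall(policy, 0, num_elements,
           [=](std::size_t i) { data[i * stride] = value; });
}
```

Because the loop body only touches `data[i * stride]`, the same lambda could in principle be handed to a parallel policy unchanged; whether the pointer is reachable from the device is a separate question that the infrastructure item above would need to answer.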

Testing

  • Execution tests that cover the strawman, reducers, atomics, and policy choices (Conduit Device Support #1358)
  • Performance tests for ported algorithms. Tests for algorithms will be oriented more towards performance, since we already have lots of correctness tests. We want to test the full matrix of options: running on host or device, with source and destination data each starting on host or device.
  • Test with both CUDA and HIP
  • Add node-backed tests for data accessor and data array
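
One way to read the "full matrix" in the performance-test item is as the cross product of execution location, source location, and destination location. The sketch below enumerates those eight cases; the `Location` enum, `PerfCase` struct, and naming scheme are hypothetical and only illustrate the idea.

```cpp
#include <string>
#include <vector>

// Hypothetical labels for where code runs or where data lives.
enum class Location { Host, Device };

// One performance-test configuration: where the algorithm executes, and
// where the source and destination data start out.
struct PerfCase
{
    Location exec;
    Location src;
    Location dst;
};

// Enumerate every combination of exec x src x dst (2 * 2 * 2 = 8 cases).
std::vector<PerfCase> build_perf_matrix()
{
    const Location locs[] = {Location::Host, Location::Device};
    std::vector<PerfCase> cases;
    for (Location e : locs)
        for (Location s : locs)
            for (Location d : locs)
                cases.push_back({e, s, d});
    return cases;
}

// Build a readable name that could label a timing region or test case.
std::string case_name(const PerfCase &c)
{
    auto str = [](Location l) {
        return std::string(l == Location::Host ? "host" : "device");
    };
    return "exec_" + str(c.exec) + "_src_" + str(c.src) + "_dst_" + str(c.dst);
}
```

Generating the cases in one place keeps each ported algorithm's performance test a loop over the same list, so no combination is skipped by accident.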

Performance

  • Incorporate Caliper for measuring performance of algorithms
  • Add Hatchet/Thicket support so we can compare runtimes
  • Discover how to profile compilation

Documentation

  • Write documentation for our Device Support paradigm
  • Include strawman examples in the documentation to demonstrate use
  • For any objects that are mixed host/device, our docs should make clear which parts are host/device and which are host only.
  • Document how to ask algorithms to execute on the device

Investigations

  • Investigate whether RDC will allow us to leave implementations in source files instead of moving them to headers, as we have done in Conduit Device Support #1358 (DataArray, DataAccessor, DataType)
  • You can create a "parallel" exec policy for which is_parallel_policy() returns false. We should examine in the future whether we want consistency across the policies.
  • init_device_memory_handlers() should be a static function that people only have to call once. We should have a Conduit utility somewhere that does this: it should check whether the handlers are installed and install them if not, otherwise do nothing.
  • Someday we want the allocator to make sense for nodes when we are done with them (maybe @cyrush can explain what this means)
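
The "install once, otherwise do nothing" behavior asked for above can be sketched with a function-local static, which C++11 guarantees is initialized exactly once even with concurrent callers. The names `install_handlers`, `handler_install_count`, and `ensure_device_memory_handlers` are hypothetical; the body of `install_handlers` stands in for whatever init_device_memory_handlers() actually does.

```cpp
// Hypothetical counter, present only so the sketch can demonstrate that the
// installation body runs a single time.
static int g_install_count = 0;

// Hypothetical stand-in for the real handler installation work.
static void install_handlers()
{
    ++g_install_count;
}

// Accessor used to observe how many times installation actually ran.
int handler_install_count()
{
    return g_install_count;
}

// Safe to call from anywhere, any number of times. The first call installs
// the handlers; later calls find the static already initialized and return.
void ensure_device_memory_handlers()
{
    // C++11 and later: initialization of a function-local static happens
    // exactly once, and concurrent callers wait until it finishes.
    static const bool installed = (install_handlers(), true);
    (void)installed; // silence unused-variable warnings
}
```

Calling `ensure_device_memory_handlers()` three times leaves `handler_install_count()` at 1. A utility shaped like this could be invoked internally by device-aware entry points, so users never need to remember the setup call.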

General Notes & References
