Does not support CUDA Graph Capturing / Replay for Acceleration

When trying to use CUDA Graphs [[1](https://developer.nvidia.com/blog/cuda-graphs/)] [[2](https://pytorch.org/blog/accelerating-pytorch-with-cuda-graphs/)] to accelerate UniFlowMatch, the following error is raised during graph capture.

CUDA Graphs can speed up MAC-VO and the entire autonomy stack, because replaying a captured graph frees the CPU from launching kernels one by one during network inference. Other, less compute-intensive programs can then use those CPU time slices.
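For context, the capture / replay pattern I am attempting looks roughly like the sketch below. It follows the warmup-then-capture recipe from the PyTorch blog post linked above. `ToyMatcher` and `make_graphed_callable` are hypothetical stand-ins for illustration, not the real UniFlowMatch model or any API in this repo, and the sketch falls back to eager execution when no GPU is available.

```python
import torch


class ToyMatcher(torch.nn.Module):
    """Hypothetical two-view model standing in for the real network."""

    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(6, 2, kernel_size=3, padding=1)

    def forward(self, a, b):
        # Concatenate both views along channels and predict a 2-channel "flow".
        return self.conv(torch.cat([a, b], dim=1))


def make_graphed_callable(model, example_a, example_b, warmup_iters=3):
    """Capture model(a, b) into a CUDA graph and return a replay function."""
    if not torch.cuda.is_available():
        # No GPU: nothing to capture, just run eagerly.
        return lambda a, b: model(a, b)

    # Static input buffers: the graph always reads from these addresses.
    static_a = example_a.clone()
    static_b = example_b.clone()

    # Warm up on a side stream so lazy initialization happens before capture.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(warmup_iters):
            model(static_a, static_b)
    torch.cuda.current_stream().wait_stream(side)

    # Capture one forward pass. Any op that synchronizes fails here.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        static_out = model(static_a, static_b)

    def replay(a, b):
        # Copy new data into the static buffers, then replay the recorded kernels.
        static_a.copy_(a)
        static_b.copy_(b)
        graph.replay()
        return static_out.clone()

    return replay
```

Replay only re-launches the recorded kernels, so input shapes must stay fixed and every op in `forward` has to be capture-safe.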

@Nik-V9 @infinity1096 

```
Traceback (most recent call last):
  File "/home/yutianch/AirVIO/Module/Frontend/Frontend.py", line 688, in model_inference
    static_output_flow, _, ext_info = self.solution.predict_correspondences_batch(static_input_A, static_input_B)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yutianch/AirVIO/Module/Network/match_anything/benchmarks/ma_benchmarks/solutions/match_anything.py", line 282, in predict_correspondences_batch
    result = self.model(view1, view2)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yutianch/AirVIO/Module/Network/match_anything/match_anything/models/match_anything/variable_encoder_layer.py", line 608, in forward
    final_info_sharing_multi_view_feat, intermediate_info_sharing_multi_view_feat = self.info_sharing(
                                                                                    ^^^^^^^^^^^^^^^^^^
  File "/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yutianch/AirVIO/Module/Network/match_anything/UniCeption/uniception/models/info_sharing/global_attention_transformer.py", line 456, in forward
    non_ref_view_pe = self.view_pos_table[non_ref_view_pe_indices].clone().detach()
                      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: operation not permitted when stream is capturing
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```
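A guess at the cause, not verified against the UniCeption source: if `non_ref_view_pe_indices` is a Python list, a CPU tensor, or a boolean mask, indexing the CUDA table with it forces a host-to-device copy or a synchronization, which is not allowed while a stream is capturing. Under that assumption, one capture-friendly alternative is to keep the indices as a device buffer created before capture and gather with `index_select`. The module below is a hypothetical illustration (it assumes view 0 is the reference view), not the actual `global_attention_transformer.py` code.

```python
import torch


class ViewPositionLookup(torch.nn.Module):
    """Hypothetical sketch: gather view position embeddings without host-side indices."""

    def __init__(self, num_views, dim):
        super().__init__()
        # Stand-in for the real positional table.
        self.register_buffer("view_pos_table", torch.randn(num_views, dim))
        # Assumption: view 0 is the reference view, all others are non-reference.
        # Registered as a buffer so it moves to the GPU together with the module.
        self.register_buffer(
            "non_ref_indices", torch.arange(1, num_views, dtype=torch.long)
        )

    def forward(self):
        # index_select with a same-device index tensor needs no host-to-device copy.
        # It already returns a new tensor, so an extra clone() is not required.
        return self.view_pos_table.index_select(0, self.non_ref_indices).detach()
```

If the number of views changes between calls, the index buffer would need to be rebuilt outside of capture, and the graph re-captured for each distinct shape.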
