Avoid large device allocation in UMAP with nndescent (#6292)
Currently `NNDescent` returns two arrays:
- `graph.graph()`: (n x graph_degree) on host
- `graph.distances()`: (n x graph_degree) on device
Downstream, the rest of UMAP wants both of these to be device arrays of shape (n x n_neighbors).
Currently we copy `graph.graph()` to a temporary device array, then slice and copy it to the output array `out.knn_indices`.
Ideally we'd force `graph_degree = n_neighbors` to avoid the slicing entirely (and reduce the size of the intermediate results). However, there currently seems to be a bug in `NNDescent` where reducing `graph_degree` to `n_neighbors` causes a significant decrease in result quality. So for now we need to keep the slicing around.
We can avoid allocating the temporary device array though, instead doing the slicing on host. Doing this avoids allocating a (n x graph_degree) device array entirely; for large `n` this can be a significant savings (47 GiB on one test problem I was trying).
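As a rough illustration of the host-side slicing idea (not the actual cuML code, which is C++/CUDA), here is a minimal Python sketch. It assumes the graph is a row-major (n x graph_degree) buffer of 64-bit indices; the helper names `slice_rows_on_host` and `temp_device_bytes` and the index width are hypothetical and only for illustration.

```python
from array import array


def slice_rows_on_host(graph, n_rows, graph_degree, n_neighbors):
    """Keep the first n_neighbors columns of each row.

    `graph` is a flat, row-major (n_rows x graph_degree) host buffer.
    The result is a flat, row-major (n_rows x n_neighbors) host buffer,
    which is the only thing that would then need to be copied to device.
    """
    if n_neighbors > graph_degree:
        raise ValueError("n_neighbors must not exceed graph_degree")
    if len(graph) != n_rows * graph_degree:
        raise ValueError("buffer size does not match n_rows x graph_degree")
    out = array(graph.typecode)
    for row in range(n_rows):
        start = row * graph_degree
        out.extend(graph[start:start + n_neighbors])
    return out


def temp_device_bytes(n_rows, graph_degree, itemsize):
    """Size of the temporary (n_rows x graph_degree) device array that is avoided."""
    return n_rows * graph_degree * itemsize


# Tiny example: 3 rows, graph_degree = 4, n_neighbors = 2.
graph = array("q", range(12))  # rows: [0..3], [4..7], [8..11]
sliced = slice_rows_on_host(graph, n_rows=3, graph_degree=4, n_neighbors=2)
print(list(sliced))  # [0, 1, 4, 5, 8, 9]
```

The temporary that is skipped scales with `n * graph_degree`, while the array that still reaches the device scales only with `n * n_neighbors`.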
We still should fix the `graph_degree` issue, but for now this should help unblock running UMAP on very large datasets.
Authors:
- Jim Crist-Harif (https://github.com/jcrist)
Approvers:
- Divye Gala (https://github.com/divyegala)
- William Hicks (https://github.com/wphicks)
URL: #6292