pyg-lib 0.2.0: PyTorch 2.0 support, sampled operations, and further accelerations
pyg-lib==0.2.0 brings PyTorch 2.0 support, sampled operations, and further accelerations to PyG.
Highlights
PyTorch 2.0 Support
pyg-lib==0.2.0 is fully compatible with PyTorch 2.0. To install it for PyTorch 2.0, run:
pip install pyg-lib -f https://data.pyg.org/whl/torch-2.0.0+${CUDA}.html
where ${CUDA} should be replaced by one of cpu, cu117, or cu118.
The following combinations are supported:
| PyTorch 2.0 | cpu | cu117 | cu118 |
|---|---|---|---|
| Linux | ✅ | ✅ | ✅ |
| macOS | ✅ | | |
Older PyTorch versions such as PyTorch 1.11, 1.12, and 1.13 are still supported and can be installed as described in our README.md.
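As a small illustration of the `${CUDA}` substitution above, the following sketch builds the wheel index URL for a given tag. The helper name `wheel_index_url` and the `SUPPORTED_TAGS` tuple are hypothetical and only mirror the tags listed in these notes for PyTorch 2.0; they are not part of pyg-lib.

```python
# Tags listed in these release notes for PyTorch 2.0 wheels.
SUPPORTED_TAGS = ("cpu", "cu117", "cu118")


def wheel_index_url(cuda_tag):
    """Return the wheel index URL for a PyTorch 2.0 build (hypothetical helper)."""
    if cuda_tag not in SUPPORTED_TAGS:
        raise ValueError(
            f"unsupported tag {cuda_tag!r}; expected one of {SUPPORTED_TAGS}"
        )
    return f"https://data.pyg.org/whl/torch-2.0.0+{cuda_tag}.html"


# Print the install command for a CUDA 11.8 build.
print(f"pip install pyg-lib -f {wheel_index_url('cu118')}")
```

Rejecting unknown tags up front avoids a confusing "no matching distribution" error from pip later on.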
Sampled Operations
We added support for sampled_op implementations (#156, #159, #160), which implement the scheme
out = left_tensor[left_index] (op) right_tensor[right_index]
efficiently without materializing intermediate representations:
from pyg_lib.ops import sampled_add
edge_index = ...
row, col = edge_index
# Replace ...
out = x[row] + x[col]
# ... with
out = sampled_add(left=x, right=x, left_index=row, right_index=col)
Supported operations are sampled_add, sampled_sub, sampled_mul, and sampled_div.
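To make the expected result of the scheme concrete, here is a framework-free reference sketch that operates on nested Python lists instead of tensors. The function `sampled_op_reference` is a hypothetical name used for illustration only. It defines what the output should contain and says nothing about how pyg-lib computes it.

```python
from operator import add, sub, mul, truediv


def sampled_op_reference(left, right, left_index, right_index, op):
    """Compute out[k] = left[left_index[k]] (op) right[right_index[k]] row by row."""
    return [
        [op(a, b) for a, b in zip(left[i], right[j])]
        for i, j in zip(left_index, right_index)
    ]


# Three nodes with two features each, and three edges (row[k], col[k]).
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
row, col = [0, 1, 2], [2, 2, 0]

out_add = sampled_op_reference(x, x, row, col, add)
out_sub = sampled_op_reference(x, x, row, col, sub)
out_mul = sampled_op_reference(x, x, row, col, mul)
out_div = sampled_op_reference(x, x, row, col, truediv)
```

Each output row is produced directly from the two indexed input rows, which is the property the fused operations exploit: the gathered copies `x[row]` and `x[col]` never need to exist as separate arrays.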
Further Accelerations
index_sort implements a much faster alternative to torch.sort() for sorting one-dimensional indices (#181, #192). This substantially reduces dataset loading times in PyG.
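These notes do not say which algorithm index_sort uses. As a sketch of why bounded, non-negative integer indices can be sorted faster than with a general comparison sort, the following counting sort runs in time linear in the input length plus the maximum value. The function name `counting_index_sort` and its `max_value` parameter are assumptions made for this illustration. Like torch.sort(), it returns the sorted values together with the permutation that produces them.

```python
def counting_index_sort(index, max_value=None):
    """Stable counting sort for non-negative integers.

    Returns (sorted_values, perm) such that sorted_values[k] == index[perm[k]].
    """
    if max_value is None:
        max_value = max(index, default=-1)

    # counts[v + 1] holds the number of occurrences of value v.
    counts = [0] * (max_value + 2)
    for v in index:
        counts[v + 1] += 1

    # The prefix sum turns counts[v] into the first output slot for value v.
    for v in range(1, len(counts)):
        counts[v] += counts[v - 1]

    sorted_values = [0] * len(index)
    perm = [0] * len(index)
    for pos, v in enumerate(index):
        dst = counts[v]
        sorted_values[dst] = v
        perm[dst] = pos
        counts[v] += 1  # the next equal value goes one slot further (stable)
    return sorted_values, perm


index = [3, 0, 2, 0, 1]
values, perm = counting_index_sort(index)
```

Node and edge indices in graph datasets are bounded by the number of nodes, which is the situation where this kind of non-comparison sort pays off.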
- Optimized segment_matmul and grouped_matmul CPU implementations via MKL BLAS gemm_batch (#146, #172).
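For readers unfamiliar with the operation being optimized, the sketch below gives a framework-free reference for a segmented matrix multiplication. It assumes, and this is not stated in these notes, that rows `ptr[i]:ptr[i + 1]` of the input are multiplied by the i-th weight matrix. The names `matmul` and `segment_matmul_reference` are hypothetical helpers for illustration.

```python
def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def segment_matmul_reference(inputs, ptr, other):
    """Multiply each row segment inputs[ptr[i]:ptr[i + 1]] by other[i].

    Assumed semantics for illustration, not taken from these release notes.
    """
    out = []
    for seg, weight in enumerate(other):
        out.extend(matmul(inputs[ptr[seg]:ptr[seg + 1]], weight))
    return out


inputs = [[1, 2], [3, 4], [5, 6]]
ptr = [0, 2, 3]  # segment 0 = rows 0..1, segment 1 = row 2
other = [
    [[1, 0], [0, 1]],  # identity for segment 0
    [[0, 1], [1, 0]],  # swaps the two columns for segment 1
]
out = segment_matmul_reference(inputs, ptr, other)
```

Because every segment is an independent small matrix product, the whole operation maps naturally onto a batched GEMM call rather than a Python-level loop.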
Breaking Changes
- Temporal neighbor_sample and hetero_neighbor_sample now sample nodes with the same or a smaller timestamp than the seed node. Previously, only nodes with a strictly smaller timestamp were sampled (#187).
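The change in the temporal constraint can be sketched as a switch from a strict to a non-strict comparison. The function `temporal_candidates` and its arguments are hypothetical and only illustrate which neighbors remain eligible; the actual samplers operate on graph structures, not Python lists.

```python
def temporal_candidates(neighbors, node_time, seed_time, inclusive=True):
    """Return the neighbors that are eligible for sampling.

    inclusive=True  mirrors the new behavior (timestamp <= seed time).
    inclusive=False mirrors the old behavior (timestamp <  seed time).
    """
    if inclusive:
        return [n for n in neighbors if node_time[n] <= seed_time]
    return [n for n in neighbors if node_time[n] < seed_time]


node_time = {0: 1, 1: 2, 2: 2, 3: 3}
neighbors = [0, 1, 2, 3]

new_behavior = temporal_candidates(neighbors, node_time, seed_time=2)
old_behavior = temporal_candidates(neighbors, node_time, seed_time=2, inclusive=False)
```

Code that relied on the strict constraint, for example to exclude events that happen at the same time as the seed, may need an explicit filter after upgrading.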
Full Changelog
Added
- Added PyTorch 2.0 support (#214)
- neighbor_sample routines now also return information about the number of sampled nodes/edges per layer (#197)
- Added index_sort implementation (#181, #192)
- Added triton>=2.0 support (#171)
- Added bias term to grouped_matmul and segment_matmul (#161)
- Added sampled_op implementation (#156, #159, #160)
Changed
- Sample the nodes with the same timestamp as seed nodes (#187)
- Added write-csv (saves benchmark results as a CSV file) and libraries (determines which libraries will be used in the benchmark) parameters (#167)
- Enable benchmarking of neighbor sampler on temporal graphs (#165)
- Improved [segment|grouped]_matmul CPU implementation via at::matmul_out and MKL BLAS gemm_batch (#146, #172)
Full commit list: 0.1.0...0.2.0