🐛 Describe the bug
Hi,
I am using LinkNeighborLoader on a large multigraph dataset with 31M edges and 2M nodes. Because the graph contains parallel edges, I assign a unique ID to each edge so I can track which edges are sampled. The ID is prepended as the first column of edge_attr:
    edge_store = data['node', 'to', 'node']
    edge_ids = torch.arange(edge_store.edge_attr.shape[0]).view(-1, 1)
    edge_store.edge_attr = torch.cat([edge_ids, edge_store.edge_attr], dim=1)
However, I noticed that during sampling the same edge ID can appear multiple times in a sampled batch. This did not occur with smaller graphs; it only shows up with this large dataset.
Is there a way to ensure that sampled batches do not contain duplicate edges? Or is this behavior expected when working with large multigraphs?
The replace parameter of the loader is set to False.
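One possible explanation worth ruling out, sketched below under an assumption that the report does not confirm: if edge_attr is float32, the ID column ends up as float32 after concatenation. float32 represents integers exactly only up to 2**24 = 16,777,216, so with 31M edges the larger IDs would be rounded and several distinct edges would share one stored ID. That would look like duplicate edges in a batch and would not show up on graphs with fewer than about 16.7M edges. The variable names here are illustrative only.

```python
import torch

# A small window of integer IDs around 2**24, standing in for the tail of a
# 31M-edge ID range (kept small so the check runs instantly).
ids = torch.arange(2**24 - 4, 2**24 + 8)  # exact int64 IDs

# Assumption: edge_attr is float32, so the ID column is stored as float32.
ids_f32 = ids.to(torch.float32)

# Above 2**24 only every second integer is representable in float32,
# so neighbouring IDs collapse onto the same value.
num_unique = torch.unique(ids_f32).numel()
print(f"{ids.numel()} distinct IDs, {num_unique} distinct values after float32 cast")
```

If this turns out to be the cause, keeping the IDs in a separate int64 tensor rather than inside edge_attr should avoid the rounding. Whether a custom edge-level attribute is carried through to the sampled batch, and whether the batch already exposes the original edge indices (for example via an e_id field), depends on the PyG version and should be checked against its documentation.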
Versions