Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration Issues Random Walks #190

Open
2 of 8 tasks
M-Lampert opened this issue Jul 1, 2024 · 1 comment
Open
2 of 8 tasks

Migration Issues Random Walks #190

M-Lampert opened this issue Jul 1, 2024 · 1 comment
Labels
bug Something isn't working dagstuhl documentation Improvements or additions to documentation help wanted Extra attention is needed

Comments

@M-Lampert
Copy link
Contributor

M-Lampert commented Jul 1, 2024

As far as I am aware, we reused the PathPy3 code for the implementation of the random walks. I noticed the following problems that need to be fixed:

  • Docstrings in the wrong format (fixed in Minor Fixes for Random Walks #189)
  • Bug in transition_probabilities(...) (fixed in Minor Fixes for Random Walks #189)
  • Documentation out of date. E.g. the example uses rw.plot(...) which does not work in pathpyG as far as I know.
  • Bug in HigherOrderRandomWalk.first_order_stationary_state(...): v.relations[-1] throws an exception since each node is an id (tuple or string) and not a node object as in pathpy3.
  • TODOs in process.py: More than half of the file is commented out and marked as TODO.

There are probably some more problems that I haven't noticed so far, so I propose some additional unit-tests for each file in processes/:

@M-Lampert M-Lampert added bug Something isn't working documentation Improvements or additions to documentation help wanted Extra attention is needed labels Jul 1, 2024
@M-Lampert
Copy link
Contributor Author

There is a PyTorch-Geometric-based random walk implementation: https://github.com/rusty1s/pytorch_cluster?tab=readme-ov-file#randomwalk-sampling
Current implementation:

>>> import pathpyG as pp
>>> g = pp.Graph.from_edge_list([['a','b'], ['b','c'], ['c','a'], ['c','d'], ['d','a']])
>>> rw = pp.processes.RandomWalk(g)
>>> data = rw.run_experiment(steps=100, runs=100)
>>> %timeit rw.run_experiment(steps=100, runs=100)
123 ms ± 3.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

PyG-based random walk:

>>> import torch
>>> from torch_cluster import random_walk

>>> seeds = torch.randint(0, g.N, (100,))
>>> row, col = g.data.edge_index
>>> %timeit random_walk(row, col, seeds, walk_length=100)
161 µs ± 6.58 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

It might make sense to use this implementation instead. Note that this implementation does not allow a restart probability but instead can be a biased random walk with p and q.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working dagstuhl documentation Improvements or additions to documentation help wanted Extra attention is needed
Projects
None yet
Development

When branches are created from issues, their pull requests are automatically linked.

2 participants