Don't scrape all of a paper's neighbours by default when adding it to the S2Graph

https://github.com/mirandrom/PyS2/blob/9bf44f0ad16d9ccdeadd4cff719eda2a8e42cc84/s2/graph/builder.py#L187

Currently, adding a paper to the S2Graph implies scraping and adding all of its neighbours as well. This choice was made to allow dynamic graph exploration (e.g. with a reinforcement-learning-based GraphHopper), which requires information about a paper, and therefore scraping it, before hopping to it.

However, for simple rule-based GraphHoppers, this can add a lot of undesirable overhead when papers have large numbers of citations/references. I think the best solution would be to let S2DataStore objects lazily query the API when a paper/author is not locally cached. That way, if the GraphHopper doesn't need scraped paper information (e.g. if the decision is based only on the edge type), API calls are avoided.
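
To make the proposal concrete, here is a rough sketch of the lazy-lookup idea. It is not the current PyS2 code: `LazyPaperStore`, `get_paper`, `fake_fetch` and `hop_on_edge_type` are hypothetical names made up for illustration, and the fetch function is a stub standing in for a real API request.

```python
from typing import Callable, Dict, List, Tuple


class LazyPaperStore:
    """Hypothetical store that only calls the API on a cache miss."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch                  # assumed wrapper around an API call
        self._cache: Dict[str, dict] = {}    # paper_id -> scraped paper data
        self.api_calls = 0                   # counter, only here for illustration

    def get_paper(self, paper_id: str) -> dict:
        # Query the API only if the paper is not locally cached.
        if paper_id not in self._cache:
            self.api_calls += 1
            self._cache[paper_id] = self._fetch(paper_id)
        return self._cache[paper_id]


def fake_fetch(paper_id: str) -> dict:
    # Stub used instead of a real network request.
    return {"id": paper_id, "title": f"Paper {paper_id}"}


def hop_on_edge_type(edges: List[Tuple[str, str]], wanted: str) -> List[str]:
    # Rule-based choice that looks at edge types only, so it never
    # needs the scraped data of the neighbouring papers.
    return [target for target, edge_type in edges if edge_type == wanted]


store = LazyPaperStore(fake_fetch)
edges = [("p1", "citation"), ("p2", "reference"), ("p3", "citation")]

next_ids = hop_on_edge_type(edges, "citation")  # no API calls made here
first = store.get_paper(next_ids[0])            # first access triggers one fetch
store.get_paper(next_ids[0])                    # second access is a cache hit
```

With something along these lines, a hopper that needs paper details (like the RL-based one) would still get them on demand through `get_paper`, while an edge-type-only hopper would not trigger any scraping for neighbours it never inspects.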



Don't scrape all of a paper's neighbours by default when adding it to the S2Graph #15
