Don't scrape all of a paper's neighbours by default when adding it to the S2Graph #15

@mirandrom


if self.hopper.hop(gpath, self.graph):

Currently, adding a paper to the S2Graph implies scraping and adding all of its neighbours as well. This choice was made to allow dynamic graph exploration (e.g. with a reinforcement-learning-based GraphHopper), which requires information about a paper (and thus scraping it) before hopping to it.

However, for simple rule-based GraphHoppers, this can add a lot of undesirable overhead when papers have large numbers of citations/references. I think the best solution would be to let S2DataStore objects lazily query the API when a paper/author is not locally cached. That way, if the GraphHopper doesn't need scraped paper information (e.g. if the decision is based only on the edge type), API calls are avoided.
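To make the proposal concrete, here is a rough sketch of what lazy querying could look like. It is not the actual S2DataStore implementation: `LazyStore`, `fake_fetch`, `edge_type_hop`, and `title_hop` are all hypothetical names, and the real API call is replaced by an injected function so the example runs offline.

```python
from typing import Callable, Dict


class LazyStore:
    """Hypothetical stand-in for an S2DataStore that fetches only on a cache miss."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch                 # assumed wrapper around the remote API
        self._cache: Dict[str, dict] = {}   # local cache keyed by paper/author id
        self.api_calls = 0                  # counter, only here to show the savings

    def get(self, key: str) -> dict:
        # Query the API only when the record is not locally cached.
        if key not in self._cache:
            self.api_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]


def fake_fetch(paper_id: str) -> dict:
    # Placeholder for a real API request; returns a made-up record.
    return {"paperId": paper_id, "title": f"Paper {paper_id}"}


def edge_type_hop(edge_type: str, neighbour_id: str, store: LazyStore) -> bool:
    # Rule-based decision that never reads paper data, so no API call happens.
    return edge_type == "citation"


def title_hop(edge_type: str, neighbour_id: str, store: LazyStore) -> bool:
    # Decision that needs paper data, so the store fetches it on first access.
    return "42" in store.get(neighbour_id)["title"]
```

With this shape, a hopper like `edge_type_hop` leaves `api_calls` at zero, while `title_hop` triggers exactly one fetch per previously unseen paper and hits the cache afterwards.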


Labels: enhancement
