RAM issues when extracting features #27

Open
arthurcgusmao opened this issue May 6, 2018 · 1 comment

arthurcgusmao commented May 6, 2018

Hi,

I am running into RAM issues when performing the CreateMatrices operation for larger graphs and for more expressive features (i.e., going beyond PRA-like features) using SFE.

I observed that RAM usage only increases from the start of a run to its end, regardless of which relation is currently being processed. Interestingly, if I quit and restart the code from the last relation that the first run couldn't handle, it gets through a number of new relations before it starts using swap space again. At that point everything slows down heavily, so I am forced to quit and restart a third time, and so on. I am able to process all relations in this manner, but it is not ideal.

Thus, I am trying to understand the mechanism responsible for that. In my (very humble) understanding, this could be due to the following reasons:

  1. Regarding the Operation instance: a new FeatureGenerator instance, where features are stored, is created for each relation that the code processes. Maybe this generator is not being freed after its relation finishes, which would cause RAM usage to grow as more and more relations are processed.
  2. Another object (maybe split or graph?) grows as more and more features are extracted. Maybe the subgraph created for each node pair is not being freed, or something along these lines.

Note that the reasons above are just speculation so far, since I am still getting familiar with the code, but they seem consistent with its behavior.
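
One way to check whether memory is really being retained between relations, rather than just waiting to be collected, is to log heap use after a requested garbage collection at the end of each relation. Below is a minimal sketch in Java, assuming the code runs on the JVM. `HeapProbe`, its methods, and the relation names are hypothetical and not part of the project; the loop body is a placeholder for the real per-relation work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the PRA/SFE code base.
class HeapProbe {

    // Bytes currently in use on the JVM heap (allocated heap minus free heap).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Request a collection first so the reading mostly reflects objects that
    // are still referenced. System.gc() is only a hint to the JVM.
    static long usedHeapAfterGc() {
        System.gc();
        return usedHeapBytes();
    }

    // True when every reading exceeds the previous one by more than
    // toleranceBytes; needs at least two readings.
    static boolean growsMonotonically(List<Long> readings, long toleranceBytes) {
        for (int i = 1; i < readings.size(); i++) {
            if (readings.get(i) <= readings.get(i - 1) + toleranceBytes) {
                return false;
            }
        }
        return readings.size() > 1;
    }

    public static void main(String[] args) {
        // Placeholder relation names, for illustration only.
        String[] relations = {"relation_a", "relation_b", "relation_c"};
        List<Long> readings = new ArrayList<>();
        for (String relation : relations) {
            // ... run feature extraction for this relation here ...
            long used = usedHeapAfterGc();
            readings.add(used);
            System.out.printf("%s: %d MiB in use after GC%n",
                    relation, used / (1024 * 1024));
        }
        System.out.println("steady growth: " + growsMonotonically(readings, 0L));
    }
}
```

If the readings keep climbing even after a requested collection, something is probably still holding references to per-relation data. This does not say which object is responsible; a heap dump would be needed to tell the two guesses above apart.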

Any ideas on this will help.

@matt-gardner
Owner

Yeah, this sure sounds like something is not getting garbage collected, but your guess is as good as mine here, unfortunately.
