pyscenic grn in singularity: workers continuously restarting after memory warnings #54
Comments
pySCENIC is quite memory hungry. Try running with fewer threads on your 22 GiB RAM machine. The warning itself comes from Dask, which pySCENIC uses to distribute the work. |
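(Illustrative sketch, not part of the original comment: limiting the number of workers for the grn step could look roughly like the following; the file names are placeholders, and the exact flags can be checked with pyscenic grn --help.)

    # run GRN inference with fewer parallel workers to lower peak memory use
    pyscenic grn expr_mat.loom allTFs_hg38.txt -o adjacencies.csv --num_workers 2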
Even with only 1 thread, I get the following output:
|
Hello, I got the same warning when running the pyscenic grn function (version 0.9.17): |
Dear Liang, pySCENIC is known to be quite memory hungry. Does the process block or are you able to get results from the grn step? Kindest regards. |
Hi all, I have a similar problem running pySCENIC through docker on a 64G machine (on AWS).
I checked the memory usage, but it seems to stay at around 25% at most. I am using 8 workers on an 8-core machine.
This process has been running for several days on a large count matrix (55k cells, 20k genes). I used sparse matrix support from pySCENIC to reduce the memory usage.
Thanks, |
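(For reference, a hedged sketch of what loading the expression data in sparse format can look like; the --sparse flag is available in recent pySCENIC versions, but the file names here are placeholders and the flag should be confirmed against pyscenic grn --help for your version.)

    # load the expression matrix as a sparse matrix to reduce memory usage
    pyscenic grn expr_mat.loom allTFs_hg38.txt -o adjacencies.csv --num_workers 8 --sparse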
Hi @fbrundu , What version of pySCENIC are you using? I would recommend updating to 0.11.0 if you haven't already. If you have problems with the Dask implementation, I would suggest trying the multiprocessing script (see here). Judging from your logs, it might help, unless updating to 0.11.0 fixes this. But I think the larger problem is that running a 55k x 20k expression matrix is simply going to take a very long time with 8 threads. It's hard to say exactly, but I would guess something like 3-6 days. You can check the Dask dashboard for some kind of progress indicator (see here for an example; I haven't tried to connect to a running instance in Docker, though, and it might be tricky to access the correct port, usually 8787). If you use the multiprocessing script, it will print the progress right to the terminal. |
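(For reference, a sketch of how the multiprocessing script that ships with pySCENIC is typically invoked; the file names and worker count are placeholders, and flags may differ between versions, so check the script's --help.)

    # Dask-free GRN inference using Python multiprocessing
    arboreto_with_multiprocessing.py \
        expr_mat.loom \
        allTFs_hg38.txt \
        --method grnboost2 \
        --output adjacencies.tsv \
        --num_workers 8 \
        --seed 777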
Thanks @cflerin. I got the pySCENIC docker image version 0.11.0 (from https://hub.docker.com/r/aertslab/pyscenic). No ports seem to be exposed by the docker container (if I read it correctly).
I also checked with netstat on the AWS instance with similar results. I'll wait for the 10-day mark, I think, then try the multiprocessing implementation. Thanks |
Hi @cflerin, I was able to connect to the Dask dashboard at the 10-day mark. How should I read this plot? From the Dask docs I think it indicates that progress is, on average, around 1/2 to 2/3, but I am not sure which of the tasks takes the most time to execute. Am I correct? Should I expect 15-20 days of runtime? Thanks

Update: I think the bottleneck might be the task "infer_partial_network-from-delayed", if I understood the dashboard results correctly. The tasks before it ("csc_matrix" and "from-delayed") have many results in memory (darker color), while the infer step seems to be the limiting one (right now only 10 results in memory and 9393 / 21668 in total). Therefore, a more appropriate estimate could be ~23 days of runtime.

Addendum: For anyone trying to access the Dask interface in a Docker container and/or on a running AWS instance: this is a bit convoluted, so I am adding some notes on how I managed to do it. This is just a quick workaround, by no means the best or most secure solution, so use at your own risk. First, get the IP address of the Docker container from within the AWS instance.
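(A sketch of that first step; the container name/ID is a placeholder.)

    # on the AWS instance: list running containers and note the pySCENIC one
    docker ps
    # print the container's internal IP address
    docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container_id>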
Check the port of the web interface within the container (usually 8787), and install/start an SSH daemon.
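(A sketch of this step, assuming a Debian/Ubuntu-based image; package and service names may differ.)

    # inside the container: confirm the dashboard port (usually 8787) is listening
    netstat -tlnp | grep 8787    # or: ss -tlnp | grep 8787
    # install and start an SSH daemon
    apt-get update && apt-get install -y openssh-server
    service ssh start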
Allow root login to the Docker container over SSH:
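(One way to do this; insecure, so again at your own risk, and shown only as a sketch.)

    # inside the container: set a root password and allow root logins over SSH
    passwd root
    echo 'PermitRootLogin yes' >> /etc/ssh/sshd_config
    service ssh restart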
In a separate shell on the AWS instance, map the instance's port 8787 to the container's port 8787:
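(For example, with an SSH tunnel from the instance into the container, using the container IP found above.)

    # on the AWS instance: forward the instance's port 8787 to the container's 8787
    ssh -N -L 8787:localhost:8787 root@<container_ip>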
Change the security group in the AWS console to allow connections from your local IP to the instance on port 8787 (Inbound rules). Then, in a local shell, map local port 8787 to the AWS instance's port 8787 (which is already mapped to the Docker container's port 8787):
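(A sketch of the local-side tunnel; the key file and host name are placeholders.)

    # on your local machine: forward local port 8787 to the instance's port 8787
    ssh -N -i mykey.pem -L 8787:localhost:8787 ubuntu@<aws_instance_public_dns>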
Now you can access the dashboard at your local address (http://localhost:8787). |
Hi @fbrundu , Thanks for the detailed updates on running this in AWS and on the port mapping! Enabling root access here could be a security issue, but in your use case on AWS it's probably fine -- perhaps I should expose port 8787 in the Docker container going forward. On the progress plot, I think you should go by the infer_partial_network task count. |
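(As an aside, an editor's sketch rather than part of the original comment: once the image exposes the dashboard port, publishing it when starting the container would avoid the SSH tunneling above; the image tag is a placeholder.)

    # publish the Dask dashboard port when launching the container
    docker run -it -p 8787:8787 aertslab/pyscenic:<tag> bash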
Hi @cflerin, In the end, the process ran for 13 days (probably there was a latency component at the beginning). |
A few additional considerations: I reran the grn process on a bigger machine (16 cores and 128 GB vs 8 cores and 64 GB), BUT I still set 8 as the number of cores available to pySCENIC, and I removed the sparse option. It is possible that the sparse configuration doesn't work properly in some circumstances (and runtime increases non-linearly with the number of genes). Or, the procedure requires more cores than declared when instantiating the Docker container (so there should be some additional free cores for it to execute correctly). Also, I think the warning about workers restarting might be a good reason to stop the process early, since in my experience they will probably make the entire procedure fail eventually. |
Hi @fbrundu , Sorry to hear that your run failed after 13 days! That's super frustrating... It's strange that there was no error. But glad to hear that you finally got it finished. Thanks for the feedback on the sparse usage as well. I'm not sure exactly why this is the case but it's helpful to know this at least. |
- Expose 8787 for dask dashboard (#54)
- Update scanpy
I'm going to close this since the original issue was from an older version, and I think this is a problem with Arboreto specifically (aertslab/arboreto/issues/15). |
Hi,
I am trying to use pySCENIC in a Singularity container (version 0.9.5), but I keep getting warnings that eventually result in errors and workers continuously restarting (see below).
The system I get these errors on has 22 GB of RAM, and I used 4 workers.
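(For illustration only: a Singularity invocation of the grn step with 4 workers typically looks something like the following; the image and file names are placeholders, not the reporter's actual command.)

    # run the grn step inside the Singularity image with 4 workers
    singularity exec pyscenic_0.9.5.sif \
        pyscenic grn expr_mat.loom allTFs_hg38.txt -o adjacencies.csv --num_workers 4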
When I run the exact same thing (same Singularity image, same inputs, same command) on a machine with more RAM (64 GB), everything works perfectly.
During the warnings, I checked the memory usage and it never went over 14 GB.
Is there something I can do to make this also work on the machine with 22 GB of RAM? Or is this problem caused by the 'limited' amount of RAM available?
The warnings and worker restarts keep going until I manually stop the process.