Memory leaks #40
Add logging stuff and needletail remove clap and flate2
add freeing of memory
Stop freeing cs_String before MD generation
Add brief README.md for fakeminimap2
Codecov Report
@@ Coverage Diff @@
## main #40 +/- ##
==========================================
+ Coverage 49.25% 49.86% +0.61%
==========================================
Files 3 3
Lines 735 742 +7
==========================================
+ Hits 362 370 +8
+ Misses 373 372 -1
Hey, awesome work! I'm pretty good at Rust, but my C/C++ and FFI skills are lacking, and this is my first project touching both. I've long suspected there would be one (or more) leaks somewhere. Thanks for mappy-rs as well. I'm using medaka on a very large genome, and it dies after a few weeks with no output. I'm planning to swap mappy for yours to get multithreading (to speed up the process and hopefully get to the point of crashing a bit quicker). I plan to get this tested and pushed out in the next day or two. Cheers :)
Me too! We never noticed before because I was developing on machines with a couple of hundred gigabytes of spare RAM. The
My first (and worst) attempt was to use a transpiler to convert it to Rust, but I couldn't follow it all. My current new dream is to decouple the pthreads library so that it will compile on Windows and WASM (I have a pan-genome browser I'm working on; it's a spare-time project, but it would be cool to have mm2 built in).
Hey @jguhlin
Thanks again for all your work on this! We use it for https://github.com/Adoni5/mappy-rs/. The reason we wrote mappy-rs in the first place was to get multithreaded performance in a long-running application, namely readfish. With the difference in output on ONT's PromethION, we couldn't keep up just using mappy. Unfortunately I didn't know enough C/C++ to implement this threading as an extension to mappy, and I wanted the opportunity to practice my Rust, rather than just multithread in Python (perhaps a mistake 😭).
The problem
We noticed creeping memory usage when mapping repeatedly using the same `Aligner` instance. These graphs are made using memray. The test script I used was https://github.com/Adoni5/mappy-rs/blob/main/tests/memory_test.py. This was also true when not running multithreaded; this was with a single thread.

After rooting around in PyO3, I became convinced that the problem lies in `minimap-rs`, after running the same multithreaded file but, instead of actually calling out to `minimap2::aligner::map`, just returning a Mapping (as defined by `mappy-rs`). This showed flat memory usage.

You also see this leakage when using `minimappers2` in the same way.

I thought the problem might lie in the threadbuffer, as discussed in this minimap2 issue: lh3/minimap2#855. Renewing the threadbuffer turned out to help, but we still leaked a lot of memory. The change I made to renew the threadbuffer was effectively copied from https://github.com/nanoporetech/bonito/blob/b7074d7c2ae2d781db99c40ba911ee9a671206d7/bonito/aligner.py#L19.
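
To illustrate the idea of renewing the threadbuffer, here is a minimal Rust sketch of one possible approach: hand out the same buffer for a fixed number of mappings, then drop it and allocate a fresh one so that any memory it has accumulated is released. `ThreadBuf`, `ManagedThreadBuf`, `max_uses` and `renewals` are hypothetical names used only for illustration, the `Vec<u8>` stands in for minimap2's real per-thread buffer, and the use-count threshold is an assumption rather than the exact change made in this PR.

```rust
/// Hypothetical stand-in for minimap2's per-thread buffer.
/// The real buffer is allocated and freed on the C side.
struct ThreadBuf {
    scratch: Vec<u8>,
}

impl ThreadBuf {
    fn new() -> Self {
        ThreadBuf { scratch: Vec::new() }
    }
}

/// Wraps a buffer and replaces it after `max_uses` mappings
/// (an assumed, tunable threshold).
struct ManagedThreadBuf {
    buf: ThreadBuf,
    uses: usize,
    max_uses: usize,
    renewals: usize,
}

impl ManagedThreadBuf {
    fn new(max_uses: usize) -> Self {
        ManagedThreadBuf { buf: ThreadBuf::new(), uses: 0, max_uses, renewals: 0 }
    }

    /// Returns the buffer to use for the next mapping call.
    fn get(&mut self) -> &mut ThreadBuf {
        if self.uses >= self.max_uses {
            // Assigning a new buffer drops the old one, releasing its memory.
            self.buf = ThreadBuf::new();
            self.uses = 0;
            self.renewals += 1;
        }
        self.uses += 1;
        &mut self.buf
    }
}
```

With a real FFI buffer, the `Drop` implementation of the wrapper type would be the place to call the C destructor, so that renewal and cleanup share one code path.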
However, I eventually used valgrind to test and found leaks in `regs`, `regs.p` and `mm_gen_cs`.

An example (after fixing the first two)

By adding calls to free here, as they do in https://github.com/lh3/minimap2/blob/ace990c381c647d6cf8fae7a4941a7b56fb67ae7/python/mappy.pyx#L212-L216, I now no longer see memory leaks!
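
The general shape of the fix can be sketched in Rust as follows: copy whatever is needed into Rust-owned values, then free each `regs[i].p`, the `regs` array itself, and the cs string buffer. This is a self-contained mock under stated assumptions, not the code in this PR: `Extra`, `Reg`, `fake_map`, `fake_gen_cs`, `map_and_free` and the `LIVE` counter are hypothetical, and `Box`/`CString` allocations stand in for memory that the real C library would hand back across the FFI boundary.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of live mock "C-side" allocations, used to check for leaks.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the per-hit data behind `regs[i].p`.
struct Extra {
    _cigar: Vec<u32>,
}

/// Hypothetical stand-in for one entry of the `regs` array.
struct Reg {
    p: *mut Extra,
}

/// Mock mapper: returns a heap array of `n` hits that the caller now owns.
fn fake_map(n: usize) -> *mut Reg {
    let regs: Vec<Reg> = (0..n)
        .map(|_| {
            LIVE.fetch_add(1, Ordering::SeqCst);
            Reg { p: Box::into_raw(Box::new(Extra { _cigar: vec![0; 8] })) }
        })
        .collect();
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(regs.into_boxed_slice()) as *mut Reg
}

/// Mock cs generator: returns a heap C string that the caller owns.
fn fake_gen_cs() -> *mut c_char {
    LIVE.fetch_add(1, Ordering::SeqCst);
    CString::new(":10*at:5").unwrap().into_raw()
}

/// Maps once, copies the results out, and frees every mock C-side piece.
fn map_and_free(n: usize) -> (usize, String) {
    let regs = fake_map(n);
    let cs_ptr = fake_gen_cs();
    unsafe {
        // Copy the cs string into a Rust String before freeing its buffer.
        let cs = CStr::from_ptr(cs_ptr).to_string_lossy().into_owned();
        let mut hits = 0;
        for i in 0..n {
            let reg = &mut *regs.add(i);
            if !reg.p.is_null() {
                drop(Box::from_raw(reg.p)); // free regs[i].p
                reg.p = ptr::null_mut();
                LIVE.fetch_sub(1, Ordering::SeqCst);
                hits += 1;
            }
        }
        // Free the regs array itself.
        drop(Box::from_raw(ptr::slice_from_raw_parts_mut(regs, n)));
        LIVE.fetch_sub(1, Ordering::SeqCst);
        // Free the cs buffer last, after its contents were copied.
        drop(CString::from_raw(cs_ptr));
        LIVE.fetch_sub(1, Ordering::SeqCst);
        (hits, cs)
    }
}
```

Calling `map_and_free` in a loop and checking that `LIVE` returns to zero mirrors, in miniature, the repeated-mapping test used to spot the leak. With the real library the frees would go through the C allocator rather than `Box` and `CString`.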
I'll include a bit more detail after tidying up `mappy-rs` (I no longer pass my own CI 😭), and will make any changes to this if it fails yours.

Fake minimap2
I changed this script a fair bit to test this without any messing around in `PyO3`. I think it's all fairly well commented, but if it has broken anything I'm happy to drop these changes; they were purely for testing purposes.

NB I have only tested these changes on `Linux 5.19.0-45-generic #46~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jun 7 15:06:04 UTC 20 x86_64 x86_64 x86_64 GNU/Linux`. I have no idea how/if this will run on `aarch64`, or how this will play with `simd` etc.