-
Notifications
You must be signed in to change notification settings - Fork 677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parallelizes MDAnalysis.analysis.msd
#4896
base: develop
Are you sure you want to change the base?
Conversation
Hello @tanishy7777! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2025-01-20 21:03:12 UTC |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #4896 +/- ##
===========================================
- Coverage 93.65% 93.63% -0.02%
===========================================
Files 177 189 +12
Lines 21795 22876 +1081
Branches 3067 3067
===========================================
+ Hits 20413 21421 +1008
- Misses 931 1004 +73
Partials 451 451 ☔ View full report in Codecov by Sentry. |
Thanks for your work. I'm currently quite busy, so might not be able to review in the next few days. Please be patient. |
@talagayev / @marinegor can you have a look at this PR, please? |
Checked the code and ran locally, looks all good. Here @tanishy7777 you could also add the From my side it looks good, good job @tanishy7777 :) |
Thanks a lot for reviewing my PR, will wait for the suggestions as you mentioned. |
Also could you please review this PR #4884 its pretty similar or tell me if it needs any more work to be done. Thanks again |
Hey @tanishy7777, yes the PR is similar, I can take a look at it as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking here, I just need to check the implementation IIRC there is a reason MSD algo itself is non-parallelisable, but may not apply if only the collection of particle positions is parallelised.
Like the tests were passing so I thought it has been parallized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @tanishy7777, sorry for long review -- many life things got in the way.
I think you're on a good path, I mentioned few minor things in the comments.
Main action items:
- move out hacky
@staticmethod def f(arrays): pass
out of the class - using datafiles in MDAnalysisTests (imported on top of
test_msd.py
, check that parallelized run produces exactly the sameresults
as non-parallelized one (add code snippet to comments that anyone can run to check, and its results) - add this check as a test (if it's too slow, we can always mark it this way and not run by default)
Please ask if you have questions, and ping me here if I don't reply for more than 48 hours.
@staticmethod | ||
def f(arrs): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this outside of the class and explicitly name it something like __noop
to make sure no one uses that:
def __noop(arrs) -> None: pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are indeed concerns that the algorithm is indeed parallelizable, I would suggest you explicitly test for that: namely, ensure that all results.<something>
attributes are exactly the same, regardless of how you run your analysis. You can do this yourself first, and than later add it as one of the tests here.
I should also say that if you make a convenient way to test that, it'd be nice to have it added to other parallelization tests, to ensure additional correctness.
Fixes #4676
Changes made in this Pull Request:
split-apply-combine
technique to parallelize theMDAnalysis.analysis.msd.EinsteinMSD
testsuite/analysis/conftest.py
, analogous with existing onesclient_EinsteinMSD
, fixtures to all tests using intestsuite/MDAnalysisTests/analysis/test_msd.py
, and modified the wayrun()
method is called torun(**client_EinsteinMSD)
PR Checklist
Developers certificate of origin
📚 Documentation preview 📚: https://mdanalysis--4896.org.readthedocs.build/en/4896/