WeeklyTelcon_20210323
Geoffrey Paulsen edited this page Mar 24, 2021 · 1 revision
      
    - Dialup Info: (Do not post to public mailing list or public wiki)
 
- Austen Lauria (IBM)
 - Brendan Cunningham (Cornelis Networks)
 - Brian Barrett (AWS)
 - Edgar Gabriel (UH)
 - Geoffrey Paulsen (IBM)
 - Harumi Kuno (HPE)
 - Hessam Mirsadeghi (UCX/nVidia)
 - Howard Pritchard (LANL)
 - Jeff Squyres (Cisco)
 - Josh Hursey (IBM)
 - Michael Heinz (Cornelis Networks)
 - Naughton III, Thomas (ORNL)
 - Raghu Raja (AWS)
 - Ralph Castain (Intel)
 - Todd Kordenbrock (Sandia)
 - Tomislav Janjusic
 - William Zhang (AWS)
 - Marisa Roman (Cornelis Networks)
 - Matthew Dosanjh (Sandia)
 
- Akshay Venkatesh (NVIDIA)
 - Artem Polyakov (nVidia/Mellanox)
 - Aurelien Bouteiller (UTK)
 - Brandon Yates (Intel)
 - Charles Shereda (LLNL)
 - Christoph Niethammer (HLRS)
 - David Bernholdt (ORNL)
 - Erik Zeiske
 - Geoffroy Vallee (ARM)
 - George Bosilca (UTK)
 - Joseph Schuchart
 - Joshua Ladd (nVidia/Mellanox)
 - Mark Allen (IBM)
 - Matias Cabral (Intel)
 - Nathan Hjelm (Google)
 - Noah Evans (Sandia)
 - Scott Breyer (Sandia?)
 - Shintaro Iwasaki
 - Xin Zhao (nVidia/Mellanox)
 
- If you don't have zlib, this affects launching and memory consumption
- Tools will spit out a warning that you don't have compression
 - We need to write up something for Packagers as well.
 
 - Brian will document this (really should build with zlib) in a README-packagers.md
- Hope that packager will package these things externally.
 
 - NEWS bullets for zlib as well.
- Geoff will do this.
 
 - Please update your CI to run MTT on v5.0.x PRs, and on v5.0.x based PRs
 - Please cherry-pick your bugfix PRs to v5.0.x after your PR is accepted to master
 
- Doing formatting on master and v5.0.x seems reasonable
 - But reformatting v4.0.x and v4.1.x seems too risky.
 - clang-format instructions are in the format file.
 - He also ran clang-tidy, and we don't have directions for that yet.
 - Requires clang-format v10 or later (clang-format is versioned separately from the clang compiler)
- Nathan will try to make it compatible with older v8
 - Geoff will ping Nathan to request the v5.0.x version of the opal PR.
 
 - clang-format is separate from the compiler toolchain
 - Will we require developers to use this?
- Building from GitHub will not require it.
 - Will have a CI test that will check it.
 - Not in a path where every CI will have to have it installed.
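A CI check along the lines discussed above could be sketched as follows. This is only an illustration, not the project's actual CI job: the helper names `parse_major_version` and `check_format` are hypothetical, and it assumes `clang-format` is on `PATH` of the one CI runner that performs the check. It relies on the `--dry-run` and `--Werror` options of clang-format, which report formatting differences without rewriting files.

```python
import re
import shutil
import subprocess

MIN_MAJOR = 10  # minimum clang-format major version mentioned in the notes


def parse_major_version(version_output: str) -> int:
    """Extract the major version from `clang-format --version` output."""
    match = re.search(r"version\s+(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_output!r}")
    return int(match.group(1))


def check_format(files, exe="clang-format"):
    """Return True if every file is already formatted; never modifies files."""
    if not files:
        return True  # nothing to check
    if shutil.which(exe) is None:
        raise RuntimeError(f"{exe} not found on PATH")
    version = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    ).stdout
    if parse_major_version(version) < MIN_MAJOR:
        raise RuntimeError(f"{exe} is older than v{MIN_MAJOR}")
    # --dry-run only reports needed changes; --Werror makes them a failure.
    result = subprocess.run([exe, "--dry-run", "--Werror", *files])
    return result.returncode == 0
```

A CI job would call `check_format` with the C sources touched by a PR and fail the job when it returns `False`, so other CI systems and ordinary builds would not need clang-format installed.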
 
 - Do we want to hold off on MORE before v5.0.0 ships? (or 6 months after?)
 - Should be rerun as a non-cherry-pick.  Might be easy to lose
- But the two branches are close.
 
 - Run it on master, try to PR to v5.0.x, and
 - Nathan can only run certain sections of the code-base with the systems he has.
- Strongly encourage everyone test their sections.
 - PSM2 - doesn't even build in our CI, so someone should build/test this.
 
 
- Needs a squash; a commit is missing its sign-off.
- Austen will ping Nathan.
 - Wanted in v5.0.x as well.
 
 
- This is working just fine at the moment, except for ROMIO.
- ROMIO is throwing tons of warnings. But okay.
 - Would need to fix it upstream.
 
 - PMIx/PRRTE is updated.
 - Perhaps now for 3rdParties, configure with --silence-obsolencense flag.
 - Does someone want to ping Rob about it?
- Jeff will
 
 
- Intercomm Merge tests are timing out.
- MTT master on HLS timeouts
 
 
- Failure in prrte on v5.0.x, will be resolved in tonight's.
 - https://github.com/open-mpi/ompi/issues/8566
 - Using an actual 32bit gcc - Compile fail
 - Nathan thinks he might be able to write a compare-and-swap
 - v5.0 - good time to drop 32bit.
- Jeff will send note to packaging, and see if they will care.
 - Debian is okay, they will just use MPICH
 - OSC/RDMA assumed everything was 64bit, but once we changed
 
 - On 32bit, if we could use C11 atomics with locks, it might be allowed.
- So perhaps this would be a path.
 - Is C11 available on older 32bit systems?
 - gcc 6.0+ it should work fine.
 
 - Nobody has a strong opinion.
- Pride issue, but it's also time and money
 - Right now the only thing breaking it is Nathan's one-sided code.
 - Let's ask Nathan what he thinks, and if he has time to fix it.
 
 
- Shoot for a next RC of v4.0.6 on March 31st
 - blocking on UCX issues (see New topics above)
- George will get to it soon.
 
 - Too many Open Issues (50)
- Geoff and Howard will go over v4.0.x issues, and try to close or address many of them.
- May need to label some as wont_fix, and then close
 
 
 - Geoff and Howard will go over v4.0.x issues, and try to close or address many of them.
 - Check status of ROMIO from MPICH vs in v4.1 vs v4.0.x
 
- Same boat, waiting for George's datatype fix.
 - A new v4.1 RC was built last week
 - Most of ROMIO fixes have gone into MPICH
- 8371 - might be close
 
 - Intercomm Merge issue
- may have gone away after PRRTE update on master
 - Investigating
 
 - blocking on UCX issues (see New topics above)
- George will get to it soon.
 
 
- What do we do with the mpirun Manpage?
- Didn't want OMPI requiring Sphinx, but if PRRTE and PMIx in same tar
 
 - Ralph almost has singleton comm spawn working
- Single node without the mpirun process
 
 - Static MCA components default still on track for v5.0.x
 
- ECP Community days ( March 30-April 1st )
- Need SLIDES by close of business FRIDAY (not Saturday)
 - Each day 90 minute time slots.
 - Tuesday March 30th from 1-2:30pm (US Eastern)
- LIVE
 - Invited some people to speak. They will be our main community speakers.
 - Anyone on OMPI community can send slides to Jeff and George
 - Due Friday March 26th
 
 - PMIx Wed 31st 11 - 12:30 (US Eastern)
 - Need to ensure no more MPIR, SLURM PMI1/2,
 
 
- PR 8329 - convert README, HACKING, and possibly Manpages to reStructuredText.
- Uses https://www.sphinx-doc.org/en/master/ (Python tool, can pip install)
 - The intent is for this to land in v5.0
- mpirun / prrterun - we had quite a bit of details in orte, but are updating as much as possible.
 
 - Ralph has asked about this for PMIx/PRRTE since this is turning out to work
 
 - No update - 3/16
- Could be independent of PMIx and PRRTE.
 - PMIx and PRRTE want to follow suit, and not require both pandoc and Sphinx.
 
 
- OLD
 - What do we want to do about ROMIO in general.
- OMPIO is the default everywhere.
 - Gilles is saying the changes we made are integration changes.
- There have been some OMPI specific changes put into ROMIO, meaning upstream maintainers refuse to help us with it.
 - We may be able to work with upstream to make a clear API between the two.
 
 - As a 3rd party package, should we move it up to the 3rd party packaging area, to be clear that we shouldn't make changes to this area?
 
 - Need to look at this treematch thing. Upstream package that is now inside of Open-MPI.
 - Might want a CI bot to watch a set of files, and flag PRs that violate principles like this.
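The file-watching idea above could be sketched roughly as below. This is a minimal illustration under stated assumptions: the prefixes in `PROTECTED_PREFIXES` are hypothetical placeholders (the real set of vendored directories would have to be agreed on by the community), and `flag_protected` is an invented helper name, not an existing bot.

```python
# Hypothetical examples of vendored directories that should not be
# modified locally; the actual list is an assumption for illustration.
PROTECTED_PREFIXES = (
    "3rd-party/",
    "ompi/mca/topo/treematch/treematch/",
)


def flag_protected(changed_files, protected=PROTECTED_PREFIXES):
    """Return the changed paths that fall under a protected prefix."""
    prefixes = tuple(protected)
    return [path for path in changed_files if path.startswith(prefixes)]


# Example: a PR touching one vendored file and one regular file.
flagged = flag_protected(
    ["3rd-party/example/file.c", "ompi/runtime/ompi_mpi_init.c"]
)
print(flagged)
```

The list of changed files for a PR could come from `git diff --name-only <base>...<head>`; a bot would then comment on or label the PR whenever the returned list is non-empty.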
 
What is the state of https://github.com/open-mpi/ompi-tests-public/?
- Putting new tests there
 - ULFM have some tests added there.
 - Need folks to add to MTT
 - Should have some new Sessions tests