WeeklyTelcon_20180122
Geoffrey Paulsen edited this page Jan 15, 2019 · 1 revision
      
    - Dialup Info: (Do not post to public mailing list or public wiki)
 
- Geoff Paulsen
 - Jeff Squyres
 - akvenkatesh
 - Artem
 - Brian
 - Edgar Gabriel
 - Geoffroy Vallee
 - Howard
 - Josh Ladd
 - Josh Hursey
 - Matthew Dosanjh
 - Mohan
 - Todd Kordenbrock
 - Nathan
 
- News: Ralph will no longer be able to work on Open MPI. He will continue to work on PMIx, but not on the Open MPI PMIx merge.
 - Mellanox will step up and help with PMIx and ORTE integration issues.
 - IBM can help with bug fixing, but cannot own ORTE.
 - Need a v3.1 release engineer to help. Brian will send email to devel-core.
 - Ralph offered to have a brain dump day. Email Brian if interested.
 - The MPI Forum is in Portland in a little over a month.
 - Face2Face -
- Brian will email to see about co-locating Open MPI with PMIx with ORTE.
 - if it's not an issue, then resolve next week.
 
 
Review All Open Blockers
Review v2.x Milestones v2.1.3
- No chance to look at.
 - Pretty quiet, ready to go
 
Review v3.0.x Milestones v3.0.1
- Schedule:  RC2 is actively building now.  [50%]
- On 3.x series trying to cut RCs on nightly tarballs.
 - Didn't get RC last week
 - Will get RC today.
 
 - Blocker on v3.1.x
- PR4516
 - May not be a blocker.
 
 - Target v3.0.x in PR4715
- Review required.
 
 - Will Pull in PR4716
- Issue 4563
- Not seeing this on the little ARM boxes here; Jenkins uses --disable-builtin-atomics.
 
 
 - Comm Spawn - Documentation PR ready or pulled
 - Issue 4509
- We believe this is closed. Asked Nathan to close.
 
 - Issue - hwloc can't handle CUDA from a different location
- On master, specifically disabling hwloc CUDA.
 - External component does NOT disable build, since
 
 - 4677 - hwloc2 WIP. Can't get to it until the weekend.
 
Review v3.1.x Milestones v3.1.0
- SCHEDULE:
- RC2 Early next week.
- Would like https://github.com/open-mpi/ompi/issue/4605 in there.
 
 
 - BLOCKER:
- OSC monitoring fix (doesn't build with Portals 4)
- PR4523
 - waiting review.
 
 - PMIx 2.1 PR4605
- PR4746
 - Ralph - there is a cleanup issue with PMIx 2.1, but we have cleanup issues today.
 - Mellanox will help work on this.
 
 - UCX one sided violating PR4688
 - Issue 4303
- Probably just need to build a patch.
 
 
 
Review Master Master Pull Requests
- Issue 4686
- Jeff tried to reproduce and failed.
 - Thought HCOLL was the issue; Artem took it out and put it back.
 - Something is going on in there. Possibly atomics related.
 - Might need Nathan's attention.
 - Someone could try reverting the one change to atomics to see if that caused it.
 - Mellanox will try to reproduce after reverting the atomics change. Timing issue.
 
 - Dynamic operations, a TON of segfaults. All in opal_progress, during ompi_sync_wait multi-credit.
- Something is wrong with atomics. Intercomm_create or Spawn.
 - Cisco is tickling this the most, and will look at it.
 - Delayed.
 
 - PR4697 got resolved and merged to master.
- Opal progress change looks good for most interconnects.
 - TCP performance regression was resolved and merged to master.
 - Going to PR this into v3.1.x.
 - George is unhappy with this.
 - Don't have any non-OS wrappers for TLS.
 - Master now checks for C11. Can we make it the default?
 - Mac Sierra may or may not work, even with _Thread_local.
 - Would be nice if we could require C11 for v4.0.
 - Regex creation.
- PR4710
 - Someone created a test and put it in make-check rather than MTT.
 - Then made the component static so that make install isn't needed.
 - Don't think we should be adding tests to make-check.
 - Question - Is there a regex library we could use? Regex is hard.
 - This is working pretty well, but did add a framework to allow for future components.
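
On the TLS point raised under PR4697 (no non-OS wrappers for TLS, and whether C11 can be required), a minimal C sketch of choosing a thread-local keyword at compile time might look like the following. The names `SKETCH_TLS`, `sketch_counter`, `sketch_worker`, `sketch_run_worker`, and `sketch_main_thread_counter` are hypothetical and are not Open MPI symbols. The fallback to the GCC-style `__thread` keyword is an assumption about one possible approach, not a description of what PR4697 does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical macro: prefer the C11 keyword, fall back to the GNU extension. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define SKETCH_TLS _Thread_local
#elif defined(__GNUC__)
#  define SKETCH_TLS __thread
#else
#  error "no compiler-level TLS keyword; an OS wrapper (e.g. pthread keys) would be needed"
#endif

/* Each thread gets its own copy, initialized to 0. */
static SKETCH_TLS int sketch_counter = 0;

/* Worker: bump this thread's private counter n times and report it back. */
static void *sketch_worker(void *arg)
{
    int *io = (int *)arg;
    int n = *io;
    for (int i = 0; i < n; i++) {
        sketch_counter++;
    }
    *io = sketch_counter;
    return NULL;
}

/* Run one worker thread and return the counter value that thread observed. */
int sketch_run_worker(int increments)
{
    pthread_t tid;
    int value = increments;
    int rc = pthread_create(&tid, NULL, sketch_worker, &value);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return value;
}

/* Read the calling thread's private copy of the counter. */
int sketch_main_thread_counter(void)
{
    return sketch_counter;
}
```

If the calling thread sets its own `sketch_counter` before running a worker, the worker still starts from zero and the caller's value is left untouched, which is the property a TLS wrapper would need to preserve. Depending on the platform, this may need to be built with `-pthread`.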
 
 
- Change behavior of opal_check_package
- Brian will send email to devel
 - Make it more explicit when it finds issues
 - Issue 4423
 
 - When your PR has been accepted into a release branch, please go to the issue, and remove the target of the release branch that it was just merged into. Attempting to automate this in the future.
 
- New Topic - We currently can't write unit tests against components.
- Some way to say "this unit test is against this component".
 - Intel went through and did this internally for orte.  Already hosted in public domain.
- Ralph will send link to Brian to take a look.
 
 
 - Python Client can't report back to database.
- https://github.com/open-mpi/mtt/issues/614
 - Josh Hursey will look at.
 
 
Review Master MTT testing
- Probably looking at March or early April
- San Jose or Dallas
- Geoff will send out two Doodles for date and time.
 
 
 
- Discuss abandoning openib btl.
- LNLL - is no longer paying anyone to maintain openib btl.
- Nathan has a UCX BTL
 
 - ETA on GPU in UCX - basic minus CUDA IPC in test now.
 - Any warning message if on iWarp
 - What's the roadmap for this? 3.x or 4.x?
 
 
- Pushed date to late Feb or March.
 
- Mellanox, Sandia, Intel
 - LANL, Houston, IBM, Fujitsu
 - Amazon,
 - Cisco, ORNL, UTK, NVIDIA