[ofiwg] feature requests

Hefty, Sean sean.hefty at intel.com
Fri Jun 2 12:21:37 PDT 2017


Copying libfabric-users mailing list on this message.

Daniel, would you be able to join an ofiwg call to discuss these in more detail?  The calls are every other Tuesday from 9-10 PST, with the next call on Tuesday the 6th.

- Sean

> We work with HPC systems that deploy multiple network adapters of the
> same type (including Intel Omni-Path and Mellanox InfiniBand
> adapters) on each compute node.
> 
> Over time, we have encountered two issues which we believe could be
> addressed by the OFI library.
> 
> First, a number of MPI implementations assume a homogeneous SW/HW
> setup on all compute nodes.  For example, consider nodes with 2
> adapters and 2 separate networks.  Some MPI implementations assume
> that network adapter A resides on CPU socket 0 on all nodes and
> connects to network 0, and that network adapter B resides on CPU
> socket 1 and connects to network 1.  Unfortunately, that is not
> always the case.  There are systems where some nodes use adapter A
> to connect to network 0 and others use adapter B to connect to
> network 0, and likewise for network 1, so adapters end up mixed
> (crossed) across the same network.  In such cases, MPI and the lower
> layers cannot establish peer-to-peer connections.  The best way to
> solve this is to use the network subnet ID to establish connections
> between pairs.  When there are multiple networks and subnet IDs,
> mpirun would specify a network ID (Platform MPI does this), and the
> software could then figure out from the subnet ID which adapter each
> node is using to reach that network.  Instead of implementing this
> logic in each MPI, it would be great if OFI implemented it, since
> OFI is a one-stop shop over all network devices and providers (see
> the first sketch after the quoted message).
> 
> Second, multirail support is hit or miss across MPI implementations.
> The Intel Omni-Path PSM2 library actually did a great job here by
> implementing multirail support at the PSM2 level, which means all
> layers above it, such as MPI, get this functionality for free.
> Again, given that many MPI implementations can be built on top of
> OFI, it would also be great if OFI had multirail support (see the
> second sketch after the quoted message).
> 
> Thank you
> Daniel Faraj
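
For the first request, here is a minimal sketch of what subnet-aware
adapter selection could look like on top of the existing fi_getinfo()
call.  It walks the fi_info list and picks the first local adapter
whose reported fabric name matches the requested network.  The example
fabric name "IB-0xfe80000000000000" and the mapping from subnet ID to
fabric name are assumptions here; how a provider encodes the subnet in
the fabric name is provider specific.

/* Sketch: choose the local adapter attached to a requested fabric by
 * matching the fabric name reported by fi_getinfo(), instead of
 * assuming a fixed adapter-to-network mapping on every node. */
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>

static struct fi_info *select_by_fabric(struct fi_info *list,
                                        const char *wanted)
{
    for (struct fi_info *cur = list; cur; cur = cur->next) {
        if (cur->fabric_attr && cur->fabric_attr->name &&
            !strcmp(cur->fabric_attr->name, wanted))
            return cur;    /* first adapter on that fabric */
    }
    return NULL;
}

int main(int argc, char **argv)
{
    /* Hypothetical default; real fabric names are provider specific. */
    const char *wanted = argc > 1 ? argv[1] : "IB-0xfe80000000000000";
    struct fi_info *hints = fi_allocinfo();
    struct fi_info *info = NULL;

    hints->ep_attr->type = FI_EP_RDM;
    hints->caps = FI_MSG;

    int ret = fi_getinfo(FI_VERSION(1, 4), NULL, NULL, 0, hints, &info);
    if (ret) {
        fprintf(stderr, "fi_getinfo: %s\n", fi_strerror(-ret));
        fi_freeinfo(hints);
        return 1;
    }

    struct fi_info *match = select_by_fabric(info, wanted);
    if (match)
        printf("use provider %s, domain %s for fabric %s\n",
               match->fabric_attr->prov_name,
               match->domain_attr->name,
               match->fabric_attr->name);
    else
        printf("no local adapter reports fabric %s\n", wanted);

    fi_freeinfo(info);
    fi_freeinfo(hints);
    return 0;
}

The point is that the matching key comes from the fabric itself rather
than from a per-node adapter index, so crossed cabling on different
nodes still resolves to the same network.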
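
For the second request, the fragment below only illustrates the idea of
striping one send across two rails.  It assumes the two endpoints and
the per-rail peer addresses have already been set up, and it ignores
completion handling and rail selection policy.  None of this is an
existing OFI interface; it is just the kind of loop a multirail layer
inside the library could hide from MPI.

/* Sketch only: round-robin striping of one buffer across two
 * already-configured endpoints ("rails").  Endpoint creation, address
 * resolution, and completion processing are omitted. */
#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>
#include <rdma/fi_errno.h>

#define NUM_RAILS 2
#define STRIPE    (1 << 20)    /* 1 MiB per chunk, arbitrary choice */

static int multirail_send(struct fid_ep *ep[NUM_RAILS],
                          fi_addr_t dest[NUM_RAILS],
                          const char *buf, size_t len)
{
    size_t off = 0;
    int rail = 0;

    while (off < len) {
        size_t chunk = len - off < STRIPE ? len - off : STRIPE;
        ssize_t ret = fi_send(ep[rail], buf + off, chunk, NULL,
                              dest[rail], NULL);
        if (ret == -FI_EAGAIN)
            continue;           /* send queue full: retry this chunk */
        if (ret)
            return (int) ret;
        off += chunk;
        rail = (rail + 1) % NUM_RAILS;    /* alternate adapters */
    }
    return 0;
}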

