[ofiwg] feature requests
Hefty, Sean
sean.hefty at intel.com
Fri Jun 2 12:21:37 PDT 2017
Copying the libfabric-users mailing list on this message.
Daniel, would you be able to join an ofiwg call to discuss these in more detail? The calls are every other Tuesday from 9-10 PST, with the next call on Tuesday the 6th.
- Sean
> We work with HPC systems that deploy multiple network adapters of the
> same kind (including Intel Omni-Path and Mellanox InfiniBand adapters)
> on compute nodes.
>
> Over time, we encountered two issues which we believe can be addressed
> by the OFI library.
>
> First, a number of MPI implementations assume a homogeneous SW/HW setup
> on all compute nodes. For example, consider nodes with 2 adapters and 2
> separate networks. Some MPI implementations assume that on every node
> network adapter A resides on CPU socket 0 and connects to network 0,
> while network adapter B resides on CPU socket 1 and connects to
> network 1. Unfortunately, that is not always the case. There are
> systems where some nodes use adapter A to connect to network 0 and
> others use adapter B to connect to network 0. The same happens on
> network 1, where mixed (crossed) adapters connect to the same network.
> In such cases, MPI and the lower layers cannot establish a
> peer-to-peer connection. The best way to solve this is to use the
> network subnet ID to establish connections between pairs. When there
> are multiple networks and subnet IDs, mpirun would specify a network
> ID (Platform MPI does this), and the software could then figure out
> from the subnet ID which adapter each node uses to connect to that
> network. Instead of implementing this logic in each MPI, it would be
> great if OFI implemented it, since OFI is a one-stop shop over all
> network devices and providers.
>
> Second, multirail support is hit-or-miss across MPI
> implementations. The Intel Omni-Path PSM2 library did a great job
> here by implementing multirail support at the PSM2 level, which means
> all layers above it, such as MPI, get this functionality for free.
> Given that many MPI implementations can be built on top of OFI, it
> would also be great if OFI had multirail support.
>
> Thank you
> Daniel Faraj
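
For reference, here is a minimal sketch of the adapter-selection logic
described in the first request, written against the existing libfabric
discovery API. It assumes a provider whose reported fabric name encodes the
network it is attached to (the verbs provider, for example, derives fabric
names from the IB subnet prefix), so matching on the fabric name effectively
selects an adapter by subnet rather than by a fixed device name. The
select_by_fabric() helper and the wanted_fabric argument are hypothetical
stand-ins for whatever network ID mpirun would pass down; they are not part
of the libfabric API.

/* Sketch: pick the local fi_info whose fabric matches the requested network. */
#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>

static struct fi_info *select_by_fabric(const char *wanted_fabric)
{
	struct fi_info *hints, *info, *cur, *match = NULL;
	int ret;

	hints = fi_allocinfo();
	if (!hints)
		return NULL;
	hints->ep_attr->type = FI_EP_RDM;   /* reliable datagram endpoint */
	hints->caps = FI_MSG;

	ret = fi_getinfo(FI_VERSION(1, 4), NULL, NULL, 0, hints, &info);
	fi_freeinfo(hints);
	if (ret)
		return NULL;

	for (cur = info; cur; cur = cur->next) {
		/* fabric name identifies the network, domain name the local adapter */
		printf("fabric %s via domain %s (provider %s)\n",
		       cur->fabric_attr->name, cur->domain_attr->name,
		       cur->fabric_attr->prov_name);
		if (!match && !strcmp(cur->fabric_attr->name, wanted_fabric))
			match = fi_dupinfo(cur);
	}
	fi_freeinfo(info);
	return match;   /* caller passes this to fi_fabric()/fi_domain()/fi_endpoint() */
}

A runtime could run this on every node and pick whichever local adapter is
attached to the requested network, regardless of which CPU socket or device
name that adapter happens to have on a given node.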
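
On the second request, below is a rough sketch of the per-rail striping that
each MPI or middleware currently has to reimplement when multirail is handled
above the fabric library; pushing this kind of logic into OFI itself would,
as with PSM2, give it to every consumer for free. The sketch assumes the
caller has already opened one endpoint per rail through the usual
fi_fabric()/fi_domain()/fi_endpoint() setup and resolved a peer address on
each rail; struct rail, multirail_send(), and the 16 KB striping threshold
are hypothetical names and values, not part of the libfabric API.

/* Sketch: round-robin large sends across rails; small sends stay on rail 0. */
#include <stddef.h>
#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>

struct rail {
	struct fid_ep *ep;   /* endpoint bound to one adapter/port */
	fi_addr_t peer;      /* peer address as resolved on that rail */
};

static ssize_t multirail_send(struct rail *rails, size_t nrails,
			      const void *buf, size_t len, void *context)
{
	static size_t next;
	size_t idx = (len >= 16384 && nrails > 1) ? (next++ % nrails) : 0;

	/* NULL desc assumes the provider does not require local memory
	 * registration for send buffers (no FI_MR_LOCAL). */
	return fi_send(rails[idx].ep, buf, len, NULL,
		       rails[idx].peer, context);
}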