[ofiwg] feature requests

Faraj, Daniel daniel.faraj at hpe.com
Wed May 31 09:45:11 PDT 2017


OFI group,

We work with HPC systems that deploy multiple network adapters of the same type (including Intel Omni-Path and MLX InfiniBand adapters) on each compute node.
Over time, we have encountered two issues which we believe can be addressed by the OFI library.

First, a number of MPI implementations assume a homogeneous SW/HW setup on all compute nodes.  For example, consider nodes with 2 adapters and 2 separate networks.  Some MPI implementations assume that network adapter A resides on CPU socket 0 on all nodes and connects to network 0, and that network adapter B resides on CPU socket 1 and connects to network 1.  Unfortunately, that is not always the case.  There are systems where some nodes use adapter A to connect to network 0 and others use adapter B to connect to network 0.  The same holds for network 1, so we end up with mixed (crossed) adapters connected to the same network.  In such cases, MPI and the lower layers cannot establish peer-to-peer connections.  The best way to solve this is to use the network subnet ID to establish connections between pairs.  When there are multiple networks and subnet IDs, mpirun would specify a network ID (Platform MPI does this), and the software can then figure out from the subnet ID which adapter each node uses to connect to that network.  Instead of implementing this logic in each MPI, it would be great if OFI implemented it, since OFI is a one-stop shop over all network devices and providers.
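
To make the subnet-based selection concrete, here is a minimal Python sketch of the idea.  It is not OFI or MPI code: the per-node inventory format, the subnet labels, and the function names (adapter_for_subnet, pair_endpoints) are all hypothetical, and a real implementation would have to discover the subnet of each adapter from the provider.

```python
from typing import Dict, Optional

# Hypothetical per-node inventory: adapter name -> subnet ID it is cabled to.
NodeAdapters = Dict[str, str]


def adapter_for_subnet(adapters: NodeAdapters, subnet_id: str) -> Optional[str]:
    """Return the local adapter attached to subnet_id, or None if there is none."""
    for name, subnet in sorted(adapters.items()):
        if subnet == subnet_id:
            return name
    return None


def pair_endpoints(nodes: Dict[str, NodeAdapters], subnet_id: str) -> Dict[str, str]:
    """Pick, for every node, the adapter that reaches subnet_id."""
    chosen = {}
    for node, adapters in nodes.items():
        adapter = adapter_for_subnet(adapters, subnet_id)
        if adapter is None:
            raise LookupError(f"{node} has no adapter on subnet {subnet_id}")
        chosen[node] = adapter
    return chosen


# Crossed cabling: node1 reaches network 0 through adapter A,
# node2 reaches the same network through adapter B.
nodes = {
    "node1": {"A": "subnet0", "B": "subnet1"},
    "node2": {"A": "subnet1", "B": "subnet0"},
}
selection = pair_endpoints(nodes, "subnet0")
print(selection)  # {'node1': 'A', 'node2': 'B'}
```

The point of the sketch is that the selection key is the subnet, not the adapter name or socket position, so the crossed case resolves without any per-node configuration.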


Second, multirail support is hit or miss across MPI implementations.  The Intel Omni-Path PSM2 library actually did a great job here by implementing multirail support at the PSM2 level.  This means all layers above it, such as MPI, get this functionality for free.  Again, given that many MPI implementations can be built on top of OFI, it would also be great if OFI had multirail support.
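
As a rough illustration of what multirail support below the MPI layer could do, the Python sketch below stripes one message across several rails in round-robin order and reassembles it on the other side.  This is an assumption made for illustration only; it is not how PSM2 implements multirail, and the rail names, chunk size, and function names (stripe, reassemble) are hypothetical.

```python
from typing import Dict, List, Tuple

# A chunk is (offset into the original message, payload bytes).
Chunk = Tuple[int, bytes]


def stripe(message: bytes, rails: List[str], chunk_size: int) -> Dict[str, List[Chunk]]:
    """Assign fixed-size chunks of message to rails in round-robin order."""
    if not rails:
        raise ValueError("at least one rail is required")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    plan: Dict[str, List[Chunk]] = {rail: [] for rail in rails}
    for index, offset in enumerate(range(0, len(message), chunk_size)):
        rail = rails[index % len(rails)]
        plan[rail].append((offset, message[offset:offset + chunk_size]))
    return plan


def reassemble(plan: Dict[str, List[Chunk]]) -> bytes:
    """Rebuild the message from the chunks received on all rails."""
    chunks = sorted(chunk for per_rail in plan.values() for chunk in per_rail)
    return b"".join(data for _, data in chunks)


message = bytes(range(10))
plan = stripe(message, ["rail0", "rail1"], chunk_size=4)
restored = reassemble(plan)
```

If this kind of logic lived in OFI, the upper layers would only see a single send of the whole message, which is the "for free" property mentioned above.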

Thank you


--
Daniel Faraj
HPE Performance Engineering
651.683.7605 Office
daniel.faraj at hpe.com
