[libfabric-users] feature requests
Faraj, Daniel
daniel.faraj at hpe.com
Mon Jun 5 21:29:52 PDT 2017
If MPI or another middle layer is to implement multirail, why even bother with OFI? Implement directly on the device and avoid the extra OFI overhead.
My recollection of the OFI objective is a one-stop shop that upper layers utilize.
Forget about OFI for a minute. Consider PSM2: multirail is supported at the PSM2 level, which means every single MPI flavor has multirail support by default. That is what I call a real one-stop shop.
--
Daniel Faraj
HPE Performance Engineering
651.683.7605 Office
daniel.faraj at hpe.com
On 6/5/17, 12:26 PM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
> "Domains usually map to a specific local network interface adapter. A
> domain may either refer to the entire NIC, a port on a multi-port NIC,
> or a virtual device exposed by a NIC. From the viewpoint of the
> application, a domain identifies a set of resources that may be used
> together." (https://github.com/ofiwg/ofi-
> guide/blob/master/OFIGuide.md)
>
> From this, MPI libraries and the like would then need to support
> multiple domains.
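
[A minimal sketch of what "supporting multiple domains" could look like against the libfabric API, assuming the provider reports one fi_info entry per NIC, port, or virtual device; MAX_RAILS and the requested API version are illustrative, and error handling is abbreviated:

#include <stdio.h>
#include <string.h>
#include <rdma/fabric.h>
#include <rdma/fi_domain.h>

#define MAX_RAILS 8  /* illustrative cap on the number of rails */

int main(void)
{
    struct fi_info *hints, *info, *cur;
    struct fid_fabric *fabrics[MAX_RAILS];
    struct fid_domain *domains[MAX_RAILS];
    const char *names[MAX_RAILS];
    int nrails = 0, ret, i;

    hints = fi_allocinfo();
    if (!hints)
        return 1;
    hints->caps = FI_MSG;              /* basic two-sided messaging */
    hints->ep_attr->type = FI_EP_RDM;  /* reliable datagram endpoints */

    ret = fi_getinfo(FI_VERSION(1, 4), NULL, NULL, 0, hints, &info);
    if (ret) {
        fprintf(stderr, "fi_getinfo: %s\n", fi_strerror(-ret));
        return 1;
    }

    /* Each fi_info entry may refer to a different domain: a whole
     * NIC, one port of a multi-port NIC, or a virtual device. */
    for (cur = info; cur && nrails < MAX_RAILS; cur = cur->next) {
        int dup = 0;
        for (i = 0; i < nrails; i++)
            if (!strcmp(names[i], cur->domain_attr->name))
                dup = 1;
        if (dup)
            continue;  /* already opened this domain */

        if (fi_fabric(cur->fabric_attr, &fabrics[nrails], NULL))
            continue;
        if (fi_domain(fabrics[nrails], cur, &domains[nrails], NULL)) {
            fi_close(&fabrics[nrails]->fid);
            continue;
        }
        names[nrails] = cur->domain_attr->name;
        printf("rail %d: provider %s, domain %s\n", nrails,
               cur->fabric_attr->prov_name, cur->domain_attr->name);
        nrails++;
    }

    /* An MPI library would now create one endpoint per domain and
     * stripe or fail over across them. */

    for (i = 0; i < nrails; i++) {
        fi_close(&domains[i]->fid);
        fi_close(&fabrics[i]->fid);
    }
    fi_freeinfo(info);
    fi_freeinfo(hints);
    return 0;
}
]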
The intent is that OFI could support multi-rail both above and below the interface, depending on which one the application wants. We didn't want to restrict what the provider could do, as long as it met the API definition.
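
[For the "above the interface" case, the sketch below shows the kind of striping an upper layer might do itself; struct rail_set, its fields, and multirail_send are hypothetical names, and the endpoint/CQ/AV setup for each rail is assumed to have been done elsewhere:

#include <rdma/fabric.h>
#include <rdma/fi_endpoint.h>

struct rail_set {
    struct fid_ep *eps[8];   /* one enabled endpoint per rail */
    fi_addr_t peers[8];      /* the peer's address on each rail */
    int nrails;
    int next;                /* round-robin cursor */
};

/* Post each message on the next rail in round-robin order.  A real
 * implementation would also track completions per rail, and might
 * split large messages across rails instead of alternating. */
static ssize_t multirail_send(struct rail_set *rs, const void *buf,
                              size_t len, void *context)
{
    int r = rs->next;

    rs->next = (rs->next + 1) % rs->nrails;
    return fi_send(rs->eps[r], buf, len, NULL /* no mem desc */,
                   rs->peers[r], context);
}
]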
I do have concerns that implementing multi-rail below OFI has the *potential* to negatively impact performance when multi-rail is not available or enabled, so care is needed to avoid this. It seems especially likely to be a problem in devices that support full transport offload.
- Sean