[ofiwg] utility component/providers

Wed Feb 22 11:16:22 PST 2017

> From the perspective of a provider writer, I like the last option
> where the core providers can pick and choose utility components.  I'm
> not sure what granularity you were thinking of, but it would be nice
> for it to be coarser than the way utility providers are today.  For
> example, on Cray XC systems the native atomic operations are only for
> 32- and 64-bit operands, so it'd be nice to pick up a "utility
> version" for the other atomic data types.

The utility code is separate from an actual provider.  There are essentially 'base class' implementations for the various OFI constructs (CQs, EQs, EPs, etc.), which in turn are implemented using primitive constructs and macros (list, rbtree, queue, ...).  Unfortunately, the base classes and primitives were developed after many of the existing providers were written, but they are provider independent.
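To make the layering concrete, here is a rough sketch of a primitive (a ring queue), a 'base class' CQ built on it, and a provider CQ that embeds the base.  All of the names (ring, base_cq, my_prov_cq) are made up for illustration and are not the actual utility code; treat it as a picture of the idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Primitive: fixed-size ring of completion contexts (hypothetical). */
#define RING_SIZE 8 /* power of two, so the index math survives wraparound */

struct ring {
    void *slot[RING_SIZE];
    size_t head, tail;
};

static int ring_push(struct ring *q, void *item)
{
    if (q->tail - q->head == RING_SIZE)
        return -1; /* full */
    q->slot[q->tail++ % RING_SIZE] = item;
    return 0;
}

static void *ring_pop(struct ring *q)
{
    return q->head == q->tail ? NULL : q->slot[q->head++ % RING_SIZE];
}

/* 'Base class' CQ: provider-independent bookkeeping over the primitive. */
struct base_cq {
    struct ring ring;
    size_t overruns; /* completions dropped because the ring was full */
};

static void base_cq_init(struct base_cq *cq)
{
    assert(cq);
    cq->ring.head = cq->ring.tail = 0;
    cq->overruns = 0;
}

static void base_cq_write(struct base_cq *cq, void *context)
{
    if (ring_push(&cq->ring, context))
        cq->overruns++;
}

static void *base_cq_read(struct base_cq *cq)
{
    return ring_pop(&cq->ring);
}

/* A provider 'derives' by embedding the base as its first member. */
struct my_prov_cq {
    struct base_cq base;
    int hw_queue_id; /* provider-specific state */
};
```

A provider would call base_cq_write() from its progress path and hand base_cq_read() results back to the app, keeping only the hardware-specific pieces in its own struct.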

The most important thing that the utility _providers_ add to this is a protocol implementation to support one endpoint type over another.  The protocol implementation is not easily separated out from the APIs being accessed.  In the case of the utility providers, those APIs are the OFI interfaces themselves, which allows them to be implementation agnostic.  Attempting to separate one feature of a protocol (e.g. atomics) from the others (e.g. RMA and messaging) appears difficult, if not impossible.  I think the best you can do is define helper routines that multiple providers could call, and hope that those routines are protocol independent.
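As a rough sketch of what such a helper routine might look like for the atomics example: a provider takes its native path for 32- and 64-bit operands and calls a shared software helper for everything else.  The names here (op_type, util_sum, hw_supports, prov_atomic_sum) are hypothetical and are not libfabric APIs.  The helper only applies the operation to local memory; how the operands get there, and how access is serialized, is exactly the protocol part that stays with the provider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand types; not the libfabric datatype enum. */
enum op_type { OP_INT8, OP_INT16, OP_INT32, OP_INT64 };

/*
 * Hypothetical shared helper: element-wise sum into dst.
 * It is protocol independent because it only touches local memory.
 * The caller is assumed to serialize access (e.g. with a lock).
 */
static void util_sum(void *dst, const void *src, enum op_type t, size_t count)
{
    size_t i;

    assert(dst && src);
    for (i = 0; i < count; i++) {
        switch (t) {
        case OP_INT8:  ((int8_t *)dst)[i]  += ((const int8_t *)src)[i];  break;
        case OP_INT16: ((int16_t *)dst)[i] += ((const int16_t *)src)[i]; break;
        case OP_INT32: ((int32_t *)dst)[i] += ((const int32_t *)src)[i]; break;
        case OP_INT64: ((int64_t *)dst)[i] += ((const int64_t *)src)[i]; break;
        }
    }
}

/* Stand-in for a hardware capability check (32- and 64-bit only). */
static int hw_supports(enum op_type t)
{
    return t == OP_INT32 || t == OP_INT64;
}

/* Returns 0 if the native path was taken, 1 if the utility helper was. */
static int prov_atomic_sum(void *dst, const void *src, enum op_type t,
                           size_t count)
{
    if (hw_supports(t)) {
        /* A real provider would post to hardware here; the sketch
         * reuses the same arithmetic so that it stays runnable. */
        util_sum(dst, src, t, count);
        return 0;
    }
    util_sum(dst, src, t, count);
    return 1;
}
```

The return value is only there to show which path ran; a real provider would not necessarily expose that.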

A core provider could do work similar to that of the utility provider, but optimized for specific hardware.  It can re-use the existing utility code and protocol headers, where possible.  The problem is that this adds development time for the core provider developers...

I see the co-existence of basic components, provider-level utilities, and optimized core providers as desirable.  The utility provider has two objectives: close the provider feature gap, and lower the barrier to entry for providers.  For instance, the RXD provider will allow usNIC to run with any MPI, will allow using UDP rather than TCP sockets for better scalability, and would work with InfiniBand UD QPs, also for better scaling.  No additional work in those providers is needed.

> From the perspective of a client of libfabric (which is what I am of
> late), I would like some way to know when a utility (or maybe simply
> non-native) version of some operation is being used.  Using the
> example of atomics again, if I knew that the atomic implementation was
> non-native, I would choose to use my own implementation (at least for
> the Chapel port to libfabric).

If an end-user wants to know when utility components are being used, that suggests their use should not be hidden.  My immediate concern is that Cisco has asked that the provider reported to the app be "usnic", and not "ofi-rxd".  The problem is deciding how to deal with enabling/disabling providers and log messages, since ofi-rxd is for all practical purposes an actual provider.  I was hoping that the comp_list idea would satisfy both goals: exposing the use of the utility provider while still reporting the name of the core provider.  It just doesn't work out as a very clean solution.
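For reference, the comp_list idea amounts to something like the sketch below: the app sees the core provider name, plus a list of any components layered on top of it.  The struct and function names (prov_report, uses_component) are hypothetical; nothing like this exists in the libfabric API today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical report handed to the app; not a libfabric structure. */
struct prov_report {
    const char *prov_name;        /* core provider name, e.g. "usnic" */
    const char *const *comp_list; /* NULL-terminated list of layered
                                   * components, or NULL if none */
};

/* Returns 1 if the named component is layered over the core provider. */
static int uses_component(const struct prov_report *r, const char *name)
{
    const char *const *c;

    assert(r && name);
    if (!r->comp_list)
        return 0;
    for (c = r->comp_list; *c; c++) {
        if (strcmp(*c, name) == 0)
            return 1;
    }
    return 0;
}
```

An app that cares could check uses_component(&report, "ofi-rxd") and pick its own implementation, while an app that does not care would only ever see "usnic".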

- Sean