[ofiwg] libfabric provider extensions

Sur, Sayantan sayantan.sur at intel.com
Mon Nov 3 14:06:33 PST 2014


> I would say the "size" field in the ops struct needs to increase monotonically.
> i.e. if an extension is ever added, the only way to remove it is by replacing it
> with a dummy function that returns -FI_ENOSYS.  The app can check that the
> extensions it might use are present or not by checking the size of the struct
> to see that it is as large as expected, i.e. whether the expected function calls
> are present or not.  Finally, it might be prudent for a provider writer to have
> the first extension for any ops structure be a function that returns version
> information, or possibly just use the "provider version" information from
> fi_getinfo.  There are plusses and minuses when choosing between granular
> function-level versioning vs. global provider version.
>
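
To make the size check concrete, here is a rough sketch of what the app side might look like. It is only an illustration under stated assumptions: the structs are cut-down stand-ins for the real libfabric ones (the `insert` signature is simplified), the helper names `xyzzy_av_has_add_flux` and `xyzzy_av_add_flux_checked` are hypothetical, and `-ENOSYS` stands in for `-FI_ENOSYS`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Cut-down stand-ins for the libfabric types; only what the check needs. */
struct fid_av;

struct fi_ops_av {
	size_t size;	/* size of the ops struct the provider filled in */
	int (*insert)(struct fid_av *av, const void *addr, size_t count);
};

struct fid_av {
	struct fi_ops_av *ops;
};

/* Extension struct as proposed in fi_xyzzy.h earlier in this thread. */
struct fi_ops_av_xyzzy {
	struct fi_ops_av base_ops;
	int (*add_flux)(struct fid_av *av, ssize_t flux_value);
};

/*
 * Hypothetical helper: returns nonzero if the provider's ops struct is
 * large enough to contain the add_flux member. Because "size" only ever
 * grows, a large enough struct means the slot exists (it may still be a
 * dummy that returns -FI_ENOSYS).
 */
static int xyzzy_av_has_add_flux(const struct fid_av *av)
{
	assert(av != NULL && av->ops != NULL);
	return av->ops->size >=
	       offsetof(struct fi_ops_av_xyzzy, add_flux) +
	       sizeof(((struct fi_ops_av_xyzzy *)0)->add_flux);
}

/* Hypothetical guarded call: never reads past a smaller ops struct. */
static int xyzzy_av_add_flux_checked(struct fid_av *av, ssize_t flux_value)
{
	if (!xyzzy_av_has_add_flux(av))
		return -ENOSYS;	/* stand-in for -FI_ENOSYS */
	return ((struct fi_ops_av_xyzzy *)av->ops)->add_flux(av, flux_value);
}
```

Checking against the offset of the specific member, rather than the size of the whole extension struct, keeps the test valid if further extensions are appended after `add_flux` later.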

I like the proposed scheme in general. fi_<provider name>.h seems like a good fit for exposing any provider-specific functionality.

Why can't extensions, backwards compatibility, etc. for the provider-specific schemes follow the same format as the rest of libfabric?

fi_<provider name>_getinfo(FI_VERSION(x,y))

...

The general rules regarding ABI compatibility apply. If the version change is a dot release, it is ABI compatible (extensions to structs only). When the ABI changes (or calls disappear), change the major number.
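
As a sketch of that rule (not an existing API): assume the provider version is packed the same way as FI_VERSION, with the major number in the high 16 bits and the minor number in the low 16 bits. The `XYZZY_*` macros and `xyzzy_version_compatible` below are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing: major in the high 16 bits, minor in the low 16 bits. */
#define XYZZY_VERSION(major, minor) \
	(((uint32_t)(major) << 16) | (uint32_t)(minor))
#define XYZZY_MAJOR(v) ((uint32_t)(v) >> 16)
#define XYZZY_MINOR(v) ((uint32_t)(v) & 0xffffu)

/* Compile-time sanity check that packing and unpacking round-trip. */
static_assert(XYZZY_MAJOR(XYZZY_VERSION(1, 2)) == 1 &&
	      XYZZY_MINOR(XYZZY_VERSION(1, 2)) == 2,
	      "version packing must round-trip");

/*
 * Hypothetical check: can a provider exporting `provided` serve an app
 * built against `requested`?
 *  - same major number: the ABI has not been broken
 *  - provider minor >= requested minor: every struct extension the app
 *    was compiled against is present
 */
static int xyzzy_version_compatible(uint32_t provided, uint32_t requested)
{
	return XYZZY_MAJOR(provided) == XYZZY_MAJOR(requested) &&
	       XYZZY_MINOR(provided) >= XYZZY_MINOR(requested);
}
```

With this rule, an app built against 1.0 keeps working on a 1.2 provider, while an app built against 1.2 is rejected by a 1.0 provider, and any major-number mismatch is rejected outright.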

The real issue is what guarantees the user gets as soon as they use a provider-specific API. Does the provider-specific API return/reuse the same types of objects as the main API? If so, are these objects reusable by the "standard provider"? (I would hope not!)
 
> > Also, how do you avoid the possibility of SFI being (effectively) forked if a
> > popular provider exposes most of its optimizations via provider extensions
> 
> IMO - the answer to this is about our design process as opposed to coding
> practices.  If we do a good job maintaining the balance between generality
> and efficiency in the overall API, providers will not feel compelled to resort to
> extensions.  (extensions stink for everyone).  Providing general and friendly
> mechanisms for providers to expose optimizations is critical to avoid this
> splintering of the API into private extensions.
> 

+1.

IMHO it is a good idea to allow people to experiment and expose new APIs. If users wanted to go to vendor-specific APIs, they already have that choice.

Sayantan.

> -r
> 
> 
> > -----Original Message-----
> > From: Blocksome, Michael [mailto:michael.blocksome at intel.com]
> > Sent: Monday, November 03, 2014 10:18 AM
> > To: Reese Faucette (rfaucett); Jeff Squyres (jsquyres)
> > Cc: ofiwg at lists.openfabrics.org
> > Subject: RE: [ofiwg] libfabric provider extensions
> >
> > Regarding capability #defines in provider-specific headers for extensions ..
> >
> > What if the application is compiled against one of these headers, but is later
> > linked with a (new) provider .so on the system that has removed the
> > extension. Is a provider library allowed to remove extensions, or must the
> > extension symbols be maintained?  Is this simply the cost, or risk, with using
> > provider extensions and the application should expect to have to
> > recompile?
> >
> > Also, how do you avoid the possibility of SFI being (effectively) forked if a
> > popular provider exposes most of its optimizations via provider extensions
> > and applications are forced to use these extensions to get the expected
> > performance on this provider's fabric?
> >
> > Thanks.
> >
> > -----Original Message-----
> > From: ofiwg-bounces at lists.openfabrics.org [mailto:ofiwg-
> > bounces at lists.openfabrics.org] On Behalf Of Reese Faucette (rfaucett)
> > Sent: Sunday, November 2, 2014 11:34 PM
> > To: Jeff Squyres (jsquyres)
> > Cc: ofiwg at lists.openfabrics.org
> > Subject: Re: [ofiwg] libfabric provider extensions
> >
> > Well, I am throwing it out as a candidate for "standard / common practice" - I
> > have not actually tried implementing against it or given it the depth of
> > analysis to confidently advocate for it yet.  I think that with the
> > "group-think" from this list we can come up with something pretty good.
> >
> > Yes, it is probably also worth standardizing on the #defines that app
> > source code would look for regarding extensions.
> >
> > -r
> >
> > > -----Original Message-----
> > > From: Jeff Squyres (jsquyres)
> > > Sent: Saturday, November 01, 2014 4:29 AM
> > > To: Reese Faucette (rfaucett)
> > > Cc: ofiwg at lists.openfabrics.org
> > > Subject: Re: [ofiwg] libfabric provider extensions
> > >
> > > Just to be clear: you're advocating that this become the standard /
> > > common practice, right?  I.e., that "fi_<provider_name>.h" be standard
> > > form for header files for provider-provided extensions, right?
> > >
> > > Hence, configure-providing software can do things like AC_CHECK_HEADER
> > > to see if various provider extensions are available.
> > >
> > > Is it worth having some additional #define's of a common form in
> > > fi_<provider_name>.h to identify what objects have extensions?  E.g.,
> > > fi_xyzzy.h could "#define FI_XYZZY_AV_OPS 1" so that an app can
> > > programmatically know that fi_ops_av_xyzzy exists.
> > >
> > >
> > >
> > > On Oct 31, 2014, at 1:07 PM, Reese Faucette (rfaucett)
> > > <rfaucett at cisco.com> wrote:
> > >
> > > > I have this idea of how providers might add calls in the rare event
> > > > it's needed, but would like to run it by the list for comments.
> > > > Let's say provider xyzzy supports a new AV-related call,
> > > > "fi_av_xyzzy_add_flux()".  There would be a new header "fi_xyzzy.h" and
> > > > "xyzzy_av.c" as shown below.
> > > >
> > > > Does this seem sane?  I think the "version" passed to fi_getinfo can
> > > > be used to deal with fi_ops_av growing wrt. backwards compatibility.
> > > >
> > > > Thanks,
> > > > -reese
> > > >
> > > > ======= fi_xyzzy.h ========
> > > >
> > > > struct fi_ops_av_xyzzy {
> > > >        struct fi_ops_av base_ops;
> > > >        int (*add_flux)(struct fid_av *av, ssize_t flux_value);
> > > > };
> > > >
> > > > static inline int
> > > > fi_av_xyzzy_add_flux(struct fid_av *av, ssize_t flux_value)
> > > > {
> > > >        return ((struct fi_ops_av_xyzzy *)av->ops)->
> > > >               add_flux(av, flux_value);
> > > > }
> > > >
> > > > ====== xyzzy_av.c ==========
> > > >
> > > > struct fi_ops_av_xyzzy xyzzy_av_ops = {
> > > >        .base_ops = {
> > > >               .size = sizeof(struct fi_ops_av_xyzzy),
> > > >               .insert = xyzzy_av_insert,
> > > >               ...
> > > >        },
> > > >        .add_flux = xyzzy_av_add_flux,
> > > > };
> > > >
> > > > int xyzzy_av_open(...)
> > > > {
> > > >        ...
> > > >        av->ops = (struct fi_ops_av *)&xyzzy_av_ops;
> > > >        ...
> > > >        return 0;
> > > > }
> > > >
> > > > _______________________________________________
> > > > ofiwg mailing list
> > > > ofiwg at lists.openfabrics.org
> > > > http://lists.openfabrics.org/mailman/listinfo/ofiwg
> > >
> > >
> > > --
> > > Jeff Squyres
> > > jsquyres at cisco.com
> > > For corporate legal information go to:
> > > http://www.cisco.com/web/about/doing_business/legal/cri/
> >


