[ofiwg] thoughts on initializing libfabric structs

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Mon Jul 28 16:46:44 PDT 2014


On Mon, Jul 28, 2014 at 11:07:50PM +0000, Hefty, Sean wrote:
> > That sounds nice and simple, but you need to consider the effects of
> > shared libraries:
> > 
> > main program :
> >   res = fi_getinfo(API_VER = 2)
> >   some_solib(res)
> > 
> > some_solib():
> >   struct fabric_foo_rel1 = {}
> >   fi_something(res,&rel1);
> > 
> > Oops, now it blows up, because main program and some_solib are not
> > using the same libfabric ABI revision.
> 
> This seems solvable if the app conveys its FI version to the shared library.

That is hard to do if objects allocated by libfabric can be freely
passed into and out of the shared library. You really have to tag the
call site.
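
To make "tag the call site" concrete, here is a rough sketch of one way
it could look. Everything in it is hypothetical (FAB_ABI_REV,
fab_something_rev, the fabric_foo_v1/v2 structs); none of these are
actual libfabric names. The idea is only that a macro stamps every call
with the revision its translation unit was compiled against, so the
main program and some_solib can disagree without blowing up.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute layouts for two ABI revisions. */
struct fabric_foo_v1 { int a; };
struct fabric_foo_v2 { int a; int b; };   /* rev 2 appended a field */

/* Hypothetical real entry point: the caller's revision is explicit. */
int fab_something_rev(int abi_rev, const void *attr)
{
    switch (abi_rev) {
    case 1: {
        const struct fabric_foo_v1 *f = attr;
        return f->a;
    }
    case 2: {
        const struct fabric_foo_v2 *f = attr;
        return f->a + f->b;
    }
    default:
        return -ENOSYS;               /* revision this library never knew */
    }
}

/* The macro tags each call site with whatever FAB_ABI_REV is defined
 * as at the point of the call. */
#define fab_something(attr) fab_something_rev(FAB_ABI_REV, (attr))

/* Stand-in for the main program, built against rev 2 headers. */
#define FAB_ABI_REV 2
static int main_program_call(void)
{
    struct fabric_foo_v2 attr = { .a = 1, .b = 2 };
    return fab_something(&attr);      /* expands with rev 2 */
}
#undef FAB_ABI_REV

/* Stand-in for some_solib, built against the older rev 1 headers. */
#define FAB_ABI_REV 1
static int some_solib_call(void)
{
    struct fabric_foo_v1 attr = { .a = 5 };
    return fab_something(&attr);      /* expands with rev 1 */
}
```

In a real build the two callers would live in separate objects with
different headers; redefining FAB_ABI_REV in one file is just a way to
show both cases side by side.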

> - Attributes: allocated by the app (likely on the stack) and used as input/output to allocate and configure objects.
> - 'Msg' structures: used in generic data transfer operations (~ sendmsg)
> - Object instances: ~ C++ class, allocated by provider
> 
> The situation is a little different than traditional libraries,
> since this is designed as a pass-through library.  (Provider
> libraries may be built-in, but are not required to be.)  For each of
> the cases above:

That thinking is part of what made verbs so sticky.

I agree there are a few calls that need to be high performance, and
should be passed through quickly, but for everything else, libfabric
is a library that exposes entry points that mux onto the provider
library.

The libfabric entry points should be responsible for translation and
fixing. If I call fabric_foo(ABI_REV=2) and the provider only has
ABI_REV=1 then libfabric can look at the incoming data and decide if
it must return ENOSYS or if it can transparently translate the call to
ABI_REV=1 and call out to the provider.
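
As a minimal sketch of that translate-or-ENOSYS decision, assuming
made-up names throughout (struct provider, foo_attr_v1/v2, and this
three-argument fabric_foo are illustrations, not the real interface):
suppose rev 2 only added a flags field. A rev 2 call that leaves flags
at zero can be narrowed to rev 1 and forwarded; anything that actually
needs the new field has to fail.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute layouts; rev 2 added 'flags'. */
struct foo_attr_v1 { int size; };
struct foo_attr_v2 { int size; int flags; };

/* Hypothetical provider descriptor: the revision it implements plus
 * its entry point for the call. */
struct provider {
    int abi_rev;
    int (*foo)(const void *attr);
};

/* Library entry point: pass through when revisions match, translate
 * when that is lossless, otherwise report -ENOSYS. */
int fabric_foo(const struct provider *prov, int caller_rev,
               const void *attr)
{
    if (caller_rev == prov->abi_rev)
        return prov->foo(attr);           /* straight pass-through */

    if (caller_rev == 2 && prov->abi_rev == 1) {
        const struct foo_attr_v2 *in = attr;

        if (in->flags != 0)
            return -ENOSYS;               /* needs a rev 2 feature */

        struct foo_attr_v1 out = { .size = in->size };
        return prov->foo(&out);           /* translated call */
    }

    return -ENOSYS;                       /* no known translation */
}

/* Toy rev 1 provider used to exercise the entry point. */
static int prov_v1_foo(const void *attr)
{
    const struct foo_attr_v1 *a = attr;
    return a->size;
}

static const struct provider prov_v1 = { .abi_rev = 1, .foo = prov_v1_foo };
```

The matching-revision branch stays a single indirect call, so the few
performance-critical paths would not pay for the translation logic.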

Jason


