[ofiwg] thoughts on initializing libfabric structs

Jason Gunthorpe jgunthorpe at obsidianresearch.com
Mon Jul 28 14:49:48 PDT 2014


On Mon, Jul 28, 2014 at 09:34:12PM +0000, Hefty, Sean wrote:
> Thanks, Bob, for looking into this.
> 
> > In practice, you'd want to exactly preserve each structure as it was
> > at each release:
> > 
> >  struct fabric_foo_rel1 {...};
> >  struct fabric_foo_rel2 {...};
> >  struct fabric_foo_rel3 {...};
> >  typedef struct fabric_foo_rel3 fabric_foo;
> 
> I was starting to reach this same conclusion.
> 
> With libibverbs, we were restricted by backwards compatibility.  A
> new library doesn't have that restriction, so we should be able to
> define a simple, reasonable model up front.
> 
> I was wondering if we even needed version information stored per
> structure.  libfabric has what is essentially an initial entry point
> (fi_getinfo).  We could add a version parameter (corresponds to
> minor version number) to that call, which could be used by provider
> to select the correct data structures known to the app.  Providers
> that don't support the specified version would simply fail the call.

That sounds nice and simple, but you need to consider the effects of
shared libraries:

main program:
  res = fi_getinfo(API_VER = 2)
  some_solib(res)

some_solib(res):
  struct fabric_foo_rel1 rel1 = {}
  fi_something(res, &rel1)

Oops, now it blows up, because the main program and some_solib are not
using the same libfabric ABI revision.

Designing a good ABI is a tricky thing. The best guidelines:
 - Avoid structures
 - Keep structures opaque. The consumer only sees an anonymous
   pointer, allocation is done in the library and never exposed
   outside the library.
 - When the call really requires an on-stack structure or structure pointer
   as an argument, rely on symbol versioning at the call site to
   resolve the caller's ABI. If the call site is a function pointer,
   have the inline wrapper that does the indirection apply an ABI tag.
 - Always have the library allocate structure memory

Look closely at everything and decide whether it *really* needs a
structure, whether the structure can be on-stack with symbol versioning
doing the job, or whether the path is low speed and an approach like
pthread_attr is more appropriate.
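
For the low speed case, a minimal sketch of an opaque attribute object
with accessor functions might look like the following. The fabric_attr
names and the depth field are hypothetical, chosen only for
illustration. Unlike pthread_attr_t, which the caller allocates, this
version has the library allocate the memory, in line with the
guidelines above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; not real libfabric API. */

/* Public view: an incomplete type plus accessor functions. */
struct fabric_attr;
struct fabric_attr *fabric_attr_create(void);
void fabric_attr_destroy(struct fabric_attr *attr);
int fabric_attr_set_depth(struct fabric_attr *attr, int depth);
int fabric_attr_get_depth(const struct fabric_attr *attr);

/* Library-private layout: free to grow between releases, because no
 * caller ever sees its size or its field offsets. */
struct fabric_attr {
    int depth;
};

struct fabric_attr *fabric_attr_create(void)
{
    struct fabric_attr *attr = calloc(1, sizeof(*attr));
    if (attr)
        attr->depth = 1;          /* library-chosen default */
    return attr;
}

void fabric_attr_destroy(struct fabric_attr *attr)
{
    free(attr);
}

int fabric_attr_set_depth(struct fabric_attr *attr, int depth)
{
    assert(attr != NULL);
    if (depth <= 0)
        return -1;                /* reject invalid values, keep old one */
    attr->depth = depth;
    return 0;
}

int fabric_attr_get_depth(const struct fabric_attr *attr)
{
    assert(attr != NULL);
    return attr->depth;
}
```

The cost is a function call per field, which is why this only makes
sense off the fast path.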

A big chunk of the problem with verbs is that everything was exposed
as structures visible-to and sometimes allocated-by the wrong things.

Jason


