[ofiwg] libfabric releases, stable branches, and DL providers

Hefty, Sean sean.hefty at intel.com
Mon Aug 3 12:49:53 PDT 2020

I have the following proposal for creating a provider release package for Linux.  This should allow mostly independent provider and libfabric releases, and ease provider backporting.

Detailed steps:
1. All code needed by the provider must be available under libfabric/prov/X.
1a. Add symbolic links for 'common_srcs':

   libfabric/prov/X/prov/util  -> libfabric/prov/util
   libfabric/prov/X/include    -> libfabric/include
   libfabric/prov/X/src/shared -> libfabric/src/shared
   additional links for libfabric/src/ files (common.c, enosys.c, mem.c, etc.)

   (Symbolic links are not available on all platforms, which is why
   they are not currently used.)

2. Add Makefile.am and configure.ac files for libfabric/prov/X.
   Each provider would manage its own build files.

3. Update the provider's getinfo call to handle ABI structure changes
   (e.g., see src/abi_1_0.c).

4. Report the API version based on the libfabric target version,
   as determined during the configure or make process.
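
As a rough illustration of step 4, here is a minimal sketch of how a provider might pick the API version it reports. All names here (TARGET_API_MAJOR, TARGET_API_MINOR, PROV_VERSION, PROV_BUILT_API, prov_reported_api) are hypothetical and not part of libfabric. The packing is modeled on the major/minor layout of libfabric's FI_VERSION macro, and the default values are only placeholders for whatever configure would supply.

```c
#include <stdint.h>

/* Hypothetical stand-ins for values chosen at configure or make time.
 * A real build would likely pass these via -D flags or a generated config.h. */
#ifndef TARGET_API_MAJOR
#define TARGET_API_MAJOR 1
#endif
#ifndef TARGET_API_MINOR
#define TARGET_API_MINOR 9
#endif

/* Pack major.minor into one 32-bit value, major in the high 16 bits,
 * so that packed versions compare correctly as plain integers. */
#define PROV_VERSION(major, minor) \
	(((uint32_t)(major) << 16) | (uint32_t)(minor))
#define PROV_MAJOR(version) ((version) >> 16)
#define PROV_MINOR(version) ((version) & 0xFFFF)

/* Assumed: the newest API the provider sources were written against. */
#define PROV_BUILT_API PROV_VERSION(1, 11)

/* Report the older of the two versions, so the provider never claims
 * an API newer than the libfabric it is being packaged for. */
static uint32_t prov_reported_api(uint32_t built_api, uint32_t target_api)
{
	return (target_api < built_api) ? target_api : built_api;
}
```

A provider built this way could call prov_reported_api(PROV_BUILT_API, PROV_VERSION(TARGET_API_MAJOR, TARGET_API_MINOR)) once at initialization and use the result wherever it advertises its version.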

Restructuring the code and build files would help make this more manageable, but isn't required.  Examples include moving shared code from src -> src/shared, adding a Makefile.include for prov/util, etc.

The provider would build against its own copy of the headers to support multiple libfabric versions.  API compatibility code is already built into the providers, mostly through prov/util.  Today, the libfabric core handles ABI conversions, so that the provider's getinfo call only sees the latest version of the structures.  Step 3 requires DL providers to handle this translation as well.
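
To make the translation in step 3 more concrete, here is a minimal sketch of the general technique: accept a structure in an older layout and convert it to the current one before the rest of the provider sees it. The structures, constants, and function below (demo_attr_v1, demo_attr_v2, DEMO_ABI_V1, DEMO_ABI_V2, demo_attr_to_current) are invented for illustration. They are not the real libfabric structures and are not taken from src/abi_1_0.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical older layout, as an application built against an
 * earlier release might pass it in. */
struct demo_attr_v1 {
	uint64_t caps;
	uint64_t mode;
};

/* Hypothetical current layout: one field was appended later. */
struct demo_attr_v2 {
	uint64_t caps;
	uint64_t mode;
	uint32_t tclass;
};

#define DEMO_ABI_V1 1
#define DEMO_ABI_V2 2

/* Convert the caller's structure into the current layout.
 * Returns 0 on success, -1 on a NULL argument or unknown ABI version. */
static int demo_attr_to_current(int caller_abi, const void *in,
				struct demo_attr_v2 *out)
{
	if (!in || !out)
		return -1;

	/* Start from zero so fields missing in older layouts get defaults. */
	memset(out, 0, sizeof(*out));

	switch (caller_abi) {
	case DEMO_ABI_V1: {
		const struct demo_attr_v1 *old = in;

		out->caps = old->caps;
		out->mode = old->mode;
		/* tclass did not exist in v1; it keeps its zero default. */
		return 0;
	}
	case DEMO_ABI_V2:
		memcpy(out, in, sizeof(*out));
		return 0;
	default:
		return -1;
	}
}
```

With a helper along these lines at the top of its getinfo path, the remainder of the provider would only need to understand the newest layout, which is the property the core provides today.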

The above changes could integrate into master.  Each provider could also create its own stable release branch(es) to manage as desired.  One possible workflow would be:

1. Align the provider branch with the latest major libfabric release.
2. Release a new provider package to support older libfabric releases.
3. Cherry-pick patches from master to the provider branch as needed.
4. Release provider update packages as needed.
5. Restart at step 1 with each major libfabric release.

This sort of flow allows the provider to focus on a mostly linear development model.  There would be no need to backport fixes to a provider to multiple stable branches.  A single provider release would be able to support all previous libfabric versions.  Rebasing provider release branches to align with the latest libfabric release would ensure that only released APIs are ever used.

There's roughly a 40% chance that I'm missing some critical item that makes this completely fall apart.

- Sean
