[ofiwg] [RFC] libfabric provider interoperability
Hefty, Sean
sean.hefty at intel.com
Fri Dec 5 22:45:42 PST 2014
> I think a decent solution to the interoperability problem would be as
> follows: providers which claim to implement the same *existing* wire
> protocol must interoperate for all calls that are *natively* supported
> by the wire protocol. In fi_endpoint(3) man page under the description
> of a wire protocol, we should list what the natively supported features
> are considered to be to avoid ambiguity. The two emphasized words add
> meaning that may not have been explicitly mentioned during the last
> OFIWG call.
I like the idea of explicitly listing which features each protocol supports natively. I would just document that in protocol-specific man pages rather than in fi_endpoint(3).
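To make the per-protocol listing concrete, here is a rough sketch of what such a table could look like if it were expressed in code. Every name in it (the FEAT_* bits, PROTO_A/PROTO_B, proto_table) is hypothetical and is not a libfabric definition, and the table contents are placeholders, not a statement of what any real wire protocol supports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; these are not libfabric definitions. */
enum {
    FEAT_SEND_RECV   = 1u << 0,
    FEAT_RDMA_WRITE  = 1u << 1,
    FEAT_RDMA_READ   = 1u << 2,
    FEAT_ATOMIC_BASE = 1u << 3    /* whatever atomics the wire protocol defines */
};

/* Hypothetical protocol identifiers. */
enum proto_id { PROTO_A = 1, PROTO_B = 2 };

struct proto_doc {
    enum proto_id id;
    const char   *name;
    uint32_t      native;   /* features the wire protocol supports natively */
};

/* Placeholder contents; a real table would mirror the man page text. */
static const struct proto_doc proto_table[] = {
    { PROTO_A, "proto-a",
      FEAT_SEND_RECV | FEAT_RDMA_WRITE | FEAT_RDMA_READ | FEAT_ATOMIC_BASE },
    { PROTO_B, "proto-b", FEAT_SEND_RECV },
};

/* Return the native feature mask for a protocol, or 0 if it is unknown. */
uint32_t proto_native_features(enum proto_id id)
{
    size_t i;

    for (i = 0; i < sizeof(proto_table) / sizeof(proto_table[0]); i++)
        if (proto_table[i].id == id)
            return proto_table[i].native;
    return 0;
}

/* Nonzero if every feature in `wanted` is native to the protocol. */
int proto_is_native(enum proto_id id, uint32_t wanted)
{
    assert(wanted != 0);   /* asking about no features is a caller bug */
    return (proto_native_features(id) & wanted) == wanted;
}
```

With this table, proto_is_native(PROTO_A, FEAT_RDMA_READ) is true and proto_is_native(PROTO_B, FEAT_RDMA_WRITE) is false. Anything outside the native mask would fall into the "all bets are off" category between providers.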
> So, as an example, for InfiniBand RC, this would require that two
> providers implement the three atomic operations that are defined in the
> InfiniBand specification using the wire protocol's native support, but
> for all of the other atomics that libfabric provides, all bets are
> off---they might work within a single provider but not between providers.
It's worth noting that there is not a requirement for a provider to implement all defined operations. Specifically, for atomic ops, a given provider may not implement anything other than the base atomics.
I would also point out that the protocol isn't limited to just the data transfer APIs. The connection protocol and address resolution mechanism must also be compatible.
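Putting the two points above together, the set of operations that can be expected to work between two providers might be computed along these lines. This is only a sketch: struct provider_desc, its fields, and interop_ops() are hypothetical names made up for illustration, and the integer ids stand in for whatever identifiers would actually be defined.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical, for illustration only. */
struct provider_desc {
    int      wire_proto;    /* data transfer wire protocol id */
    int      conn_proto;    /* connection establishment protocol id */
    int      addr_format;   /* address format / resolution mechanism id */
    uint32_t implemented;   /* operations this provider chose to implement */
};

/*
 * Operations two providers can be expected to use with each other.
 * Nothing is guaranteed unless the wire protocol, the connection
 * protocol, and the address format all match. When they do, the result
 * is limited to operations that are native to the wire protocol AND
 * implemented by both sides, since a provider need not implement
 * every defined operation.
 */
uint32_t interop_ops(const struct provider_desc *a,
                     const struct provider_desc *b,
                     uint32_t native_ops)
{
    assert(a != 0 && b != 0);

    if (a->wire_proto != b->wire_proto ||
        a->conn_proto != b->conn_proto ||
        a->addr_format != b->addr_format)
        return 0;

    return a->implemented & b->implemented & native_ops;
}
```

Note that a mismatch in the connection protocol alone is enough to return 0 here, even when both providers implement identical data transfer operations.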
> If two vendors wished to implement the rest of the atomics in InfiniBand
> in an interoperable way, then they would declare a new extended wire
> protocol as we discussed in the call. The providers would still be
> responsible for interoperability of the calls that are natively
> supported by InfiniBand. This would require that we define a way to
> state that a protocol is merely an extension of an existing wire
> protocol. Perhaps we could have some place in libfabric or on the
> website to document these "agreements between friends" wire protocols.
> And/or these ultimately may be taken to, e.g., the IBTA, if it was
> decided that a real standard solution was needed.
One of the difficulties with extending a base protocol is doing so in such a way that the native operations are not affected. For example, adding support for other atomic operations would require changes to the messaging protocol, which would affect native support.
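For the "extension of an existing wire protocol" declaration itself, one possible shape is a descriptor that points at its base, so that two peers can fall back to the operations of their common base. This is a sketch under assumptions: struct wire_proto and the helper functions are hypothetical, and it only covers how the relationship might be stated. It does nothing about the wire-format problem described above, where the extension changes the messages used by the native operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor; not part of libfabric. */
struct wire_proto {
    const char              *name;
    const struct wire_proto *base;  /* NULL unless this extends another protocol */
    uint32_t                 ops;   /* operations added at this level only */
};

/* All operations defined by a protocol, including those of its bases. */
uint32_t proto_all_ops(const struct wire_proto *p)
{
    uint32_t ops = 0;

    for (; p != NULL; p = p->base)
        ops |= p->ops;
    return ops;
}

/* Nonzero if `anc` is `p` itself or one of the protocols `p` extends. */
int proto_extends(const struct wire_proto *p, const struct wire_proto *anc)
{
    for (; p != NULL; p = p->base)
        if (p == anc)
            return 1;
    return 0;
}

/*
 * Operations two peers can share: those of the nearest protocol that
 * both sides are, or extend. Returns 0 if the protocols are unrelated.
 */
uint32_t proto_common_ops(const struct wire_proto *a,
                          const struct wire_proto *b)
{
    assert(a != NULL && b != NULL);

    for (; a != NULL; a = a->base)
        if (proto_extends(b, a))
            return proto_all_ops(a);
    return 0;
}
```

A peer speaking the extended protocol and a peer speaking only the base would get the base operation set back, which matches the requirement that extended providers remain interoperable for the natively supported calls.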
- Sean