[ofiwg] FI_HMEM and fi_inject

Hefty, Sean sean.hefty at intel.com
Thu Sep 3 22:32:51 PDT 2020

> An issue popped up as part of the code review for
> https://github.com/ofiwg/libfabric/pull/6185
> The summary is that supporting FI_HMEM (e.g. GPU memory buffers) may have a negative
> impact on the performance of fi_inject() calls.  Either the provider must disable
> inject completely, or possibly check the buffer location on each inject call.

There are comments in the above issue about possible options.

But I'm actually going to propose an alternative: we do nothing, other than possibly documenting that the use of fi_inject() is not recommended for non-system memory.

First, fi_inject() should work with device buffers with no special handling.  It is usually implemented using memcpy, which might be slower than a specialized copy routine, but the size of that impact is unknown at this point.  Second, fi_inject() is intended for small transfers.  I don't know whether there is a significant need for small transfers to/from device memory, such that apps would naturally use fi_inject() for device buffers.  I think we need quantified data on both points showing a need before creating an optimized device inject path.
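
To make the memcpy point concrete, here is a minimal sketch of a copy-based inject path.  It is not libfabric code: struct toy_ep, toy_inject(), and TOY_INJECT_SIZE are hypothetical names made up for illustration, and the sketch only models the copy into provider-owned storage that lets the caller reuse its buffer as soon as the call returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_INJECT_SIZE 64  /* hypothetical inject limit, in bytes */

/* Hypothetical stand-in for a provider endpoint; not a libfabric type. */
struct toy_ep {
    unsigned char bounce[TOY_INJECT_SIZE]; /* provider-owned staging buffer */
    size_t pending_len;                    /* bytes staged for transmission */
};

/*
 * Toy inject: copy the payload into provider-owned storage with a plain
 * memcpy and return.  Returns 0 on success, -1 if len exceeds the limit.
 * The caller's buffer is no longer referenced once this returns.
 */
static int toy_inject(struct toy_ep *ep, const void *buf, size_t len)
{
    assert(ep != NULL && buf != NULL);
    if (len > TOY_INJECT_SIZE)
        return -1;
    memcpy(ep->bounce, buf, len);  /* the generic, non-specialized copy */
    ep->pending_len = len;
    return 0;
}
```

The memcpy line is the only place where a device-aware copy routine would differ, which is why the open question is how much slower the generic copy actually is for small payloads.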

Removing support for device buffers from fi_inject() completely would likely force memory registration of those buffers.  It's highly likely that the cost of registration would hurt performance more than using a non-optimized memory copy routine.

- Sean
