[ofiwg] OFIWG notes 10/1
Ingerson, Alexia
alexia.ingerson at intel.com
Thu Oct 3 12:12:48 PDT 2024
10/01/2024
Participants:
Jianxin Xiong (Intel)
Amir Shehata (ORNL)
Alex McKinley (Intel)
Alexia Ingerson (Intel)
Ben Lynam (Cornelis)
Howard Pritchard (LANL)
Jerome Soumagne (Intel)
John Byrne (HPE)
Juee Desai (Intel)
Nathan Hanford
Nikhil Nanal (Intel)
Peinan Zhang (Intel)
Stephen Oost (Intel)
Steve Welch (HPE)
Zach Dworkin (Intel)
Summary:
OFI 2.0 beta plan is on track. Early goal is 10/18 with a later goal of 10/25. There will be no RC and no branch.
Discussed adding an fi_inject2 set of calls that take an additional descriptor parameter in order to support FI_HMEM memory with inject calls. Backwards compatibility was discussed: OMPI already calls fi_inject with FI_HMEM, and some providers, like cxi, add iface detection and caching to support FI_HMEM through the inject APIs. Conclusion was that fi_inject2 would pass through to fi_inject with a NULL descriptor. No objections. New calls are targeted for the beta release.
Notes:
OFI 2.0 beta plan:
* Early goal Friday 10/18
* If more time needed Friday 10/25
* Procedure similar to alpha - no RC, no branch
New inject calls
* Existing inject calls don't pass in desc for the buffer
* fi_inject(ep, buf, len, dest_addr);
* Assumption is it's host memory
* Issues:
* Difficult to support FI_HMEM (missing iface info, provider needs to detect iface in order to support FI_HMEM)
* Providers may have to report inject size of 0 when FI_HMEM is enabled
* Compromise made by the lnx provider (has to report inject size 0 when one peer provider reports inject size of 0)
* Performance loss for traffic going through the other peer
* Define new set of inject calls that pass in desc for the buffer
* fi_inject2(ep, buf, desc, len, dest_addr);
* fi_injectdata2(ep, buf, desc, len, data, dest_addr);
* fi_tinject2(ep, buf, desc, len, dest_addr, tag);
* fi_tinjectdata2(ep, buf, desc, len, data, dest_addr, tag);
* fi_inject_write2(ep, buf, desc, len, dest_addr, addr, key);
* fi_inject_writedata2(ep, buf, desc, len, data, dest_addr, addr, key);
* fi_inject_atomic2(ep, buf, desc, count, dest_addr, addr, key, datatype, op);
* OMPI would be able to plug into new call
* Q: fi_inject2 would pass through to fi_inject with a NULL descriptor - how do we know whether it's NULL because it's a host buffer or because the middleware isn't using the API correctly?
* Do we need to define that the fi_inject call is not valid for FI_HMEM?
* Do providers need to add iface detection for regular inject?
* OMPI doesn't register inject buffers - does the CXI provider have caching to support mapping and iface? Yes, but it takes a performance hit; they would like to move to these calls.
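The pass-through question above can be illustrated with a small mock. This is only a sketch under stated assumptions: every name in it (mock_ep, mock_desc, mock_detect_iface, mock_inject, mock_inject2) is a hypothetical stand-in and not a libfabric definition, and the real calls may behave differently. It shows a descriptor-carrying inject using the iface recorded in the descriptor, and falling back to the descriptor-less path, where the provider has to detect the iface itself, when desc is NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of these are libfabric types. */
enum mock_iface { MOCK_IFACE_SYSTEM, MOCK_IFACE_DEVICE };

struct mock_desc {
    enum mock_iface iface;      /* where the registered buffer lives */
};

struct mock_ep {
    size_t inject_size;         /* largest payload accepted by inject */
    enum mock_iface last_iface; /* iface the last send was treated as */
    int detect_calls;           /* how often iface detection had to run */
};

/* Placeholder for a provider's pointer query; it always answers "host" here. */
static enum mock_iface mock_detect_iface(struct mock_ep *ep, const void *buf)
{
    (void) buf;
    ep->detect_calls++;
    return MOCK_IFACE_SYSTEM;
}

/* Descriptor-less inject: the provider must work out the iface on its own. */
static int mock_inject(struct mock_ep *ep, const void *buf, size_t len,
                       uint64_t dest_addr)
{
    (void) dest_addr;
    if (len > ep->inject_size)
        return -1;
    ep->last_iface = mock_detect_iface(ep, buf);
    return 0;
}

/* Descriptor-carrying inject: a non-NULL desc supplies the iface directly;
 * a NULL desc passes through to the descriptor-less path. */
static int mock_inject2(struct mock_ep *ep, const void *buf, void *desc,
                        size_t len, uint64_t dest_addr)
{
    if (desc == NULL)
        return mock_inject(ep, buf, len, dest_addr);
    if (len > ep->inject_size)
        return -1;
    ep->last_iface = ((struct mock_desc *) desc)->iface;
    return 0;
}
```

In this mock a NULL desc is indistinguishable from a caller that simply forgot to pass one, which is the ambiguity the question raises.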
Will try to get fi_inject2 calls into beta release
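For the inject-size issue listed in the notes, the compromise described for a layered provider can be sketched as follows. The notes only state that an inject size of 0 must be reported when one peer provider reports 0; taking the minimum across peers is an assumption made here for illustration, and combined_inject_size is a hypothetical helper, not a libfabric function.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libfabric: the inject size a layered
 * provider can advertise is assumed to be bounded by its most restrictive
 * peer, so a single peer reporting 0 forces the combined value to 0. */
static size_t combined_inject_size(const size_t *peer_sizes, size_t n_peers)
{
    size_t i, result;

    if (n_peers == 0)
        return 0;
    result = peer_sizes[0];
    for (i = 1; i < n_peers; i++) {
        if (peer_sizes[i] < result)
            result = peer_sizes[i];
    }
    return result;
}
```

With peers reporting 64 and 0, the helper returns 0, so traffic through the peer that could have injected 64 bytes loses that fast path.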