[ofiwg] OFIWG Meeting 2/24/2026 Minutes
Xiong, Jianxin
jianxin.xiong at intel.com
Wed Feb 25 12:24:15 PST 2026
2/24/2026
** Participants **
Alex McKinley (Intel)
Alexia Ingerson (Intel)
Bob Cernohous (Cornelis Networks)
Howard Pritchard
Jerome Soumagne (HPE)
Jianxin Xiong (Intel)
John Byrne (HPE)
Ken Raffenetti (ANL)
Sai Sunku (AWS)
Shi Jin (AWS)
Stephen Oost (Intel)
Zach Dworkin (Intel)
rajalaxmi
** Notes **
* Release v2.5.0 update
- RC1 is expected in less than a week
- Please mark PRs that need to get into RC1 with the "for-2.5.x" label
- New shm provider (PR#11877) is now cleared for v2.5.0. The IMB-EXT accumulate test runs very slowly with Open MPI.
This test will be disabled in AWS CI, and a new Open MPI issue will be created to track the slowdown.
- The opx provider update is in progress and may not be ready this week
* Discussion on IPC support for allocations made by cudaMallocAsync()
- cudaMallocAsync() and some other allocation routines use the new CUDA driver VMM (virtual memory management) API.
The resulting buffer does not work with cudaIpcGetMemHandle() (or the driver API cuIpcGetMemHandle()).
- New API calls are needed to export / import a shareable handle for IPC
- There is no way to identify the method of allocation for a given device pointer
- Could add the new calls as a fallback path, at the cost of extra overhead
- Can the new API replace the old API completely? Need to verify
- The new API may export the shareable handle as a file descriptor. Extra steps are needed to pass it to other
processes: pidfd, or a control message over a socket
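
The "control message over a socket" option above can be sketched as follows. This is a minimal illustration, not
libfabric or CUDA code. It assumes a Unix domain socket between the two sides and uses a pipe as a stand-in for
the file descriptor the export call would return. The function names (send_handle_fd, recv_handle_fd, roundtrip)
are hypothetical. It relies on socket.send_fds() / socket.recv_fds(), which are available in Python 3.9+ on Unix
and wrap SCM_RIGHTS ancillary messages.

```python
import os
import socket


def send_handle_fd(sock, fd):
    """Send one file descriptor as an SCM_RIGHTS control message."""
    # At least one byte of ordinary payload must accompany the ancillary data.
    socket.send_fds(sock, [b"F"], [fd])


def recv_handle_fd(sock):
    """Receive one file descriptor sent by send_handle_fd()."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if data != b"F" or len(fds) != 1:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def roundtrip():
    """Pass the read end of a pipe through a socket pair and read through it.

    The pipe is only a stand-in for an exported memory handle; both socket
    ends live in one process here to keep the sketch self-contained.
    """
    exporter, importer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    try:
        send_handle_fd(exporter, read_fd)
        imported_fd = recv_handle_fd(importer)
        try:
            os.write(write_fd, b"ipc")
            # The received descriptor is a new number that refers to the same
            # open file description as read_fd.
            return imported_fd != read_fd, os.read(imported_fd, 3)
        finally:
            os.close(imported_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
        exporter.close()
        importer.close()
```

In a real provider the two socket ends would belong to different processes, and the importing side would then
hand the received descriptor to the matching import call. The pidfd alternative avoids the socket but needs
the peer's PID and suitable permissions.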
Jianxin Xiong
Fabric Software
Intel Corporation
Jianxin.xiong at intel.com