<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Aptos",sans-serif;
mso-ligatures:standardcontextual;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:11.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:198054927;
mso-list-type:hybrid;
mso-list-template-ids:-2104328892 1865024836 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-start-at:2;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Aptos",sans-serif;
mso-fareast-font-family:Aptos;
mso-bidi-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l1
{mso-list-id:1452282075;
mso-list-type:hybrid;
mso-list-template-ids:-842082774 -694368614 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l1:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Aptos",sans-serif;
mso-fareast-font-family:Aptos;
mso-bidi-font-family:"Times New Roman";}
@list l1:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l1:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l1:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l1:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l1:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
@list l1:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Symbol;}
@list l1:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:"Courier New";}
@list l1:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
font-family:Wingdings;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">10/29/2024<o:p></o:p></p>
<p class="MsoNormal"><b><u>Participants:<o:p></o:p></u></b></p>
<p class="MsoNormal">Alexia Ingerson (Intel)<o:p></o:p></p>
<p class="MsoNormal">Jianxin Xiong (Intel)<o:p></o:p></p>
<p class="MsoNormal">Alex McKinley (Intel)<o:p></o:p></p>
<p class="MsoNormal">Ben Lynam (Cornelis)<o:p></o:p></p>
<p class="MsoNormal">Charles Shereda (Cornelis)<o:p></o:p></p>
<p class="MsoNormal">Howard Pritchard (LANL)<o:p></o:p></p>
<p class="MsoNormal">Ian Ziemba (HPE)<o:p></o:p></p>
<p class="MsoNormal">Juee Desai (Intel)<o:p></o:p></p>
<p class="MsoNormal">Nikhil Nanal (Intel)<o:p></o:p></p>
<p class="MsoNormal">Peinan Zhang (Intel)<o:p></o:p></p>
<p class="MsoNormal">Shi Jin (AWS)<o:p></o:p></p>
<p class="MsoNormal">Stephen Oost (Intel)<o:p></o:p></p>
<p class="MsoNormal">Steve Welch (HPE)<o:p></o:p></p>
<p class="MsoNormal">Zach Dworkin (Intel)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b><u>Summary:<o:p></o:p></u></b></p>
<p class="MsoNormal">Libfabric 2.0 beta was released on 10/25; RC1 for 2.0 GA is planned for 11/22/2024. Please fix Coverity issues and double-check provider man pages to make sure everything is up to date.<o:p></o:p></p>
<p class="MsoNormal">The two mechanisms for GPU RDMA with CUDA and verbs were discussed: peer memory, which is available only in MOFED and relies on an extra API, and dmabuf, where the application must supply the address as well as the file
descriptor in order to register memory with the RDMA driver.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b><u>Notes:<o:p></o:p></u></b></p>
<p class="MsoNormal">2.0 status and plan:<o:p></o:p></p>
<ul style="margin-top:0in" type="disc">
<li class="MsoNormal" style="mso-list:l1 level1 lfo1">2.0 beta went out Friday (10/25)<o:p></o:p></li><li class="MsoNormal" style="mso-list:l1 level1 lfo1">Planned new features all in<o:p></o:p></li><li class="MsoNormal" style="mso-list:l1 level1 lfo1">RC 1 for 2.0 GA is planned Fri. November 22<o:p></o:p></li><ul style="margin-top:0in" type="circle">
<li class="MsoNormal" style="mso-list:l1 level2 lfo1">Focus on testing and fixing bugs<o:p></o:p></li><li class="MsoNormal" style="mso-list:l1 level2 lfo1">Coverity: many open issues in opx, lpp, lnx, psm3<o:p></o:p></li><li class="MsoNormal" style="mso-list:l1 level2 lfo1">Double-check man pages, especially provider man pages, to make sure everything is up to date<o:p></o:p></li></ul>
</ul>
<p class="MsoNormal">GPU RDMA with CUDA and verbs:<o:p></o:p></p>
<ul style="margin-top:0in" type="disc">
<li class="MsoNormal" style="mso-list:l0 level1 lfo2">2 different mechanisms: use dmabuf, use peer memory (old, part of MOFED, has extra API called peer mem)<o:p></o:p></li><ul style="margin-top:0in" type="circle">
<li class="MsoNormal" style="mso-list:l0 level2 lfo2">Old way: peer memory<o:p></o:p></li><ul style="margin-top:0in" type="square">
<li class="MsoNormal" style="mso-list:l0 level3 lfo2">Driver will query all clients to see who owns the address, driver will do address translation for NIC – only available with MOFED, not available with distro kernel driver<o:p></o:p></li></ul>
<li class="MsoNormal" style="mso-list:l0 level2 lfo2">New way: use dmabuf, supported by the GPU driver and the RDMA driver<o:p></o:p></li><ul style="margin-top:0in" type="square">
<li class="MsoNormal" style="mso-list:l0 level3 lfo2">Application needs to get the address as well as the file descriptor. The client uses a different API to register memory, which will interact with the dma driver<o:p></o:p></li></ul>
</ul>
</ul>
<p class="MsoNormal">Q: What about when the app passes only an address, not an fd?<o:p></o:p></p>
<p class="MsoNormal">A: Currently not available but something we can add<o:p></o:p></p>
<p class="MsoNormal">EFA tried to do it but had to revert. It can only use dmabuf when using the NCCL dmabuf API<o:p></o:p></p>
<p class="MsoNormal">As long as app registers address with correct iface, libfabric can query this<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">In order to enable dmabuf for CUDA:<o:p></o:p></p>
<ul style="margin-top:0in" type="disc">
<li class="MsoNormal" style="mso-list:l0 level1 lfo2">Requires Linux kernel 5.12 or later<o:p></o:p></li><li class="MsoNormal" style="mso-list:l0 level1 lfo2">Requires CUDA 11.7 or later<o:p></o:p></li><li class="MsoNormal" style="mso-list:l0 level1 lfo2">NVIDIA driver installed with the “-m=kernel-open” flag<o:p></o:p></li><li class="MsoNormal" style="mso-list:l0 level1 lfo2">Check CUDA dmabuf support: cuDeviceGetAttribute()<o:p></o:p></li><li class="MsoNormal" style="mso-list:l0 level1 lfo2">Get dmabuf fd for CUDA allocated buffer: cuMemGetHandleForAddressRange()<o:p></o:p></li></ul>
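<p class="MsoNormal">The kernel requirement in the list above can be checked with a short script. This is a minimal sketch, not part of libfabric: the helper names parse_kernel_release and kernel_meets_dmabuf_minimum are hypothetical, and it only compares the running kernel release against the 5.12 minimum from the notes. It does not check the CUDA version or the driver install flag.<o:p></o:p></p>
<pre>
```python
import platform
import re


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Return (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_meets_dmabuf_minimum(release: str, minimum: tuple[int, int] = (5, 12)) -> bool:
    """True when the release is at or above the minimum (5.12 per the notes)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the running kernel, e.g. '6.1.0-13-amd64'.
    release = platform.release()
    print(release, kernel_meets_dmabuf_minimum(release))
```
</pre>
<p class="MsoNormal">A passing kernel check is only a prerequisite; CUDA dmabuf support still has to be confirmed through cuDeviceGetAttribute() as listed above.<o:p></o:p></p>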
<p class="MsoNormal">Verbs provider will look at flags, kernel support for dmabuf, iface, etc, and will either go through ibv_reg_mr or ibv_reg_dmabuf_mr<o:p></o:p></p>
<p class="MsoNormal">Q: How do you detect kernel support for peer memory?<o:p></o:p></p>
<p class="MsoNormal">A: Also check the kernel symbol: ib_register_peer_memory_client<o:p></o:p></p>
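<p class="MsoNormal">A symbol check like the one in the answer above could look roughly as follows. This sketch assumes the symbol table is read from /proc/kallsyms, which the notes do not state; they only name the symbol. The helper has_kernel_symbol is hypothetical and takes any iterable of lines so it can be exercised without a live kernel.<o:p></o:p></p>
<pre>
```python
from typing import Iterable

PEER_MEM_SYMBOL = "ib_register_peer_memory_client"


def has_kernel_symbol(lines: Iterable[str], symbol: str = PEER_MEM_SYMBOL) -> bool:
    """Return True if any kallsyms-style line names the symbol exactly."""
    for line in lines:
        fields = line.split()
        # Expected layout per line: address, type, name, optional [module].
        if len(fields) >= 3 and fields[2] == symbol:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/kallsyms") as kallsyms:
            print(has_kernel_symbol(kallsyms))
    except OSError:
        print("could not read /proc/kallsyms")
```
</pre>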
<p class="MsoNormal">Q: NIC and peer memory registration can still fail, so a kernel check alone is not sufficient. It comes down to trial and error<o:p></o:p></p>
<p class="MsoNormal">A: verbs takes the simple approach: it checks that the call is available and can fall back if registration fails later<o:p></o:p></p>
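<p class="MsoNormal">The trial-and-error idea from this exchange can be illustrated with a generic fallback helper. This is a hypothetical model, not the verbs provider's code: register_with_fallback and the attempt labels are made up for illustration, each callable stands in for a real registration call, and the order of attempts is chosen by the caller.<o:p></o:p></p>
<pre>
```python
from typing import Callable, Optional, Sequence, Tuple

# Each attempt pairs a label with a callable that returns a handle or raises OSError.
Attempt = Tuple[str, Callable[[], object]]


def register_with_fallback(attempts: Sequence[Attempt]) -> Tuple[Optional[str], object]:
    """Try each registration path in order and return the first that succeeds."""
    for label, register in attempts:
        try:
            return label, register()
        except OSError:
            # This path failed at run time; move on to the next one.
            continue
    # No path worked; the caller decides how to report the failure.
    return None, None
```
</pre>
<p class="MsoNormal">For example, passing a dmabuf attempt followed by a peer memory attempt returns the label and handle of whichever one registers first, or (None, None) if both fail.<o:p></o:p></p>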
<p class="MsoNormal">Q: EFA tries registering on every iface to see if it is available<o:p></o:p></p>
<p class="MsoNormal">Currently, if you want to use DMA support for CUDA/verbs, you have to use the peer memory approach. Dmabuf is supported for CUDA, but OFI doesn’t check for it with CUDA; the proposal is to add CUDA to the dmabuf ifaces<o:p></o:p></p>
<p class="MsoNormal">Fabtests has option to allocate device memory (“-D cuda”) and option to register with FI_MR_DMABUF flag (“-R”), see common/shared.c<o:p></o:p></p>
<p class="MsoNormal">Q: There is no way for an application to detect whether peer memory or dmabuf is available. Would it be good to have an API to report that?<o:p></o:p></p>
<p class="MsoNormal">A: can include it in logging.<o:p></o:p></p>
<p class="MsoNormal">Q: We already do. We tried to add a global API (in fi_info), but it was rejected<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>