<div dir="ltr">Hello Jess,<div><br></div><div>Your arbitration request regarding the NFSoRDMA and RSockets issues found in the January 2015 Interop Logo Event has been approved, and an updated report for Intel InfiniBand HCAs is attached. I have also attached the Intel Switch Report. Please review both and, if you are satisfied with the content, respond with explicit consent to post these reports to the OpenFabrics Interoperability Logo List. Please contact me with any questions or concerns.</div><div><br></div><div>Thank you,</div><div>Dave Wyman</div><div>UNH-IOL OpenFabrics Interoperability Logo Group</div><div><br></div><div><br></div><div><br><div><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 3, 2015 at 3:10 PM, Calciano, Jess <span dir="ltr">&lt;<a href="mailto:jess.calciano@intel.com" target="_blank">jess.calciano@intel.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Hello,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Since the original arbitration request was submitted, there’s been some further discussion about the RSockets failure. With the fix for librdmacm described in
the original request, rstream ran successfully for most message sizes, but still hung with -S 1024.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Additional investigation traced the new problem to an incompatibility between the qib driver and the ibv_create_qp() function. A workaround (described below)
is available for the current OFED version, and a permanent fix to librdmacm will be included in the upcoming OFED 3.18 release.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Details:<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal" style="margin-left:.5in">The ultimate issue is still related to the qib driver being non-compliant with the ibv_create_qp() definition:<br>
<br>
<span> </span>The function ibv_create_qp() will update the qp_init_attr->cap<br>
<span> </span>struct with the actual QP values of the QP that was created;<br>
<span> </span>*** the values will be greater than or equal to the values requested. ***<br>
<span> </span><br>
Specifically, the qib driver returns an inline size that is smaller than the one requested. Rsockets has code to handle this, but that code runs in this order:<br>
<br>
<span> </span>inline_size = SOME_DEFAULT_LIKE_64<br>
<span> </span>rs_init_bufs(...);<br>
<span> </span>...<br>
<span> </span>rs_create_qp(...);<br>
<span> </span>inline_size = qp_cap->max_inline_size;<br>
<br>
The issue is that rs_init_bufs(), which allocates the buffers and registers the memory, uses the default inline size. The net result is that rsockets ends up referencing memory that is outside of the registered memory region when sending credit updates. The
lost credit update is causing the hang that you see.<br>
<br>
A quick check shows that moving the rs_init_bufs() call to after the QP has been created makes the test pass. You should also be able to work around the problem by writing the value 0 into /etc/rdma/rsocket/inline_default, which sets the default inline_size to 0.
(The actual path will depend on your configuration, so it could be under /usr/etc/rdma/... for example.) Updating the config file should work with the current version.<br>
<br>
I will provide an update to the librdmacm to handle this. That update will find its way into the 3.18 release.<br>
<br>
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u><u></u></span></p>
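<p class="MsoNormal">The reordering described above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the librdmacm code: every name in it (sketch_socket, sketch_create_qp, sketch_init_bufs, sketch_setup) is hypothetical, and the driver limit is passed in as a plain number rather than being queried from a real HCA. It only shows the buffer setup running after QP creation, so that it sees the inline size the driver actually granted.</p>

```c
/* Hypothetical stand-in for the rsockets state; not the librdmacm struct. */
struct sketch_socket {
    unsigned int inline_size;  /* inline size currently in effect */
    unsigned int buf_inline;   /* inline size the buffers were sized with */
};

/* Stand-in for QP creation: a driver such as qib may grant less than requested. */
unsigned int sketch_create_qp(unsigned int requested, unsigned int driver_limit)
{
    return requested < driver_limit ? requested : driver_limit;
}

/* Stand-in for rs_init_bufs(): records the inline size it was sized with. */
void sketch_init_bufs(struct sketch_socket *rs)
{
    rs->buf_inline = rs->inline_size;
}

/* Create the QP first, then size the buffers from the granted value. */
void sketch_setup(struct sketch_socket *rs, unsigned int requested,
                  unsigned int driver_limit)
{
    rs->inline_size = sketch_create_qp(requested, driver_limit);
    sketch_init_bufs(rs);
}
```

<p class="MsoNormal">With this ordering, the buffer setup and the send path agree on the inline size even when the driver grants less than the default.</p>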
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Thanks,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Jess Calciano<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><a name="14dbad57595896c8__MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></a></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Calciano, Jess
<br>
<b>Sent:</b> Wednesday, April 08, 2015 2:39 PM<br>
<b>To:</b> <a href="mailto:iwg-arbitration-committee@openfabrics.org" target="_blank">iwg-arbitration-committee@openfabrics.org</a><br>
<b>Cc:</b> OFA Lab Mailing List; Dave Wyman; Rupert Dance &lt;<a href="mailto:rsdance@soft-forge.com" target="_blank">rsdance@soft-forge.com</a>&gt; (<a href="mailto:rsdance@soft-forge.com" target="_blank">rsdance@soft-forge.com</a>); Cole, Cliff; Mascarenhas, Edward; Sharma, Karun; Thete, Swapna; Hefty, Sean; Yan, Philip W; Flores, Jose F<br>
<b>Subject:</b> Arbitration request for Intel QLE7340 &amp; QLE7342 HCAs (Jan 2015 OFA Interop Logo Event)<u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Hello,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Intel would like to file an arbitration request for the January 2015 OFA Interop Logo Event results for the Intel QLE7340 and QLE7342 HCAs.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">The provided report (attached for reference) shows two failing tests:
<u></u><u></u></span></p>
<p><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">1) TI NFS over RDMA</span></p>
<p><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">2) TI RSockets</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">The Intel team has investigated these results and determined that the failures are due to bugs in non-Intel components.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">NFSoRDMA:<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:.25in"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">The failure is due to a known Connectathon issue, documented here:<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:.25in"><a href="http://www.spinics.net/lists/linux-nfs/msg16460.html" target="_blank"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">http://www.spinics.net/lists/linux-nfs/msg16460.html</span></a><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">RSockets:<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-right:0in;margin-bottom:12.0pt;margin-left:.25in">
<span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#002060">ibv_modify_qp() fails because an incorrect bit is set in the qp_attr_mask that is returned from the kernel. With the Intel HCA, bit 21 of the qp_attr_mask
is set. This is not the case for a Mellanox HCA.<br>
<br>
Bit 21 is not defined for userspace. However, it was defined in the kernel as IB_QP_SMAC.<br>
<br>
If the librdmacm is modified to mask out this bit, the call succeeds and rstream runs successfully.</span><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u><u></u></span></p>
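<p class="MsoNormal">For reference, the masking workaround can be sketched as below. This is a minimal illustration and not the actual librdmacm change: the macro and function names are hypothetical, and the only fact taken from the analysis above is that bit 21 has no userspace definition. The helper clears that one bit and leaves every other attribute bit untouched before the mask would be handed to ibv_modify_qp().</p>

```c
/* Hypothetical name for the kernel-only bit; userspace verbs do not define bit 21. */
#define SKETCH_KERNEL_ONLY_QP_ATTR_BIT (1 << 21)

/* Return the mask with bit 21 cleared; all other attribute bits pass through. */
int sketch_sanitize_qp_attr_mask(int qp_attr_mask)
{
    return qp_attr_mask & ~SKETCH_KERNEL_ONLY_QP_ATTR_BIT;
}
```

<p class="MsoNormal">A mask that does not have bit 21 set is returned unchanged, so the same code path should be safe for HCAs that never set the bit.</p>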
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Please let me know if the arbitration committee needs any additional information on the analysis.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Thanks,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Jess Calciano<u></u><u></u></span></p>
</div>
</div>
</blockquote></div><br></div></div></div></div>