<div dir="ltr">I forgot to CC the list; here is my reply to Arun:<div><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Valentino Picotti</b> <span dir="ltr"><<a href="mailto:valentino.picotti@gmail.com">valentino.picotti@gmail.com</a>></span><br>Date: 8 September 2016 at 14:00<br>Subject: Re: [libfabric-users] Optimisation Tips for verbs provider<br>To: "Ilango, Arun" <<a href="mailto:arun.ilango@intel.com">arun.ilango@intel.com</a>><br><br><br><div dir="ltr">Thanks for the reply,<div><br></div><div>I ran the fi_msg_bw test with a CQ size and window size of 1 and got the following result:</div><div><p style="margin:0px;font-size:12px;line-height:normal;font-family:'andale mono';color:rgb(41,249,20);background-color:rgb(0,0,0)"><span>bytes   iters   total       time     MB/sec    usec/xfer   Mxfers/sec</span></p>
<p style="margin:0px;font-size:12px;line-height:normal;font-family:'andale mono';color:rgb(41,249,20);background-color:rgb(0,0,0)"><span>512k    1m      488g      189.50s   2766.76     189.50       0.01</span></p></div><div>21.61 Gbps is an excellent result. I'm using libfabric 1.3.0 from the latest tarball. </div><div><br></div><div>So the problem is in my transport layer.</div><div>My fabric initialisation doesn't differ much from the one in fi_msg_bw, so the problem might be in the main loop.</div><div>At first glance, it seems that I call fi_cq_read less often than the bw test does.</div><div>In the test the sequence is:</div><div>- post a work request (ft_post_tx/rx)</div><div>- spin on a completion (bw_(tx/rx)_comp)</div><div><br></div><div>In my client/server main loop the sequence is:</div><div>- call fi_cq_read</div><div>- post a work request</div><div><br></div><div>I don't spin waiting for completions; could this be the reason?</div><div><br></div><div>Thanks,</div><div>Valentino</div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On 7 September 2016 at 19:17, Ilango, Arun <span dir="ltr"><<a href="mailto:arun.ilango@intel.com" target="_blank">arun.ilango@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">





<div lang="EN-US" link="#0563C1" vlink="#954F72">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Hi Valentino,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Libfabric has a set of tests available at <a href="https://github.com/ofiwg/fabtests" target="_blank">https://github.com/ofiwg/fabtests</a>. Can you run the fi_msg_bw test with the same size and iterations on your setup and
 check if you notice any variance? Also what version/commit number of libfabric are you using?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Thanks,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d">Arun.<u></u><u></u></span></p>
<p class="MsoNormal"><a name="m_6693636125052867833_m_639103972498807348__MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1f497d"><u></u> <u></u></span></a></p>
<p class="MsoNormal"><a name="m_6693636125052867833_m_639103972498807348______replyseparator"></a><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Libfabric-users [mailto:<a href="mailto:libfabric-users-bounces@lists.openfabrics.org" target="_blank">libfabric-users-bounces@lists.openfabrics.org</a>]
<b>On Behalf Of </b>Valentino Picotti<br>
<b>Sent:</b> Wednesday, September 07, 2016 7:48 AM<br>
<b>To:</b> <a href="mailto:libfabric-users@lists.openfabrics.org" target="_blank">libfabric-users@lists.openfabrics.org</a><br>
<b>Subject:</b> [libfabric-users] Optimisation Tips for verbs provider<u></u><u></u></span></p><div><div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">Hi all,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I apologise in advance for the long email.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">In the past month I've integrated libfabric into a project based on InfiniBand verbs, with the aim of making it provider independent. This project has a transport layer that makes the application independent of the transport implementation (which
 is chosen at compile time).<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I worked only on the libfabric implementation of the transport layer and this was my first experience with RDMA APIs and hardware. What I did was to map the various ibv_* and rdma_* calls to fi_* calls and I got a working layer quite easily
 (after studying the libfabric terminology).<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Now I'm trying to achieve the same performance as raw verbs.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I'm testing the transport layer with one-way (unidirectional) communication, where a client sends data to a server with the message API (fi_send/fi_recv). The client and the server run on two different nodes connected by one IB EDR link. I don't
 set processor affinity or change the power management policy. The depth of the completion queues and the size of the sent buffers are the same across the tests.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Running on the verbs transport layer I get a stable bandwidth of 22 Gbps, whereas with libfabric over verbs the bandwidth fluctuates widely: from 0.4 Gbps to 19 Gbps in the same test[1]. The bandwidth is calculated from the number of buffers
 sent every 5 seconds.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">This is how I set up the verbs provider:<u></u><u></u></p>
</div>
<p class="MsoNormal"><br>
  m_hints->caps = FI_MSG;<br>
  m_hints->mode = FI_LOCAL_MR;<br>
  m_hints->ep_attr->type = FI_EP_MSG;<br>
  m_hints->domain_attr->threading = FI_THREAD_COMPLETION;<br>
  m_hints->domain_attr->data_progress = FI_PROGRESS_MANUAL;<br>
  m_hints->domain_attr->resource_mgmt = FI_RM_DISABLED;<br>
  m_hints->fabric_attr->prov_name = strdup("verbs");<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Furthermore, I bind two completion queues to the endpoints: one with the FI_SEND flag and the other with FI_RECV.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I can't figure out why I'm getting such high variance with libfabric.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Do you have any ideas? Am I missing some optimisation tips for the verbs provider?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Thanks in advance,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Valentino<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">[1] Test run with a queue depth of 1 and a buffer size of 512 KB <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Example of a test output with libfabric:<u></u><u></u></p>
</div>
<div>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:10:56 t_server: INFO: Accepted connection<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:10:56 t_server: INFO: Start receiving...<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:01 t_server: INFO: Bandwith: 8.3324 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:06 t_server: INFO: Bandwith: 15.831 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:11 t_server: INFO: Bandwith: 19.1713 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:16 t_server: INFO: Bandwith: 10.8825 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:21 t_server: INFO: Bandwith: 8.07991 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:26 t_server: INFO: Bandwith: 15.4015 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:31 t_server: INFO: Bandwith: 20.4263 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:36 t_server: INFO: Bandwith: 19.7023 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:41 t_server: INFO: Bandwith: 10.474 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:46 t_server: INFO: Bandwith: 17.4072 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:51 t_server: INFO: Bandwith: 0.440402 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:11:56 t_server: INFO: Bandwith: 2.73217 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:01 t_server: INFO: Bandwith: 0.984822 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:06 t_server: INFO: Bandwith: 2.93013 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:11 t_server: INFO: Bandwith: 0.847248 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:16 t_server: INFO: Bandwith: 7.72255 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:21 t_server: INFO: Bandwith: 14.7849 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:26 t_server: INFO: Bandwith: 12.9243 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:31 t_server: INFO: Bandwith: 0.687027 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:36 t_server: INFO: Bandwith: 1.44787 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 15:12:41 t_server: INFO: Bandwith: 2.681 Gb/s<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Example of a test output with raw verbs:<u></u><u></u></p>
</div>
<div>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:00 t_server: INFO: Accepted connection<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:00 t_server: INFO: Start receiving...<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:05 t_server: INFO: Bandwith: 17.9491 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:10 t_server: INFO: Bandwith: 23.4671 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:15 t_server: INFO: Bandwith: 23.0368 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:20 t_server: INFO: Bandwith: 22.9638 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:25 t_server: INFO: Bandwith: 22.8203 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:30 t_server: INFO: Bandwith: 20.058 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:35 t_server: INFO: Bandwith: 22.5033 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:40 t_server: INFO: Bandwith: 20.1754 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:45 t_server: INFO: Bandwith: 22.5578 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:50 t_server: INFO: Bandwith: 20.0588 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:36:55 t_server: INFO: Bandwith: 22.2718 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:00 t_server: INFO: Bandwith: 22.494 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:05 t_server: INFO: Bandwith: 23.1836 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:10 t_server: INFO: Bandwith: 23.0972 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:15 t_server: INFO: Bandwith: 21.5033 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:20 t_server: INFO: Bandwith: 18.5506 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:25 t_server: INFO: Bandwith: 20.3709 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:30 t_server: INFO: Bandwith: 21.3457 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:35 t_server: INFO: Bandwith: 20.5059 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:40 t_server: INFO: Bandwith: 22.4899 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:45 t_server: INFO: Bandwith: 22.1266 Gb/s<u></u><u></u></span></p>
<p style="margin:0in;margin-bottom:.0001pt;background:black"><span style="font-size:9.0pt;font-family:"andale mono",serif;color:#29f914">2016-09-07 - 16:37:50 t_server: INFO: Bandwith: 22.4504 Gb/s<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div>
</div></div></div>
</div>

</blockquote></div><br></div>
</div></div></div><br></div></div>