<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Courier New";}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle22
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">Yes, sockets provider. Unfortunately, I’m in the middle of a lot of code updates, so it will be a while before I can finish what I am doing and get back to trying a different provider.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Kevan<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><b><span style="color:black">From: </span>
</b><span style="color:black">"Hefty, Sean" &lt;sean.hefty@intel.com&gt;<br>
<b>Date: </b>Friday, April 19, 2019 at 10:39 AM<br>
<b>To: </b>Kevan Rehm &lt;krehm@cray.com&gt;, "libfabric-users@lists.openfabrics.org" &lt;libfabric-users@lists.openfabrics.org&gt;<br>
<b>Subject: </b>RE: trouble by FI_SOURCE use - revisited<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
</div>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D">Is this running over the ‘sockets’ provider? If so, can you test with the ‘tcp’ provider instead?</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;color:#1F497D"> </span><o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-left:.5in"><a name="_____replyseparator"></a><b><span style="font-size:11.0pt">From:</span></b><span style="font-size:11.0pt"> Libfabric-users [mailto:libfabric-users-bounces@lists.openfabrics.org]
<b>On Behalf Of </b>Kevan Rehm<br>
<b>Sent:</b> Friday, April 19, 2019 7:49 AM<br>
<b>To:</b> libfabric-users@lists.openfabrics.org<br>
<b>Subject:</b> [libfabric-users] trouble by FI_SOURCE use - revisited</span><o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">Greetings,</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">I noticed this recent message from someone having the same problem as I am with the sockets provider.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black">I believe I understand the problem now.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black">My original implementation was correct, but when the first message arrives, the receiving node does not have its address in the address vector
yet and so it reports FI_ADDR_NOTAVAIL.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black">I'm converting our bootstrap routine that used PMI on Cray to work with sockets on other machines. I shall first send an extra message, containing only
the socket information needed, to the root node, which can then insert the correct address into the av. Things can then resume as before (hopefully).</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:10.0pt;font-family:'Courier New';color:black">JB</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">Having tried what JB suggests above using the v1.7.x branch of libfabric, I can report that it doesn’t work. I tried sending a client “hello” libfabric message containing the client’s
name together with its raw address to my server. The server adds the client’s raw address to its address vector, then creates a hash table entry where the key is the client’s fi_addr_t and the value is a struct identifying the client. I use FI_SOURCE so
that I can use the fi_addr_t from fi_cq_readfrom() to identify which client sent each message. What happens is that every subsequent message from that client still reports FI_ADDR_NOTAVAIL, even though the client’s address is now in the server’s address
vector. Once a client has sent a message on an endpoint, nothing in fi_av_insert() goes back and updates the existing client structs when a new address is inserted into the address vector.</span><o:p></o:p></p>
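<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">As a rough illustration of the “hello” message described above, the payload could be laid out along these lines. This is only a sketch under assumptions: the struct name hello_msg, the helper hello_pack(), and the two size limits are hypothetical, and the raw address bytes are assumed to be whatever fi_getname() returned for the client’s endpoint. Byte order of the length field is ignored here.</span><o:p></o:p></p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HELLO_NAME_MAX 64   /* hypothetical limit on client name length */
#define HELLO_ADDR_MAX 64   /* hypothetical limit on raw address length */

/* Hypothetical wire format for the client "hello" message. */
struct hello_msg {
    uint32_t addr_len;                  /* number of valid bytes in raw_addr */
    char     name[HELLO_NAME_MAX];      /* NUL-terminated client name */
    uint8_t  raw_addr[HELLO_ADDR_MAX];  /* raw endpoint address bytes */
};

/* Fill in a hello message.
 * Returns 0 on success, -1 if the name or address does not fit. */
static int hello_pack(struct hello_msg *msg, const char *name,
                      const void *raw_addr, size_t addr_len)
{
    assert(msg != NULL && name != NULL && raw_addr != NULL);

    if (strlen(name) >= HELLO_NAME_MAX || addr_len > HELLO_ADDR_MAX)
        return -1;

    memset(msg, 0, sizeof(*msg));
    msg->addr_len = (uint32_t)addr_len;
    strcpy(msg->name, name);                    /* length checked above */
    memcpy(msg->raw_addr, raw_addr, addr_len);
    return 0;
}
```

<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">On the server side, the raw_addr bytes would be what gets handed to fi_av_insert(), and the returned fi_addr_t would become the hash table key.</span><o:p></o:p></p>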
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">This seems like a bug to me. The person above previously had to implement PMI code to pass the client’s address to the server out-of-band for insertion into the server’s address
vector before the client could initiate any libfabric communication. The server had to be re-coded with PMI to receive and record these addresses. This is not always possible.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">In my case I have servers using libfabric that run persistently on Cray service nodes. Client jobs start periodically and read a JSON file to get the addresses of those servers,
together with a DRC token that gives them access to the servers’ communication domain. Clients need to pass their names together with their raw addresses to the servers for insertion into the server address vectors so that FI_SOURCE works. The servers are
not part of any job, so there is no PMI, MPI, or any other communication method available between clients and servers other than libfabric.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">As a workaround, I had to create a second endpoint in every client process, used only to send a “hello” message with the client’s name and the raw address of its main
endpoint to the servers, so that those addresses were already inserted in the server address vectors when the first real message was sent on the main endpoint. This seems overly complex to me, just to get FI_SOURCE working for a client.</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">Here’s a thought that I had. Whenever the server calls fi_av_insert() or fi_av_insertsvc(), the sockets code would walk through all existing client structs, and for each struct containing
the address FI_ADDR_NOTAVAIL, set a bit that says “the address vector has been updated, maybe this client’s fi_addr_t is available now”. The next time that a message arrives from the client, this flag is checked. If set, the code searches the address
vector for a match, one time only. If a match is found, the fi_addr_t value replaces FI_ADDR_NOTAVAIL. If no match is found, the new flag bit is cleared, so that the search does not have to be redone for every new message from that client.</span><o:p></o:p></p>
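<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">To make the idea above concrete, here is a minimal, self-contained sketch. It is not sockets provider code: toy_av, toy_peer, toy_provider, add_peer(), av_insert(), on_message() and ADDR_NOTAVAIL are all hypothetical stand-ins, the address vector is modeled as a flat array whose index plays the role of the fi_addr_t, and raw addresses are assumed to be fixed-size byte strings.</span><o:p></o:p></p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_NOTAVAIL UINT64_MAX  /* stand-in for FI_ADDR_NOTAVAIL */
#define MAX_ENTRIES   16
#define RAW_LEN       8           /* assumed fixed raw address size */

/* Toy address vector: the slot index doubles as the address handle. */
struct toy_av {
    uint8_t raw[MAX_ENTRIES][RAW_LEN];
    size_t  count;
};

/* Toy per-client state, one per peer that has sent a message. */
struct toy_peer {
    uint8_t  raw[RAW_LEN];
    uint64_t addr;     /* resolved handle, or ADDR_NOTAVAIL */
    bool     recheck;  /* set when the AV changed since the last lookup */
};

struct toy_provider {
    struct toy_av   av;
    struct toy_peer peers[MAX_ENTRIES];
    size_t          npeers;
};

/* Linear search of the address vector for a raw address. */
static uint64_t av_lookup(const struct toy_av *av, const uint8_t *raw)
{
    for (size_t i = 0; i < av->count; i++)
        if (memcmp(av->raw[i], raw, RAW_LEN) == 0)
            return (uint64_t)i;
    return ADDR_NOTAVAIL;
}

/* Record a new peer; its address may not be in the AV yet. */
static struct toy_peer *add_peer(struct toy_provider *p, const uint8_t *raw)
{
    assert(p->npeers < MAX_ENTRIES);
    struct toy_peer *peer = &p->peers[p->npeers++];
    memcpy(peer->raw, raw, RAW_LEN);
    peer->addr = av_lookup(&p->av, raw);
    peer->recheck = false;
    return peer;
}

/* Insert an address, then flag every unresolved peer for one recheck. */
static uint64_t av_insert(struct toy_provider *p, const uint8_t *raw)
{
    assert(p->av.count < MAX_ENTRIES);
    memcpy(p->av.raw[p->av.count], raw, RAW_LEN);
    for (size_t i = 0; i < p->npeers; i++)
        if (p->peers[i].addr == ADDR_NOTAVAIL)
            p->peers[i].recheck = true;
    return (uint64_t)p->av.count++;
}

/* Called when a message arrives from a peer; returns the source handle. */
static uint64_t on_message(struct toy_provider *p, struct toy_peer *peer)
{
    if (peer->addr == ADDR_NOTAVAIL && peer->recheck) {
        peer->addr = av_lookup(&p->av, peer->raw);
        peer->recheck = false;  /* search at most once per AV update */
    }
    return peer->addr;
}
```

<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">In this model the cost of an insert grows with the number of known peers, but the per-message path only pays for a lookup when the flag is set.</span><o:p></o:p></p>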
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">I haven’t tried this, but I suspect that if I call fi_av_remove() to remove an address from the server’s address vector, there is no code that walks the client structs, finds
matches on the removed fi_addr_t, and sets the client’s address back to FI_ADDR_NOTAVAIL. True? If so, a loop similar to the one above could take care of this problem as well.</span><o:p></o:p></p>
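<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">The removal case could be handled by a loop of roughly this shape. Again, this is only a sketch with hypothetical names (toy_peer, invalidate_removed(), ADDR_NOTAVAIL as a stand-in for FI_ADDR_NOTAVAIL); it assumes each client struct caches the handle it last resolved to.</span><o:p></o:p></p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_NOTAVAIL UINT64_MAX  /* stand-in for FI_ADDR_NOTAVAIL */

/* Toy per-client state: cached address handle plus the recheck flag. */
struct toy_peer {
    uint64_t addr;     /* resolved handle, or ADDR_NOTAVAIL */
    bool     recheck;  /* set when the AV gained entries since last lookup */
};

/* After `removed` has been taken out of the address vector, reset every
 * peer that had resolved to it. Returns the number of peers reset. */
static size_t invalidate_removed(struct toy_peer *peers, size_t npeers,
                                 uint64_t removed)
{
    assert(peers != NULL || npeers == 0);

    if (removed == ADDR_NOTAVAIL)
        return 0;  /* nothing valid was removed */

    size_t nreset = 0;
    for (size_t i = 0; i < npeers; i++) {
        if (peers[i].addr == removed) {
            peers[i].addr = ADDR_NOTAVAIL;
            peers[i].recheck = false;  /* no point searching until a new insert */
            nreset++;
        }
    }
    return nreset;
}
```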
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">Thoughts?</span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt">Thanks, Kevan</span><o:p></o:p></p>
</div>
</div>
</body>
</html>