<br><br>
<div><span class="gmail_quote">On 7/24/07, <b class="gmail_sendername">Eitan Zahavi</b> <<a href="mailto:eitan@mellanox.co.il">eitan@mellanox.co.il</a>> wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><span><font face="Palatino Linotype" color="#0000ff"><strong>Hi Hal,</strong></font></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff"></font></strong></span> </div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">The code to find "duplicated" GUIDs stems from real user cases where a flawed </font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">burning procedure caused actual GUID duplications. There is nothing "impossible" about it. </font></strong></span></div></div></blockquote>
<div> </div>
<div>No one said impossible; just a violation of what globally unique (the GU in GUID) really means. We are only discussing this aspect because vendors allowed users to program non-volatile RAM for GUIDs, rather than relying on a real manufacturing process that guarantees uniqueness.
</div><br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><span><font face="Palatino Linotype" color="#0000ff"><strong>So it is really critical that the SM be able to recognize this case and abort.</strong></font></span></div></div></blockquote>
<div> </div>
<div>I agree with the detect part but not the abort part. Why can't it report these errors and continue? That seems better to me than aborting.</div>
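<div> </div>
<div>A minimal sketch of the report-and-continue idea, assuming the condition is logged once per (node GUID, port number) pair and suppressed on later sweeps. The struct <code>dup_guid_state</code> and the function <code>report_dup_guid_once</code> are hypothetical names for illustration, not existing OpenSM code, and a plain <code>FILE *</code> stands in for the OpenSM log:</div>
<pre>
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-port record; not an OpenSM structure. */
struct dup_guid_state {
    uint64_t node_guid;   /* GUID last reported for this port */
    uint8_t  port_num;    /* port number last reported */
    bool     reported;    /* true once the error has been logged */
};

/*
 * Log the duplicate-GUID condition the first time it is seen for a given
 * (node_guid, port_num) pair and stay quiet on later sweeps.
 * Returns true if a message was written, false if it was suppressed.
 */
bool report_dup_guid_once(struct dup_guid_state *st,
                          uint64_t node_guid, uint8_t port_num,
                          FILE *log)
{
    assert(st != NULL && log != NULL);

    if (st->reported && st->node_guid == node_guid &&
        st->port_num == port_num)
        return false;   /* same condition as before: do not repeat */

    fprintf(log, "ERR 0D18: duplicate GUID found by link from a port to "
                 "itself: node 0x%016llx, port number 0x%X\n",
            (unsigned long long)node_guid, (unsigned)port_num);

    st->node_guid = node_guid;
    st->port_num = port_num;
    st->reported = true;
    return true;        /* caller carries on with the sweep */
}
```
</pre>
<div>The caller would keep sweeping either way; the return value only says whether a log line was emitted, so the error is still reported but does not fill the log.</div>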
<div> </div>
<div>-- Hal</div><br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div> </div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">It might be that for testing someone wants to use a loopback plug that causes the same </font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">port GUID to appear on both sides of a link - but it is better to require the user doing the test </font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">to set some flag than to miss such a situation in a real-life cluster.</font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff"></font></strong></span> </div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">This requirement was written after many people wasted many hours trying to figure out what was going on.</font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">PLEASE DO NOT TAKE IT AWAY</font></strong></span></div><span class="q">
<div><span><strong><font face="Palatino Linotype" color="#0000ff"></font></strong></span> </div>
<p><span lang="en-gb"><b><i><font face="Monotype Corsiva" color="#0000ff" size="6">Eitan Zahavi</font></i></b><i></i></span> <br><span lang="en-gb"><font face="Tahoma" size="2">Senior Engineering Director, Software Architect
</font></span> <br><span lang="en-gb"><font face="Tahoma" size="2">Mellanox Technologies LTD</font></span> <br><span lang="en-gb"><font face="Tahoma" size="2">Tel:+972-4-9097208<br>Fax:+972-4-9593245</font></span> <br><span lang="en-gb">
<font face="Tahoma" size="2">P.O. Box 586 Yokneam 20692 ISRAEL</font></span> </p>
<div> </div><br></span>
<blockquote style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<div lang="en-us" dir="ltr" align="left">
<hr>
<font face="Tahoma" size="2"><span class="q"><b>From:</b> Hal Rosenstock [mailto:<a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:hal.rosenstock@gmail.com" target="_blank">hal.rosenstock@gmail.com</a>
] <br></span><b>Sent:</b> Tuesday, July 24, 2007 6:04 PM
<div><span class="e" id="q_113f97055d6cd0c7_5"><br><b>To:</b> Eitan Zahavi<br><b>Cc:</b> OpenFabrics General; Sasha Khapyorsky; Yevgeny Kliteynik<br><b>Subject:</b> Re: OpenSM detection of duplicated GUIDs on loopback<br>
</span></div></font><br> </div>
<div><span class="e" id="q_113f97055d6cd0c7_7">
<div></div><br><br>
<div><span class="gmail_quote">On 7/24/07, <b class="gmail_sendername">Eitan Zahavi</b> <<a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:eitan@mellanox.co.il" target="_blank">eitan@mellanox.co.il
</a>> wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><font face="Tahoma" size="2"><span><b>From:</b> Hal Rosenstock [mailto:<a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:hal.rosenstock@gmail.com" target="_blank">hal.rosenstock@gmail.com </a>]
<br></span><b>Sent:</b> Tuesday, July 24, 2007 5:53 PM<br><b>To:</b> Eitan Zahavi<br><b>Cc:</b> OpenFabrics General; Sasha Khapyorsky; Yevgeny Kliteynik<br><b>Subject:</b> Re: OpenSM detection of duplicated GUIDs on loopback
<br></font><br> </div>
<blockquote style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<div></div>Hi Eitan,<br><br>
<div><span><span class="gmail_quote">On 7/24/07, <b class="gmail_sendername">Eitan Zahavi</b> <<a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:eitan@mellanox.co.il" target="_blank">eitan@mellanox.co.il
</a>> wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><span><font face="Palatino Linotype" color="#0000ff"><strong>Hi Hal,</strong></font></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff"></font></strong></span> </div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">What is this "loopback" connector used for?</font></strong></span></div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">Does not seem to me like a very useful thing to do.</font></strong></span></div></div></blockquote>
<div><strong><font face="Palatino Linotype" color="#0000ff"></font></strong> </div>
<div>Perhaps not but no reason OpenSM can't handle this more gracefully.</div><br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><span><strong><font face="Palatino Linotype" color="#0000ff">Anyway, if it is not a production environment we could add a "debug mode" (-d flag option) to ignore this check.</font></strong></span></div></div>
</blockquote>
<div><strong><font face="Palatino Linotype" color="#0000ff"></font></strong> </div></span>
<div><span>Why would a separate flag be needed?<br></span><span><strong><font face="Palatino Linotype" color="#0000ff">[EZ] Since I do not see any other way for the SM to know it is really a loopback plug rather than two devices with the same GUID connected back to back ...
</font></strong></span></div></div></blockquote></div></blockquote>
<div>
<div> </div>
<div>"Technically", this should only occur with a loopback and not with two devices sharing the same GUID, since GUID means globally unique and a duplication indicates a "manufacturing" issue.</div>
<div> </div></div>
<div>Anyhow, can't these be treated the same (and handled more gracefully) without an additional option/flag ?</div>
<div> </div>
<div>-- Hal</div><br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<blockquote style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<div>
<div><span>
<div> </div>
<div>-- Hal</div><br>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">
<div>
<div><strong><font face="Palatino Linotype" color="#0000ff"></font></strong> </div>
<div> </div><br>
<blockquote dir="ltr" style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<div lang="en-us" dir="ltr" align="left">
<hr>
<font face="Tahoma" size="2"><b>From:</b> Hal Rosenstock [mailto:<a onclick="return top.js.OpenExtLink(window,event,this)" href="mailto:hal.rosenstock@gmail.com" target="_blank">hal.rosenstock@gmail.com</a>] <br><b>Sent:
</b>Tuesday, July 24, 2007 5:31 PM<br><b>To:</b> OpenFabrics General<br><b>Cc:</b> Sasha Khapyorsky; Eitan Zahavi; Yevgeny Kliteynik<span><br><b>Subject:</b> OpenSM detection of duplicated GUIDs on loopback<br></span></font>
<br> </div>
<div><span>
<div></div>
<div>Hi,</div>
<div> </div>
<div>This starts off as a "minor" issue, and I know it has been discussed somewhat in the past: </div>
<div> </div>
<div>Putting a loopback connector on a (switch) link causes OpenSM to indicate duplicated GUID error 0D18 as follows:<br><br>__osm_ni_rcv_set_links<br>{<br>...<br> /*<br> When there are only two nodes with exact same guids (connected back
<br> to back) - the previous check for duplicated guid will not catch<br> them. But the link will be from the port to itself...<br> Enhanced Port 0 is an exception to this<br> */
<br> if ((osm_node_get_node_guid( p_node ) == p_ni_context->node_guid) &&<br> (port_num == p_ni_context->port_num) &&<br> (port_num != 0))<br> {<br> osm_log( p_rcv->p_log, OSM_LOG_ERROR,
<br> "__osm_ni_rcv_set_links: ERR 0D18: "<br> "Duplicate GUID found by link from a port to itself:"<br> "node 0x%" PRIx64 ", port number 0x%X\n",
<br> cl_ntoh64( osm_node_get_node_guid( p_node ) ),<br> port_num );<br>...<br><br>So this occurs over and over and over and fills the log with the same spew. This should be improved IMO.
<br><br>Is this really a fatal condition ? Doesn't seem like it should be to me. </div>
<div> </div>
<div>Also, OpenSM can "ride" this out with -y (stay on fatal) but is that safe for this condition ?</div>
<div> </div>
<div>It seems that an extra loopback bit could be added to some port structure, causing these links to be ignored. This bit would then be reset when the peer is no longer the port itself. <br><br>Also, is there a relationship of this with the 12x/duplicated GUID code ?
</div>
<div> </div>
<div>Thanks.</div>
<div> </div>
<div>-- Hal<span></span></div></span></div></blockquote></div></blockquote></span></div></div><br></blockquote></div></blockquote></div><br></span></div></blockquote></div></blockquote></div><br>