[ofa-general] RE: OpenSM detection of duplicated GUIDs on loopback

Eitan Zahavi eitan at mellanox.co.il
Tue Jul 24 07:44:22 PDT 2007


Hi Hal,
 
What is this "loopback" connector used for?
Does not seem to me like a very useful thing to do.
Anyway, if it is not a production environment we could add a "debug
mode" (-d flag option) to ignore this check.
 

Eitan Zahavi 
Senior Engineering Director, Software Architect 
Mellanox Technologies LTD 
Tel:+972-4-9097208
Fax:+972-4-9593245 
P.O. Box 586 Yokneam 20692 ISRAEL 

 


________________________________

	From: Hal Rosenstock [mailto:hal.rosenstock at gmail.com] 
	Sent: Tuesday, July 24, 2007 5:31 PM
	To: OpenFabrics General
	Cc: Sasha Khapyorsky; Eitan Zahavi; Yevgeny Kliteynik
	Subject: OpenSM detection of duplicated GUIDs on loopback
	
	
	Hi,
	 
	This starts off as a "minor" issue, and I know it has been
discussed somewhat in the past:
	 
	Putting a loopback connector on a (switch) link causes OpenSM to
indicate duplicated GUID error 0D18 as follows:
	
	__osm_ni_rcv_set_links
	{
	...
	          /*
	             When there are only two nodes with exact same guids (connected back
	             to back) - the previous check for duplicated guid will not catch
	             them. But the link will be from the port to itself...
	             Enhanced Port 0 is an exception to this
	          */
	          if ((osm_node_get_node_guid( p_node ) == p_ni_context->node_guid) &&
	              (port_num == p_ni_context->port_num) &&
	              (port_num != 0))
	          {
	            osm_log( p_rcv->p_log, OSM_LOG_ERROR,
	                     "__osm_ni_rcv_set_links: ERR 0D18: "
	                     "Duplicate GUID found by link from a port to itself:"
	                     "node 0x%" PRIx64 ", port number 0x%X\n",
	                     cl_ntoh64( osm_node_get_node_guid( p_node ) ),
	                     port_num );
	...
	
	So this occurs over and over and over and fills the log with the
same spew. This should be improved IMO. 
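
One way the repeated log entries could be cut down is to report each offending (node GUID, port) pair only once. The following is a minimal standalone sketch of that idea, not OpenSM code: the `sketch_*` names, the fixed-size table, and the message text are all assumptions made for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_MAX_SEEN 64 /* arbitrary table size, chosen for the sketch */

/* Hypothetical record of (GUID, port) pairs that were already reported. */
typedef struct sketch_seen {
    uint64_t guid[SKETCH_MAX_SEEN];
    uint8_t  port[SKETCH_MAX_SEEN];
    size_t   count;
} sketch_seen_t;

/*
 * Returns true the first time a (guid, port) pair is seen and false on
 * repeats. If the table is full, new pairs are not remembered, so they
 * keep returning true (falling back to logging every time).
 */
bool sketch_should_log(sketch_seen_t *seen, uint64_t guid, uint8_t port)
{
    assert(seen != NULL);
    for (size_t i = 0; i < seen->count; i++) {
        if (seen->guid[i] == guid && seen->port[i] == port)
            return false;
    }
    if (seen->count < SKETCH_MAX_SEEN) {
        seen->guid[seen->count] = guid;
        seen->port[seen->count] = port;
        seen->count++;
    }
    return true;
}

/* Emits the self-link message only on the first report for this port. */
void sketch_report_self_link(sketch_seen_t *seen, uint64_t guid, uint8_t port)
{
    if (sketch_should_log(seen, guid, port))
        fprintf(stderr,
                "sketch: link from a port to itself: "
                "node 0x%016" PRIx64 ", port number 0x%X\n",
                guid, (unsigned)port);
}
```

A real change would also need to forget a pair once the condition clears, so that a later recurrence is reported again.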
	
	Is this really a fatal condition? It doesn't seem to me like it
should be.
	 
	Also, OpenSM can "ride" this out with -y (stay on fatal), but is
that safe for this condition?
	 
	It seems that something like an extra loopback bit should be added
to some port structure, which should cause these links to be ignored.
This bit would then be reset when the peer is no longer itself.
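
The loopback-bit idea can be sketched roughly as follows. This is a simplified standalone illustration, not the actual OpenSM port structure: `sketch_port_t`, `is_loopback`, and `sketch_update_link` are hypothetical names, and the self-link test simply mirrors the condition in the quoted excerpt (same node GUID, same port number, port number not 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified port record (not the real OpenSM structure). */
typedef struct sketch_port {
    uint64_t node_guid;
    uint8_t  port_num;
    bool     is_loopback; /* the proposed extra "loopback" bit */
} sketch_port_t;

/*
 * Re-evaluates the loopback bit from the peer reported on this port.
 * The bit is set when the peer is the port itself (port 0 excluded, as in
 * the quoted check) and is cleared as soon as the peer is anything else.
 * Returns true if the link should be used, false if it should be ignored.
 */
bool sketch_update_link(sketch_port_t *p, uint64_t peer_guid,
                        uint8_t peer_port)
{
    assert(p != NULL);
    p->is_loopback = (p->port_num != 0) &&
                     (peer_guid == p->node_guid) &&
                     (peer_port == p->port_num);
    return !p->is_loopback;
}
```

Note that this test alone cannot tell a loopback plug apart from the case described in the source comment, where two nodes with the same GUID are connected back to back on the same port number, so ignoring such links silently would also hide a genuine duplicate GUID.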
	
	Also, is there a relationship between this and the 12x/duplicated
GUID code?
	 
	Thanks.
	 
	-- Hal
