[openib-general] opensm fails to bring up subnet..

Eitan Zahavi eitan at mellanox.co.il
Fri Jun 3 09:47:38 PDT 2005


Hi, 
Sorry for catching up with this late in the thread. (Thanks Hal for waking
me up...)
> 
> It appears that a node is not responding to a discovery packet (SM Get
> NodeInfo (attrID 0x11)). Its direct route initial path (an array of
> port numbers at the start of the next hop) is:
> Initial path = [1][81][1] which means that starting at the node running
> OpenSM, port 1 then port 129 then port 1. Is there a large switch in the
> middle ? Can you send the output of ibnetdiscover ? If that is valid,
> which HCA (port) is not responding (what is the GUID) ?
[EZ] Normally all directed route dumps should start with: 
Initial path = [0][....
The first hop is reserved as 0 - so I wonder if the above text is a direct
quote from the osm.log?
The fact that you got a [81] there means that the packet should leave from
port 81??
I have never seen a switch with more than 24 ports...
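
For readers following the path notation, here is a minimal Python sketch of how such a dump line could be sanity-checked. It assumes the bracketed entries are hexadecimal port numbers (as the quoted explanation implies, since [81] is read as port 129) and that the first entry is expected to be 0. The function names and the 24-port limit are illustrative assumptions, not part of OpenSM.

```python
import re

def parse_initial_path(line):
    """Return the hops of an 'Initial path = [..][..]' dump line as integers.

    Assumption: each bracketed entry is a hexadecimal port number.
    """
    _, _, hops = line.partition("=")
    return [int(h, 16) for h in re.findall(r"\[([0-9A-Fa-f]+)\]", hops)]

def check_path(path, max_ports=24):
    """Return a list of warnings for a parsed path.

    max_ports is a hypothetical limit taken from the remark about
    24-port switches; adjust it for the fabric in question.
    """
    warnings = []
    if not path or path[0] != 0:
        warnings.append("first hop is not 0")
    for i, port in enumerate(path[1:], start=1):
        if port > max_ports:
            warnings.append(f"hop {i}: port {port} exceeds {max_ports}")
    return warnings

path = parse_initial_path("Initial path = [1][81][1]")
print(path)
for warning in check_path(path):
    print(warning)
```

Run against the path quoted above, this flags both points raised here: the missing leading 0 and the out-of-range port at the second entry.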

> Unfortunately on such an error osm does not appear to give up (it
> retries forever and is locked on such a node). This is obviously not
> good.
Also, Troy, if you are able to capture the entire log, it might shed some
light on the issue of "OpenSM never gives up" in such cases, which we want
to resolve.
