[Users] OpenSM error message rosetta stone?

Narayan Desai narayan.desai at gmail.com
Tue Feb 19 12:38:47 PST 2013


On Tue, Feb 19, 2013 at 1:03 PM, Ira Weiny <weiny2 at llnl.gov> wrote:

>> It looks like some lines are being mixed; is this just a lack of a
>> newline, or are the messages interspersed?
>
> Yes, there is a bug here.  I submitted a patch, but it was rejected because the newline had already been added as part of another patch.  So, I believe this is fixed in 3.3.16.

This is just cosmetic, right?

>>
>> Does the initial path information identify the remote node having
>> troubles? How can I turn that into usable coordinates?
>
> The DR path in this case identifies the node which the SM _can_ talk to (0,1,19,13 guid 0x0002c902004158b0).  The remote node which is not responding is on port 6 of that node.  Whatever is connected to port 6 is the problem node.
>
> The easiest way to trace this using the diags would be:
>
> iblinkinfo -D 0,1,19,13
> or
> iblinkinfo -G 0x0002c902004158b0
>
> It too will fail to query port 6 but it should give you a better idea of where in the fabric you are by looking at the other nodes connected to other ports...
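
For anyone who ends up doing this for many error lines, here is a minimal sketch of a helper that turns a DR path and GUID into the two iblinkinfo invocations quoted above. It only builds the command lines and does not run them. The function names (parse_dr_path, iblinkinfo_commands) are hypothetical, and the only flags used are the -D and -G forms shown in the quoted reply; check them against the iblinkinfo version you have installed.

```python
import shlex


def parse_dr_path(text):
    """Split a directed-route path such as "0,1,19,13" into integer hops."""
    hops = [int(part.strip()) for part in text.split(",")]
    if any(hop < 0 for hop in hops):
        raise ValueError("DR path hops must be non-negative port numbers")
    return hops


def iblinkinfo_commands(dr_path, guid):
    """Build (but do not run) the two iblinkinfo command lines.

    dr_path: comma-separated DR path of the last node the SM can reach.
    guid:    GUID of that same node, as a hex string like "0x0002c9...".
    """
    hops = parse_dr_path(dr_path)
    dr_arg = ",".join(str(hop) for hop in hops)
    # Normalise the GUID to the zero-padded 64-bit hex form used in the log.
    guid_arg = "0x{:016x}".format(int(guid, 16))
    return [
        ["iblinkinfo", "-D", dr_arg],
        ["iblinkinfo", "-G", guid_arg],
    ]


if __name__ == "__main__":
    # Values taken from the example discussed in this thread.
    for cmd in iblinkinfo_commands("0,1,19,13", "0x0002c902004158b0"):
        print(shlex.join(cmd))
```

Either printed command can be pasted into a shell on a node with the diags installed; the output for the other ports of that node is what places it in the fabric.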

Thanks.
 -nld


