[Users] OpenSM error message rosetta stone?
Ira Weiny
weiny2 at llnl.gov
Tue Feb 19 14:17:25 PST 2013
On Tue, 19 Feb 2013 14:38:47 -0600
Narayan Desai <narayan.desai at gmail.com> wrote:
> On Tue, Feb 19, 2013 at 1:03 PM, Ira Weiny <weiny2 at llnl.gov> wrote:
>
> >> It looks like some lines are being mixed; is this just a lack of a
> >> newline, or are the messages interspersed?
> >
> > Yes, there is a bug here. I submitted a patch, but it was rejected because the newline had already been added as part of another patch, so I believe this is fixed in 3.3.16.
>
> This is just cosmetic, right?
Yes.
Ira
>
> >>
> >> Does the initial path information identify the remote node having
> >> troubles? How can I turn that into usable coordinates?
> >
> > The DR path in this case points to the node that the SM _can_ talk to (DR path 0,1,19,13; GUID 0x0002c902004158b0). The remote node that is not responding is attached to port 6 of that node; whatever is connected to port 6 is the problem node.
> >
> > The easiest way to trace this using the diags would be:
> >
> > iblinkinfo -D 0,1,19,13
> > or
> > iblinkinfo -G 0x0002c902004158b0
> >
> > It too will fail to query port 6, but it should give you a better idea of where you are in the fabric if you look at the nodes connected to the other ports...
>
> Thanks.
> -nld
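
For anyone scripting this, here is a minimal sketch of the reasoning quoted above. It assumes a DR path is a comma-separated list of exit ports, one per hop, so the unresponsive peer would sit one hop further out, through port 6. The helper names (peer_dr_path, iblinkinfo_commands) are hypothetical and not part of OpenSM or infiniband-diags; only the two iblinkinfo invocations come from the thread.

```python
# Hypothetical helpers for illustration only; nothing here is an OpenSM API.

def parse_dr_path(text):
    """Split a directed-route string such as "0,1,19,13" into hop ports."""
    return [int(part) for part in text.split(",")]


def peer_dr_path(text, port):
    """Directed route of whatever is cabled to `port` on the node at `text`.

    Assumption: each element of a DR path is the exit port taken at one hop,
    so the peer's route is the node's route with `port` appended.
    """
    return ",".join(str(hop) for hop in parse_dr_path(text) + [port])


def iblinkinfo_commands(dr_path, guid):
    """Build argv lists for the two iblinkinfo invocations from the thread."""
    return [
        ["iblinkinfo", "-D", dr_path],  # address the switch by directed route
        ["iblinkinfo", "-G", guid],     # address the same switch by GUID
    ]


if __name__ == "__main__":
    # Values taken from the message above.
    for argv in iblinkinfo_commands("0,1,19,13", "0x0002c902004158b0"):
        print(" ".join(argv))
    print("peer behind port 6 would be at DR path", peer_dr_path("0,1,19,13", 6))
```

This only prints the command lines; it does not run them. Pass each argv list to subprocess.run on a host that has infiniband-diags installed to actually execute the query.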
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov