[Users] interpreting ibdiagnet output
Ira Weiny
weiny2 at llnl.gov
Mon Sep 17 13:02:56 PDT 2012
On Mon, 17 Sep 2012 12:48:18 -0500
Narayan Desai <narayan.desai at gmail.com> wrote:
> Is there a canonical place that describes the errors reflected in
> ibdiagnet output, and potential resolutions? I'm trying to fix up a
> qdr fabric, and am seeing a combination of:
> - symbol errors (these are pretty clear; i'm assuming that cable
> replacement is the solution in a lot of these cases)
Fix these first.
> - ipoib speed errors (the group is at 10 gbit, but the network is
> capable of 40; is this a cosmetic error, or will it cause actual
> performance problems?)
This is likely due to the errors above: some ports came up at a slower rate (10 Gbit), and those ports joined or created the mcast group first. As a result the whole group runs slower than the fabric is capable of.
I think these will clean up once you fix the errors above.
I'm not sure how ibdiagnet reports these errors, but to check for slow ports you could run the following:
iblinkinfo -l | grep -i could
All ports which "could be" faster (either link width or speed) will be listed.
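If you want to script that check, here is a minimal Python sketch of the same filter. It assumes only what the command above relies on: that `iblinkinfo -l` prints one link per line and that slow links carry a "could be" note. The function names `slow_links` and `run_iblinkinfo` are made up for illustration and are not part of any tool.

```python
import subprocess


def slow_links(lines):
    """Return the lines that mention "could" (case-insensitive).

    This mirrors `grep -i could`: no parsing of the line layout is
    attempted, because the exact column format is not assumed here.
    """
    return [ln.rstrip("\n") for ln in lines if "could" in ln.lower()]


def run_iblinkinfo():
    """Run `iblinkinfo -l` and return its stdout as a list of lines.

    Assumes iblinkinfo is installed and on PATH, and that the caller
    has sufficient privileges to query the fabric.
    """
    result = subprocess.run(
        ["iblinkinfo", "-l"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

On a node attached to the fabric, `slow_links(run_iblinkinfo())` should return the same lines the pipeline above prints, and its length gives a quick count to compare before and after swapping cables.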
Ira
> - pages of multicast errors (not really sure what to do here)
> - nagging issues with arping (which i suspect might have to do with
> the previous multicast errors)
>
> Various vendor docs describe how to run ibdiagnet, but don't do any
> more than reproduce some output. I'm happy to RTFM, but a fair bit of
> googling has not revealed which FM to R. ;)
> -nld
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2 at llnl.gov