[ofa-general] RE: [Bug 465] IPoIB CM HA fails after several hours of failures
Hal Rosenstock
halr at voltaire.com
Thu Mar 29 04:34:54 PDT 2007
On Thu, 2007-03-29 at 02:09, Philippe.GREGOIRE at CEA.FR wrote:
> Michael,
> Tracing the route between the HCA port and the subnet manager gives the
> LID of the switch connected to this HCA port:
>
> [root@cors127 ~]# ibstat
> CA 'mthca0'
> CA type: MT23108
> Number of ports: 2
> Firmware version: 3.0.0
> Hardware version: a1
> Node GUID: 0x0008f10403962eb0
> System image GUID: 0x0008f10403962eb3
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 10
> Base lid: 26
> LMC: 1
> SM lid: 14
> Capability mask: 0x00110a68
> Port GUID: 0x0008f10403962eb1
> Port 2:
> State: Down
> Physical state: Polling
> Rate: 2
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x00110a68
> Port GUID: 0x0008f10403962eb2
> [root@cors127 ~]# ibtracert 26 14
> From ca {0x0008f10403962eb0} portnum 1 lid 0x1a-0x1b "cors127 HCA-1"
> [1] -> switch port {0x0005ad000001a775}[2] lid 0x2-0x2 "Cisco Switch SFS7000"
> [24] -> switch port {0x0005ad0000001834}[5] lid 0x10-0x10 "Topspin Switch - U3"
> [3] -> switch port {0x0005ad0000001830}[1] lid 0xe-0xe "Topspin Switch - U1"
> To switch {0x0005ad0000001830} portnum 0 lid 0xe-0xe "Topspin Switch - U1"
> [root@cors127 ~]# ibtracert 26 14 2>&1 | awk '(NR==2) {print $7}'
> 0x2-0x2
>
> The HCA port LID and its subnet manager LID are available under
> /sys/class/infiniband, so it's better to do:
>
> [root@cors127 ~]# ibtracert \
>     $(</sys/class/infiniband/mthca0/ports/1/lid) \
>     $(</sys/class/infiniband/mthca0/ports/1/sm_lid) 2>&1 |
>     awk '(NR==2) {sub(/-.*/, "", $7); print $7}'
> 0x2
>
> PS: redirecting stderr to stdout is required because ibtracert writes
> all of its output to stderr.
This was fixed recently, so whether the redirection is needed depends on
the ibtracert version being used.
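
Since the stream that carries the trace depends on the ibtracert version, a script that relies on NR==2 can pick the wrong line if the "From" header ends up on a different stream than the hops. The following is only a sketch of one way to make the extraction less sensitive to that: it takes the first line that starts with "[" and prints the value after the word "lid". The function name first_hop_switch_lid is hypothetical, and the output layout is assumed to match the ibtracert sample quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of infiniband-diags): read ibtracert
# output on stdin and print the LID of the first switch on the path.
# Assumes hop lines look like the sample in this thread:
#   [1] -> switch port {0x...}[2] lid 0x2-0x2 "Cisco Switch SFS7000"
first_hop_switch_lid() {
    awk '
        # The first line starting with "[" is taken to be the first hop.
        /^\[/ {
            for (i = 1; i < NF; i++) {
                if ($i == "lid") {
                    lid = $(i + 1)
                    sub(/-.*/, "", lid)   # "0x2-0x2" -> "0x2"
                    print lid
                    exit
                }
            }
        }
    '
}

# Usage on a host with an active port (sysfs paths as in the message above):
#   port=/sys/class/infiniband/mthca0/ports/1
#   ibtracert "$(cat "$port/lid")" "$(cat "$port/sm_lid")" 2>&1 | first_hop_switch_lid
```

Keeping 2>&1 in the pipeline is harmless on versions that already write to stdout, so the same invocation should work either way.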
-- Hal
> Philippe
> -------- Original Message --------
> From: general-bounces at lists.openfabrics.org on behalf of Michael S.
> Tsirkin
> Date: Wed 28/03/2007 22:12
> To: Hal Rosenstock
> Cc: Michael S. Tsirkin; general at lists.openfabrics.org;
> bugmail at lists.openfabrics.org
> Subject: Re: [ofa-general] RE: [Bug 465] IPoIB CM HA fails after
> several hours of failures
>
> > > > Not true; ibportstate can do this.
> > >
> > > I found that, yes.
> > > However, to automate this fully I need to find the lid
> > > of the switch that is connected to specific HCA ports.
> >
> > So do you have the GUID or LID of the HCA port(s) in question?
>
> Yes, that's easy to get.
>
> > > I expect ibnetdiscover can do this, but was unable to grok
> > > the output syntax.
> >
> > I'll explain once I have the answer to the above question.
> >
> > > Is it documented somewhere?
> >
> > In the man page but this may not be sufficient for your purposes.
> >
> > > Alternatively, can linkinfo be queried with saquery?
> >
> > Not currently.
>
>
>
> --
> MST