[ofa-general] ibnetdiscover

Hal Rosenstock hal.rosenstock at gmail.com
Wed Aug 8 04:50:36 PDT 2007


On 8/8/07, Bernd Schubert <bs at q-leap.de> wrote:
> Hello Hal,
>
> thanks for your help.
>
> On Tuesday 07 August 2007 14:36:09 Hal Rosenstock wrote:
> > On 8/7/07, Bernd Schubert <bs at q-leap.de> wrote:
> > > Hi,
> > >
> > > I have two questions about ibnetdiscover.
> > >
> > > 1) How reliable is it? Here in our testing lab ibnetdiscover works fine,
> > > detects the proper names of the two infiniband switches and all connected
> > > client cards. On a customer system it doesn't work that well. I know the
> > > client is connected to an MTS2400 switch, but ibnetdiscover detects a
> > > MT47396. The MTS2400 is connected to an MTS1400 switch, but again
> > > ibnetdiscover believes it is an MT47396. Any idea what's going on?
> >
> > Where is MT47396 being displayed ? ibnetdiscover displays the
> > NodeDescription and perhaps that is not set properly on those
> > switches. You can verify this with smpquery nodedesc.
>
> ibwarn: [30449] handle_port: NodeInfo on DR path 0,1,12,21,16,1 failed, skipping port
> #
> # Topology file: generated on Tue Aug  7 13:51:41 2007
> #
> # Max of 5 hops discovered
> # Initiated from node 0002c90200401338 port 0002c90200401339
>
> vendid=0x2c9
> devid=0xb924
> sysimgguid=0xb8cffff0024ef
> switchguid=0xb8cffff0024ef
> Switch  24 "S-000b8cffff0024ef"         # "MT47396 Infiniscale-III Mellanox Technologies" base port 0 lid 154 lmc 0
> [24]    "S-0002c9010befe970"[2]         # "MT47396 Infiniscale-III Mellanox Technologies" lid 138
> [23]    "S-0002c9010befe970"[1]         # "MT47396 Infiniscale-III Mellanox Technologies" lid 138
> [22]    "H-0002c902004013c0"[1]         # "MT23108 InfiniHost Mellanox Technologies" lid 4
> [...]
>
>
> ha-beo-2:/tmp/ofed/leuven# smpquery -vvvv -G switchinfo 0xb8cffff0024ef
> # Switch info: Lid 154
> LinearFdbCap:....................49152
> RandomFdbCap:....................0
> McastFdbCap:.....................1024
> LinearFdbTop:....................321
> DefPort:.........................0
> DefMcastPrimPort:................0
> DefMcastNotPrimPort:.............0
> LifeTime:........................18
> StateChange:.....................0
> LidsPerPort:.....................0
> PartEnforceCap:..................32
> InboundPartEnf:..................1
> OutboundPartEnf:.................1
> FilterRawInbound:................1
> FilterRawInbound:................1
> EnhancedPort0:...................0
>
>
> Sorry, but I have no idea how the output of smpquery should help me.

I was asking for smpquery nodedesc rather than switchinfo but I can
see from the ibnetdiscover output what the answer is. It is what I
originally said: that the NodeDescriptions of both those switches are
the same.
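
As a rough illustration of that point, here is a small Python sketch (not
part of ibnetdiscover or any OFED tool) that parses a few of the topology
lines quoted above and keys the nodes by GUID instead of by
NodeDescription. The names NODE_RE and nodes_by_guid are made up for this
example, and the regular expression only assumes the line layout visible
in the quoted output.

```python
import re

# Sample lines copied from the ibnetdiscover output quoted above
# (quote markers stripped).
SAMPLE = '''\
Switch  24 "S-000b8cffff0024ef"         # "MT47396 Infiniscale-III Mellanox Technologies" base port 0 lid 154 lmc 0
[24]    "S-0002c9010befe970"[2]         # "MT47396 Infiniscale-III Mellanox Technologies" lid 138
[23]    "S-0002c9010befe970"[1]         # "MT47396 Infiniscale-III Mellanox Technologies" lid 138
[22]    "H-0002c902004013c0"[1]         # "MT23108 InfiniHost Mellanox Technologies" lid 4
'''

# Hypothetical pattern: "<S|H>-<16 hex digits>" ... # "<NodeDescription>"
NODE_RE = re.compile(r'"([SH])-([0-9a-fA-F]{16})"[^#]*#\s*"([^"]*)"')


def nodes_by_guid(text):
    """Map node GUID -> (node type, NodeDescription) for each node seen."""
    nodes = {}
    for line in text.splitlines():
        match = NODE_RE.search(line)
        if match:
            kind, guid, desc = match.groups()
            nodes[guid] = ("switch" if kind == "S" else "host", desc)
    return nodes


if __name__ == "__main__":
    for guid, (kind, desc) in sorted(nodes_by_guid(SAMPLE).items()):
        print(guid, kind, desc)
```

On the sample, the two switch GUIDs come out distinct while their
NodeDescription strings are identical, which is why the description alone
cannot tell the switches apart.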

>
> >
> > > 2) ibnetdiscover also can't detect everything on the very same customer
> > > system, it shows this error
> > >
> > > ibwarn: [30449] handle_port: NodeInfo on DR path 0,1,12,21,16,1 failed,
> > > skipping port
> > >
> > > Does this mean port 1 of the last switch failed?
> >
> > It means that the peer port of port 1's SMA failed to respond to the
> > SubnGet NodeInfo. What is connected there and what state is it in ?
>
> Unfortunately I have no idea. The system is not located near us and it's
> rather difficult to ask our customer, since ibnetdiscover shows the wrong
> switch names. I will try anyway later today.

Names are not the only alternative. There are several ways to deal with this.

> >
> > > I'm also not sure about the paths, IMHO the man page of ibnetdiscover
> > > should give one more example, so
> > >
> > >
> > >
> > >       -D      use directed path address arguments. The path
> > >               is a comma separated list of out ports.
> > >               Examples:
> > >               "0"             # self port
> > >               "0,1,2,1,4"     # out via port 1, then 2, ...
> > >
> > >
> > > "out via port 1, then out via port 2, then out via port 1, ..."
> > >
> > > or
> > >
> > > "out via port 1, then in port 2, then out via port 1, ..."
> > >
> > >
> > > You see what I mean?
> >
> > It's the former. It's the out port on each hop along the path.
>
> Thanks, would you mind applying this patch?

I am no longer the maintainer for this. Sasha ?
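
For readers following the directed-path discussion, here is a minimal
Python sketch of the reading confirmed above: every entry after the
leading 0 is the out port taken at that hop. The function name
describe_dr_path is invented for this example, and treating a leading 0 as
mandatory is an assumption based only on the two man page examples.

```python
def describe_dr_path(path):
    """Spell out a comma-separated directed route path hop by hop."""
    ports = [int(p) for p in path.split(",")]
    # Assumption: a path always starts with 0, as in the man page examples.
    if ports[0] != 0:
        raise ValueError("expected the path to start with 0")
    if len(ports) == 1:
        return ["self port"]
    # Each remaining entry is the port used to leave the node at that hop.
    return [
        "hop %d: out via port %d" % (hop, port)
        for hop, port in enumerate(ports[1:], start=1)
    ]


if __name__ == "__main__":
    # The path from the ibwarn message quoted earlier in this thread.
    for step in describe_dr_path("0,1,12,21,16,1"):
        print(step)
```

Read this way, the path in the ibwarn message has five hops, and the
request that failed was the one sent out via port 1 of the node reached
after the fourth hop.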

-- Hal

> I know the text already says
> "separated list of out ports", but for those who don't believe it
> like me ;) the comment will make it a bit more convincing.
>
>
> --- ./ofa_user-1.2.orig/src/userspace/management/diags/man/ibnetdiscover.8      2007-06-21 16:39:17.000000000 +0200
> +++ ./ofa_user-1.2/src/userspace/management/diags/man/ibnetdiscover.8   2007-08-08 11:15:33.000000000 +0200
> @@ -66,7 +66,7 @@
>         is a comma separated list of out ports.
>         Examples:
>         "0"             # self port
> -        "0,1,2,1,4"     # out via port 1, then 2, ...
> +        "0,1,2,1,4"     # out via port 1, then out via port 2, ...
>  .PP
>  \-G      use GUID address argument. In most cases, it is the Port GUID.
>         Example:
>
>
> Thanks again,
> Bernd
>
>
> --
> Bernd Schubert
> Q-Leap Networks GmbH
>


