Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch

Ira Weiny weiny2 at llnl.gov
Tue Apr 21 11:27:17 PDT 2009


Hi Yicheng,

On Fri, 17 Apr 2009 15:07:54 -0500
Yicheng Jia <YJia at tmriusa.com> wrote:

> Hi Ira,
> 
> > Yea that is going to be a problem.  The problem is that effectively you
> > just disabled the connection to the switch.  A reset disables then enables
> > the port.  Once the port is disabled the command can't talk to the switch
> > any longer.  You will have to either reset the switch (power cycle) or go
> > to another node and enable the port.  From the output you sent me it looks
> > like you don't have any other nodes on the switch, so I take it you are
> > resetting the switch to get the link to come back?
> 
> Thanks for your explanation; now everything is clear. Can I do a "reset" as
> down/enable instead of down/disable/enable, so that I can reset the peer
> port on the switch?

Not that I know of.
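
To spell out the constraint: since a reset disables and then enables the port,
it is only safe when the node issuing it does not sit behind the port being
bounced.  A rough Python sketch of that check follows.  The function name and
its arguments are made up for illustration; the only real command form used is
the `ibportstate <lid> <port> reset` one quoted in this thread.

```python
def reset_command(switch_lid, switch_port, peer_lid, local_lid):
    """Return the ibportstate argv for a port reset, or None if it is unsafe.

    Hypothetical helper: peer_lid is the LID of the node attached to the
    switch port, local_lid is the LID of the node running the command.
    """
    if peer_lid == local_lid:
        # Resetting our own uplink would cut us off after the disable step,
        # so the enable step could never be delivered.
        return None
    return ["ibportstate", str(switch_lid), str(switch_port), "reset"]


# Values from the output in this thread: switch LID 7, port 1, peer LID 6,
# and the local HCA is also LID 6, so the reset must come from elsewhere.
print(reset_command(7, 1, peer_lid=6, local_lid=6))  # None
# A port whose peer is some other node (LID 9 is invented here) is fine.
print(reset_command(7, 2, peer_lid=9, local_lid=6))
```

The returned list could be handed to something like `subprocess.run` on a node
other than the one attached to the port.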

> 
> > I thought there was a warning in the man page or in the help regarding
> > this situation but I don't see it now.
> 
> There's a warning if I try to reset a port which is not on the switch.
> 
> > BTW, What are you trying to achieve with this command?
> 
> Sometimes there is a 1x link on the subnet after we reboot our system, which
> consists of several HCA nodes connected directly to the switch. By
> reboot, I mean restarting each node. I am trying to get back to 4x link
> width by using this command on the 1x link port. Do you have a better idea
> for resolving this problem?

The way we do it around here is to go to another node and issue the request.
I have a perl script in my "pragmatic infiniband utilities" (ibbouncelinks.pl)
which will skip the port it is running on.  That tarball can be found here:

https://computing.llnl.gov/linux/piu.html

I did not code an option to look for 1X links but I think it would be simple
to do so.
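
As a rough idea of what such an option might look like, here is a Python
sketch (not taken from ibbouncelinks.pl) that scans `iblinkinfo.pl -R` style
output for links that came up at 1X and skips any link whose peer is the local
node.  The regular expression is only an assumption based on the line layout
shown in this thread, the 1X line in the sample is invented, and
`find_1x_links` and `local_lid` are hypothetical names.

```python
import re

# Assumed layout: [switch LID] port[ ] ==( width speed Gbps state / phys)==> [peer LID port][ ] "name"
LINK_RE = re.compile(
    r"^\s*(?:(?P<lid>\d+)\s+)?(?P<port>\d+)\[[^\]]*\]\s*"
    r"==\(\s*(?P<width>\d+X)\s+(?P<speed>[\d.]+) Gbps\s+"
    r"(?P<state>\w+)\s*/\s*(?P<phys>\w+)\s*\)==>"
    r"\s*(?:(?P<peer_lid>\d+)\s+(?P<peer_port>\d+))?\[[^\]]*\]"
)


def find_1x_links(report, local_lid=None):
    """Return (switch_lid, port) pairs for links that are up at 1X width."""
    hits = []
    switch_lid = None
    for line in report.splitlines():
        m = LINK_RE.match(line)
        if not m:
            continue  # header lines such as "Switch 0x...:" do not match
        if m.group("lid"):
            switch_lid = int(m.group("lid"))  # LID is printed on the first port only
        if m.group("width") != "1X" or m.group("phys") != "LinkUp":
            continue
        peer = m.group("peer_lid")
        if local_lid is not None and peer is not None and int(peer) == local_lid:
            continue  # never bounce the link we are talking through
        hits.append((switch_lid, int(m.group("port"))))
    return hits


# Sample modelled on the output in this thread; the port 2 line is invented.
SAMPLE = """\
Switch 0x00066a00d90009c1 InfiniCon System InfinIO 9024 Lite:
      7    1[  ]  ==( 4X 2.5 Gbps Active /   LinkUp)==>       6    1[  ] "MT2520 4 InfiniHostLx Mellanox Technologies" (  )
           2[  ]  ==( 1X 2.5 Gbps Active /   LinkUp)==>       9    1[  ] "hypothetical node" (  )
           3[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
"""

print(find_1x_links(SAMPLE, local_lid=6))  # [(7, 2)]
```

Each pair returned could then be reset from a node that is not on that link.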

Thanks,
Ira

> 
> Thanks!
> Yicheng Jia
> 
> Ira Weiny <weiny2 at llnl.gov>
> 04/17/2009 12:45 PM
> 
> To: Yicheng Jia <YJia at tmriusa.com>
> cc: general at lists.openfabrics.org, Hal Rosenstock <hal.rosenstock at gmail.com>
> Subject: Re: ***SPAM*** Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
> 
> On Fri, 17 Apr 2009 12:06:43 -0500
> Yicheng Jia <YJia at tmriusa.com> wrote:
> 
> > Hi Ira,
> > 
> > Here is the output of "iblinkinfo.pl -R":
> > 
> > ++++++++++++++++++++++++++++++++++++++++++++++++
> > [root at ib_manager ~]# iblinkinfo.pl -R
> > Switch 0x00066a00d90009c1 InfiniCon System InfinIO 9024 Lite:
> >       7    1[  ]  ==( 4X 2.5 Gbps Active /   LinkUp)==>       6    1[  ] "MT2520 4 InfiniHostLx Mellanox Technologies" (  )
> >            2[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
> >            3[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
> >            4[  ]  ==( 4X 2.5 Gbps   Down /  Polling)==>             [  ] "" (  )
> 
> [snip]
> 
> > 
> > And the "ibstat" output:
> > +++++++++++++++++++++++++++++++++++
> > [root at ib_manager ~]# ibstat
> > CA 'mthca0'
> >         CA type: MT25204
> >         Number of ports: 1
> >         Firmware version: 1.2.0
> >         Hardware version: a0
> >         Node GUID: 0x0002c90200230784
> >         System image GUID: 0x0002c90200230787
> >         Port 1:
> >                 State: Active
> >                 Physical state: LinkUp
> >                 Rate: 10
> >                 Base lid: 6
> >                 LMC: 0
> >                 SM lid: 6
> >                 Capability mask: 0x02500a6a
> >                 Port GUID: 0x0002c90200230785
> > ++++++++++++++++++++++++++++++++++++++++++++++
> > 
> > The reset command I am using is "ibportstate 7 1 reset", I also tried 
> > "ibportstate -D 0,1 1 reset", and it fails with the same result.
> >
> 
> Yea that is going to be a problem.  The problem is that effectively you
> just disabled the connection to the switch.  A reset disables then enables
> the port.  Once the port is disabled the command can't talk to the switch
> any longer.  You will have to either reset the switch (power cycle) or go to
> another node and enable the port.  From the output you sent me it looks like
> you don't have any other nodes on the switch, so I take it you are resetting
> the switch to get the link to come back?
> 
> I thought there was a warning in the man page or in the help regarding this
> situation but I don't see it now.
> 
> Also, this becomes worse if you disable the port the SM is on.  (Which I see
> you are doing.)  So you will have a noticeable delay while the SM rescans
> the network which it is now seeing "again" for the first time.
> 
> BTW, What are you trying to achieve with this command?
> 
> Hope this helps,
> Ira
> 
> 


-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
weiny2 at llnl.gov
