Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch

Ira Weiny weiny2 at llnl.gov
Thu Apr 16 17:53:03 PDT 2009


Yicheng,

I am hoping your system is small when I ask this: could you send the output of "iblinkinfo.pl -R" from when everything is up and running?

Also, please send the ibstat output from the node you are attempting the reset on, as well as the exact reset command you are using.
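
In case it helps to gather these in one go, here is a minimal Python sketch that runs the two commands named above and collects their output into a single report. The names run_command and collect_diagnostics are made up for illustration, and it assumes both tools are on the PATH of the node in question. It only reads state; it does not issue any reset.

```python
import subprocess
from typing import Callable, Sequence

# The two read-only commands requested in this message.
DIAGNOSTIC_COMMANDS = (
    ("iblinkinfo.pl", "-R"),
    ("ibstat",),
)


def run_command(argv: Sequence[str]) -> str:
    """Run one command and return stdout followed by stderr as text."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect_diagnostics(
    runner: Callable[[Sequence[str]], str] = run_command,
    commands: Sequence[Sequence[str]] = DIAGNOSTIC_COMMANDS,
) -> str:
    """Run each command in order and return one report, one section per command."""
    sections = []
    for argv in commands:
        header = "$ " + " ".join(argv)
        try:
            output = runner(argv)
        except OSError as exc:
            # For example, the tool is not installed on this node.
            output = f"(could not run: {exc})\n"
        sections.append(header + "\n" + output.rstrip("\n") + "\n")
    return "\n".join(sections)
```

Calling print(collect_diagnostics()) on the node writes both outputs, each under a "$ command" header, so the result can be pasted into a reply as is. The runner argument is there only so the function can be exercised without InfiniBand hardware.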

Thanks,
Ira

On Thu, 16 Apr 2009 19:12:00 -0400
Hal Rosenstock <hal.rosenstock at gmail.com> wrote:

> On Thu, Apr 16, 2009 at 6:06 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
> >
> >> There's a race condition here that I was asking about. If the link
> >> initialization takes too long and doesn't complete (gets to init)
> >> prior to the enable trying to be sent to the switch, then you could
> >> see these results but since it's DOWN until reboot it's something
> >> different.
> >
> > I did the "reset" when the ports on both sides of the link are in INIT state
> > and LinkUp phys state.
> >
> >> If the disable/wait/enable worked that would've been another story.
> >
> > It fails too. Both ports go to DOWN after disable is issued and never come
> > back. How long am I supposed to wait?
> 
> Ideally you would see init before doing the enable but sounds like
> that's not occurring. Either you need low-level debug to see why the
> link does not initialize at that point or get support from your
> CA/switch vendor(s). What's your CA device ?
> 
> -- Hal
> 
> > Thanks!
> > Yicheng Jia
> >
> >
> >
> >
> > Hal Rosenstock <hal.rosenstock at gmail.com>
> >
> > 04/16/2009 03:26 PM
> >
> > To
> > Yicheng Jia <YJia at tmriusa.com>
> > cc
> > Nicolas Morey-Chaisemartin <devel-ofed at morey-chaisemartin.com>,
> > general at lists.openfabrics.org
> > Subject
> > Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
> >
> >
> >
> >
> > On Thu, Apr 16, 2009 at 3:47 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
> >>
> >>> Are you resetting the switch from the peer HCA port or some other port
> >>> ? That's what Nicolas asked but I might have missed the answer.
> >>
> >> Yes, I am trying to reset from the peer HCA port. Is anything wrong with
> >> this?
> >
> > There's a race condition here that I was asking about. If the link
> > initialization takes too long and doesn't complete (gets to init)
> > prior to the enable trying to be sent to the switch, then you could
> > see these results but since it's DOWN until reboot it's something
> > different.
> >
> >>> Also, try disable (wait) and then enable and see if that works.
> >>
> >> It remains the same: the switch port is DOWN forever. No SMP message could
> >> get to the switch port.
> >
> > Right; in down, the SMP can't be sent.
> >
> >>> If I recall correctly, you had those links which are taking a long time
> >>> to initialize. If the link stays down forever after disable, this won't
> >>> work but I want to be sure.
> >>
> >> This is a separate issue.
> >
> > Since the link stays down yes. If the disable/wait/enable worked that
> > would've been another story.
> >
> > -- Hal
> >
> >> The "reset" command is tested on a single-port HCA
> >> directly connected to the Qlogic switch. The HCA is plugged into a Linux
> >> machine. It is the simplest test environment.
> >>
> >> Thanks!
> >> Yicheng Jia
> >>
> >>
> >>
> >>
> >> Hal Rosenstock <hal.rosenstock at gmail.com>
> >>
> >> 04/16/2009 02:29 PM
> >>
> >> To
> >> Yicheng Jia <YJia at tmriusa.com>
> >> cc
> >> Nicolas Morey-Chaisemartin <devel-ofed at morey-chaisemartin.com>,
> >> general at lists.openfabrics.org
> >> Subject
> >> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
> >>
> >>
> >>
> >>
> >> On Thu, Apr 16, 2009 at 3:20 PM, Hal Rosenstock
> >> <hal.rosenstock at gmail.com> wrote:
> >>> On Thu, Apr 16, 2009 at 3:18 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
> >>>>
> >>>> They both are POLLING before "reset".
> >>>
> >>> Then they _should_ come back to INIT.
> >>>
> >>> What does the local LDDS value say after reset ? Any way to get the
> >>> switch port LDDS value ?
> >>
> >> Are you resetting the switch from the peer HCA port or some other port
> >> ? That's what Nicolas asked but I might have missed the answer.
> >>
> >> Also, try disable (wait) and then enable and see if that works. If I
> >> recall correctly, you had those links which are taking a long time to
> >> initialize. If the link stays down forever after disable, this won't
> >> work but I want to be sure.
> >>
> >> -- Hal
> >>
> >>> -- Hal
> >>>
> >>>> Thanks!
> >>>> Yicheng Jia
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Hal Rosenstock <hal.rosenstock at gmail.com>
> >>>>
> >>>> 04/16/2009 01:53 PM
> >>>>
> >>>> To
> >>>> Yicheng Jia <YJia at tmriusa.com>
> >>>> cc
> >>>> Nicolas Morey-Chaisemartin <devel-ofed at morey-chaisemartin.com>,
> >>>> general at lists.openfabrics.org
> >>>> Subject
> >>>> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Apr 16, 2009 at 11:12 AM, Yicheng Jia <YJia at tmriusa.com> wrote:
> >>>>>
> >>>>> Hi Nicolas,
> >>>>>
> >>>>> After this "reset" command, both ports are DOWN forever, I can only get
> >>>>> portinfo from local port.
> >>>>>
> >>>>> I am sure that the port that has been reset is not the local port;
> >>>>> otherwise it would prompt a "node type not switch" error.
> >>>>>
> >>>>> I tried to enable this switch port from another port and brought it to
> >>>>> POLLING state, but as long as I use "reset", both ports are DOWN.
> >>>>
> >>>> What are the peer port's LinkDownDefaultStates ? Sounds like one or
> >>>> more must be Sleeping rather than Polling for some reason.
> >>>>
> >>>> -- Hal
> >>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Yicheng Jia
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Nicolas Morey-Chaisemartin <devel-ofed at morey-chaisemartin.com>
> >>>>>
> >>>>> 04/16/2009 12:43 AM
> >>>>>
> >>>>> To
> >>>>> Yicheng Jia <YJia at tmriusa.com>
> >>>>> cc
> >>>>> general at lists.openfabrics.org
> >>>>> Subject
> >>>>> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> By any chance, could you have reset the port you're on?
> >>>>> Have you tried using another node to enable the port again?
> >>>>>
> >>>>> Nicolas
> >>>>>
> >>>>> Le 16/04/2009 00:45, Yicheng Jia a écrit :
> >>>>>>
> >>>>>> Hello Randy,
> >>>>>>
> >>>>>> I am trying to run "ibportstate reset" to reset the switch port on the
> >>>>>> other side in order to get 4x link. However I get the following error:
> >>>>>> ibwarn: [19660] mad_rpc: _do_madrpc failed; dport (Lid 7)
> >>>>>> ibportstate: iberror: failed: smp set portinfo failed
> >>>>>>
> >>>>>> And the port status changes to DOWN after this. Have you ever tried to
> >>>>>> run "ibportstate" to reset the switch port?
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Yicheng Jia
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> ------------------------------
> >>>>>>
> >>>>>> Message: 2
> >>>>>> Date: Wed, 4 Mar 2009 18:39:54 -0600
> >>>>>> From: Randy Halverson <randy.halverson at qlogic.com>
> >>>>>> Subject: [ofa-general] link width problem of Qlogic 9024 unmanaged
> >>>>>> switch
> >>>>>> To: "'general at lists.openfabrics.org'" <general at lists.openfabrics.org>
> >>>>>> Message-ID:
> >>>>>> <88EC963376E93B4DB0F2A69D932F786903D89CAF at MNEXMB2.qlogic.org>
> >>>>>> Content-Type: text/plain; charset="us-ascii"
> >>>>>>
> >>>>>> Hello Yicheng,
> >>>>>>
> >>>>>> After checking internally, this appears to be a known problem with
> >>>>>> older firmware for the 9024FC switches.
> >>>>>>
> >>>>>> It appears that you or another person at 'tmriusa.com' has recently
> >>>>>> opened a case with QLogic Tech Support for this issue. Please continue
> >>>>>> to work with QLogic Tech Support on firmware upgrade resolution since
> >>>>>> you probably don't have our FastFabric Tools to manage the 9024FC
> >>>>>> switches.
> >>>>>>
> >>>>>> Regards,
> >>>>>>
> >>>>>> Randy
> >>>>>> Technical Support
> >>>>>> QLogic Corporation
> >>>>>>
> >>>>>> _____________________________________________________________________________
> >>>>>> Scanned by IBM Email Security Management Services powered by
> >>>>>> MessageLabs. For more information please visit http:// www. ers.ibm.com
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> general mailing list
> >>>>>> general at lists.openfabrics.org
> >>>>>> http:// lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> >>>>>>
> >>>>>> To unsubscribe, please visit
> >>>>>> http:// openib.org/mailman/listinfo/openib-general


-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
weiny2 at llnl.gov


