[ofa-general] OpenSM initialization error

Yicheng Jia YJia at tmriusa.com
Fri Mar 27 15:20:08 PDT 2009


> That's max'd out so should be cleared to see if it is continually incrementing.

The number doesn't change after the subnet is up.

> Get the portinfo on the SM peer port. Something like:
>
> smpquery -D portinfo 0,1
>
> assuming you are doing this from the SM node.

What if I am running it on a node other than the SM node? I don't have the
tools to run it on the SM node right now.

> So different cables are in use ? Same HCA for OpenSM though ?

The cables are all the same type. The HCAs are all the same as well.

Thanks!

Yicheng Jia





Hal Rosenstock <hal.rosenstock at gmail.com> 
03/27/2009 04:55 PM

To
Yicheng Jia <YJia at tmriusa.com>
cc
general at lists.openfabrics.org, hnrose at comcast.net
Subject
Re: [ofa-general] OpenSM initialization error






On Fri, Mar 27, 2009 at 4:51 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
>
> The "port_rcv_errors" attribute on the OpenSM side of the SM link is 65535;
> the other error attributes are all 0.

That's max'd out so should be cleared to see if it is continually 
incrementing.

Port Rcv Errors are the following:
• Local physical errors (ICRC, VCRC, LPCRC, and all physical
errors that cause entry into the BAD PACKET or BAD PACKET
DISCARD states of the packet receiver state machine)
• Malformed data packet errors (LVer, length, VL)
• Malformed link packet errors (operand, length, VL)
• Packets discarded due to buffer overrun
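
A minimal sketch, in Python, of why a pegged counter has to be cleared before it can tell you anything. It assumes PortRcvErrors is a 16-bit counter that sticks at 65535 instead of wrapping; the function names are made up for illustration and are not part of any tool.

```python
# Assumption: PortRcvErrors is a 16-bit counter that saturates at 0xFFFF.
PORT_RCV_ERRORS_MAX = 0xFFFF  # 65535


def saturating_add(counter, new_errors, max_value=PORT_RCV_ERRORS_MAX):
    """Add errors to a counter that sticks at max_value instead of wrapping."""
    return min(counter + new_errors, max_value)


def is_still_incrementing(before, after, max_value=PORT_RCV_ERRORS_MAX):
    """Compare two samples of the counter.

    Returns True or False when the samples answer the question, and None
    when the first sample is already pegged, because later errors are then
    invisible until the counter is cleared.
    """
    if before >= max_value:
        return None
    return after > before


# A pegged counter hides any further errors.
pegged = saturating_add(PORT_RCV_ERRORS_MAX, 10)
print(pegged, is_still_incrementing(PORT_RCV_ERRORS_MAX, pegged))

# After a clear, a second sample shows whether errors are still arriving.
print(is_still_incrementing(0, saturating_add(0, 3)))
```

So a reading of 65535 that "doesn't change" says nothing either way until the counter has been reset and read again.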

> How can I check SM link status on the switch side?

Get the portinfo on the SM peer port. Something like:

smpquery -D portinfo 0,1

assuming you are doing this from the SM node.
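
As a rough sketch of how that command line is put together: smpquery takes an operation, a destination, and an optional port number, and -D makes the destination a direct-route path starting at the local node (0,1 means "out of local port 1"). Without -D the destination is treated as a LID, which is one way to reach the same port from some other node. The helper name, LID 4 and port 5 below are hypothetical placeholders, not values from this subnet.

```python
def smpquery_portinfo_argv(dest, port=None, direct_route=False):
    """Build an argv list for an smpquery portinfo call (illustrative helper)."""
    argv = ["smpquery"]
    if direct_route:
        argv.append("-D")  # destination is a direct-route path, e.g. "0,1"
    argv += ["portinfo", str(dest)]
    if port is not None:
        argv.append(str(port))  # optional port number on the destination
    return argv


# Direct route from the local node, as in the command above.
print(" ".join(smpquery_portinfo_argv("0,1", direct_route=True)))

# LID-routed form; LID 4 and port 5 are made-up example values.
print(" ".join(smpquery_portinfo_argv(4, port=5)))
```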

> Is there anything I can do if the SM link port on the switch side is
> not in init state?

Let's see if that's the case first.

> I don't think it's cable issue. I have 3 unmanaged switches on different
> system, all have the same problem.

So different cables are in use ? Same HCA for OpenSM though ?

-- Hal

> Thanks!
>
> Yicheng Jia
>
>
>
>
> Hal Rosenstock <hal.rosenstock at gmail.com>
>
> 03/27/2009 03:19 PM
>
> To
> Yicheng Jia <YJia at tmriusa.com>
> cc
> general at lists.openfabrics.org, hnrose at comcast.net
> Subject
> Re: [ofa-general] OpenSM initialization error
>
>
>
>
> On Fri, Mar 27, 2009 at 4:18 PM, Hal Rosenstock
> <hal.rosenstock at gmail.com> wrote:
>> On Fri, Mar 27, 2009 at 4:13 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
>>>
>>> Yes. This persists for nearly 1 minute.
>>>
>>> One thing I notice is if I start mthca driver more than 1 minute before
>>> OpenSM starts, then there's no such error message anymore, this is gone.
>>> So I guess there's some handshaking going on between mthca driver and
>>> unmanaged switch firmware, which stabilizes communication before OpenSM
>>> starts. Do you have any idea about it?
>>
>> The physical link is brought up all on its own assuming the ports are
>> not disabled. Maybe the SM link being used is not in init yet. I don't
>> think that is well handled.
>
> What do the port counters say on both sides of the SM link ?
>
> Maybe try different cables on the SM link and see if this gets better.
>
> -- Hal
>
>> -- Hal
>>
>>> Thanks!
>>>
>>> Yicheng Jia
>>>
>>>
>>>
>>>
>>> Hal Rosenstock <hal.rosenstock at gmail.com>
>>>
>>> 03/27/2009 03:02 PM
>>>
>>> To
>>> Yicheng Jia <YJia at tmriusa.com>
>>> cc
>>> general at lists.openfabrics.org, hnrose at comcast.net
>>> Subject
>>> Re: [ofa-general] OpenSM initialization error
>>>
>>>
>>>
>>>
>>> On Fri, Mar 27, 2009 at 3:55 PM, Yicheng Jia <YJia at tmriusa.com> wrote:
>>>>
>>>> Here is the output of saquery. I am using Qlogic 9024 FC unmanaged switch.
>>>> Only one SM exists in the subnet.
>>>
>>> <snip...>
>>>
>>> Is this topology simply one switch surrounded by HCAs ? If so, I don't
>>> have a theory as to why this message persists for 1 second.
>>>
>>> -- Hal
>>>
>>>
>>> _____________________________________________________________________________
>>> Scanned by IBM Email Security Management Services powered by MessageLabs.
>>> For more information please visit http://www.ers.ibm.com
>>> _____________________________________________________________________________