[openib-general] Re: opensm problem ???

Hal Rosenstock halr at voltaire.com
Tue Nov 1 03:53:53 PST 2005


-----Forwarded Message-----


From: Hal Rosenstock <halr at voltaire.com>
To: Itamar Rabenstein <itamar at mellanox.co.il>
Cc: openib-general at openib.org, Eitan Zahavi <eitan at mellanox.co.il>
Subject: Re: opensm problem ???
Date: 31 Oct 2005 16:49:58 -0500

Hi Itamar,

On Wed, 2005-10-26 at 11:25, Itamar Rabenstein wrote:
> Hi All,
> I am running openib gen2 svn rev 3872 (kernel + user).
> my system is EM64T (x86_64) + SUSE9.3 + k2.6.13.4

I've run on Opterons with 2.6.13 and a slightly older svn rev (3850). I'm
in the process of updating to the latest now that I'm back.

Do you still have this problem ? 

> I have arbel in memfree mode (fw 5.1.132) .

I don't have a memfree HCA (arbel or otherwise). It also appears you are
using more recent firmware than is generally available. Are you sure
it's unrelated to that ?

> my 2 ports are connected in loopback.

Loopback configuration works in general.

> I am running opensm but the links are not getting into ACTIVE.
> in the osm.log i see
> 
> Oct 26 16:59:25 366150 [43005960] -> __osm_vl15_poller: 1 QP0 MADs on
> wire, 1 outstanding, 0 unicasts sent, 1 total sent.
> 
> Oct 26 16:59:33 937993 [44007960] -> umad_receiver: ERR 5404: recv
> error on MAD sized umad (Interrupted system call)

It looks to me like the code in osm_vendor_ibumad.c::umad_receiver()
should handle this (the log entry just indicates that it occurred) and
reissue the umad_recv. It appears that the GetResp for NodeInfo is never
received, yet this transaction doesn't time out either, which is what I
would have expected.
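
To illustrate the "note it and reissue the receive" idea, here is a
minimal sketch in C. It is not the OpenSM code: recv_fn,
recv_retry_eintr, fake_source and fake_recv are hypothetical names made
up for this example, and the callback is only assumed to follow the
common POSIX convention of returning -1 with errno set to EINTR when a
signal interrupts the call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical receive callback: returns bytes received, or -1 with
 * errno set (assumed POSIX-style error convention). */
typedef ssize_t (*recv_fn)(void *ctx, void *buf, size_t len);

/*
 * Call fn until it succeeds or fails with something other than EINTR.
 * max_retries bounds the loop so a persistent interruption cannot spin
 * forever; *interrupted counts how often the call was interrupted.
 */
static ssize_t recv_retry_eintr(recv_fn fn, void *ctx, void *buf, size_t len,
                                int max_retries, int *interrupted)
{
    ssize_t n;
    int tries = 0;

    assert(fn != NULL && interrupted != NULL);
    *interrupted = 0;
    for (;;) {
        n = fn(ctx, buf, len);
        if (n >= 0 || errno != EINTR)
            return n;         /* success, or a real error for the caller */
        (*interrupted)++;     /* note the interruption, e.g. for logging */
        if (++tries > max_retries)
            return -1;        /* give up; errno is still EINTR */
    }
}

/* Test double standing in for the real receive call: fails with EINTR
 * a fixed number of times, then delivers a small payload. */
struct fake_source {
    int eintr_left;
};

static ssize_t fake_recv(void *ctx, void *buf, size_t len)
{
    struct fake_source *src = ctx;
    static const char payload[] = "NodeInfo";

    if (src->eintr_left > 0) {
        src->eintr_left--;
        errno = EINTR;
        return -1;
    }
    if (len < sizeof(payload)) {
        errno = EINVAL;
        return -1;
    }
    memcpy(buf, payload, sizeof(payload));
    return (ssize_t)sizeof(payload);
}
```

With a wrapper of this shape, an interrupted receive is reported once
and retried instead of being surfaced as a failure. It does nothing for
the second symptom above: a transaction whose response never arrives
still needs its own timeout.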

-- Hal

> 
> Does it works for others ?
> 
>     Itamar
