[openib-general] Re: opensm and SIGINT

Viswanath Krishnamurthy viswa.krish at gmail.com
Thu Sep 22 12:06:37 PDT 2005


Hal,

On 22 Sep 2005 14:41:04 -0400, Hal Rosenstock <halr at voltaire.com> wrote:
>
> Hi Viswa,
>
> On Thu, 2005-09-22 at 14:37, Viswanath Krishnamurthy wrote:
> > Hi Hal,
> >
> > Sure, will test it out. I see no issue in this fix. I have run the
> > following test overnight in a script with yesterday's code:
> >
> > 1. Start opensm
> > 2. Ping another node over IB
> > 3. Run osmtest (osmtest -f c, osmtest -f a)
> > 4. Kill opensm with kill -9 and repeat from step 1
> >
> > The failures are captured in a log.
> >
> > This has run more than 2500 times without resource leak issues. I saw
> > about 150 osmtest failures, which I will follow up on in another mail.
>
> Some failures are intentional (bad flow tests). Not all of them are
> obviously marked. Some of this has been documented on the list but not
> fixed yet, but I am interested in seeing what you are referring to.



I will attach the log later.

> > Once opensm failed to start correctly with SUBNET UP message in the
> > log.
>
> So the subnet didn't come up and the ports didn't become active? Just
> out of curiosity, could you unload and reload ib_umad and then start
> opensm when that occurs to see if that fixes things? I'm not sure it
> would.



I do not think this would help. The system is never rebooted; only opensm is
started and stopped. On the next opensm start/stop cycle the subnet came up. I
think it is more of an opensm issue than a kernel module issue.
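
For reference, the start/test/kill loop described earlier in this thread
could be sketched roughly as below. This is only an illustration, not the
actual script: PEER_ADDR, the 30 second settle delay, the log file name and
all function names are assumptions. The only commands taken from the thread
are opensm, osmtest -f c, osmtest -f a, and killing opensm with signal 9.

```python
import subprocess
import time

# Hypothetical IPoIB address of the other node; replace with a real one.
PEER_ADDR = "192.168.0.2"

# Steps 2 and 3 of the test: ping over IB, then the two osmtest runs.
STEPS = [
    ["ping", "-c", "1", PEER_ADDR],
    ["osmtest", "-f", "c"],
    ["osmtest", "-f", "a"],
]


def run_step(cmd):
    """Run one command and return its exit status."""
    return subprocess.call(cmd)


def start_opensm():
    """Step 1: start opensm as a child process."""
    return subprocess.Popen(["opensm"])


def kill_opensm(proc):
    """Step 4: send SIGKILL (same effect as kill -9) and reap the child."""
    proc.kill()
    proc.wait()


def run_cycle(iteration, start_sm, kill_sm, run=run_step, steps=STEPS, settle=0):
    """Run one start/test/kill cycle and return a list of failure lines."""
    failures = []
    sm = start_sm()
    try:
        # Assumed delay to give the SM time to bring the subnet up.
        time.sleep(settle)
        for cmd in steps:
            status = run(cmd)
            if status != 0:
                failures.append(
                    "iteration %d: %s exited with %d"
                    % (iteration, " ".join(cmd), status)
                )
    finally:
        # Always kill the SM, even if a step raised an exception.
        kill_sm(sm)
    return failures


def main(iterations=2500, log_path="opensm_stress.log"):
    """Repeat the cycle and append any failures to a log file."""
    with open(log_path, "a") as log:
        for i in range(1, iterations + 1):
            for line in run_cycle(i, start_opensm, kill_opensm, settle=30):
                log.write(line + "\n")
                log.flush()
```

Calling main() runs the full loop. The start, kill and run hooks are passed
in as arguments so the cycle logic can be exercised without an IB fabric.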

Thanks.
>
> -- Hal
>
>