[openib-general] [PATCH] validate MADs issued from userspace for spec compliance C13-18.1.1

Hal Rosenstock halr at voltaire.com
Wed Jul 12 14:49:56 PDT 2006


On Wed, 2006-07-12 at 13:58, Sean Hefty wrote:
> >> I was starting / stopping openSM on different systems soon before running the
> >> tests.
> >
> >Not sure I quite understand the sequencing.
> 
> I was being somewhat random, just trying to stress things.  

> How quickly will one SM take over for another after one dies?

With the default sminfo_polling_timeout of 10 seconds and the default
polling_retry_number of 4, the total handoff time should be around 40
seconds. I just ran that experiment with 2 SMs and saw that as well.
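
As a rough sketch of that arithmetic (not OpenSM code; the function name
is made up for illustration, and it assumes the standby SM gives up on the
master only after polling_retry_number consecutive polls, spaced
sminfo_polling_timeout apart, go unanswered):

```python
def estimated_handoff_seconds(sminfo_polling_timeout_s=10, polling_retry_number=4):
    """Estimate how long a standby SM waits before taking over.

    Hypothetical helper for illustration only. Assumes each missed poll
    costs one full polling timeout and that takeover starts once all
    retries are exhausted.
    """
    return sminfo_polling_timeout_s * polling_retry_number


if __name__ == "__main__":
    # Defaults quoted above: 10 second timeout, 4 retries.
    print(estimated_handoff_seconds())
```

Lowering either value in this model shortens the estimate proportionally,
e.g. a 5 second timeout with 3 retries gives about 15 seconds.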

> >Can you run with -V and send me the output ? I want to recreate this so
> >I understand what is going on.
> 
> I'm having trouble re-creating the error at the moment, but I isolated my test
> systems from our larger cluster.  I will need to reconnect to the cluster and
> see if I can cause the error again.

That's another difference. I've never run osmtest in a large subnet.

-- Hal

> - Sean




