[openib-general] OpenSM crash

Hal Rosenstock halr at voltaire.com
Sat May 28 04:49:58 PDT 2005


On Fri, 2005-05-27 at 17:37, Hal Rosenstock wrote:
> On Fri, 2005-05-27 at 17:33, Roland Dreier wrote:
> >     > May 27 01:44:09 [43005960] -> osm_vl15_post: 4294967295 MADs on wire, 2 MADs outstanding.
> > 
> >     Hal> I take that back. That's just a lot of MADs have been sent
> >     Hal> (on the IB wire). OpenSM was probably up and running for a
> >     Hal> while...
> > 
> > I find it hard to believe that OpenSM has sent 4 billion MADs --
> > that's more than 1000 MADs a second for a solid month.  It also looks
> > very suspicious that the value is equal to ((unsigned int) -1).
>                                               ^^^^^^^^^^^^^^^^^^
> on a 32 bit machine.
> 
> Good point. The fact that it gets to -1 is significant as I think that
> is used as a magic value for some computations.

I'm pretty sure I see a way this counter could have gone negative in the
vendor layer. I'm working on a patch for it.
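
To illustrate the kind of wraparound under discussion, here is a minimal
sketch in C. It is not the OpenSM or vendor-layer code: the struct and the
function names (wire_stats_t, wire_dec_unguarded, wire_dec_guarded) are
hypothetical, and it assumes the counter is a 32-bit unsigned value that is
printed with %u. One extra decrement at zero is enough to produce the
4294967295 seen in the log, which is the same bit pattern as
((unsigned int) -1).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a "MADs on wire" counter.
 * This is NOT the actual OpenSM data structure. */
typedef struct {
    uint32_t mads_on_wire;
} wire_stats_t;

/* Unguarded decrement: if the counter is already 0, unsigned
 * arithmetic wraps it around to UINT32_MAX (4294967295). */
static uint32_t wire_dec_unguarded(wire_stats_t *s)
{
    s->mads_on_wire--;
    return s->mads_on_wire;
}

/* Guarded decrement: leaves the counter at 0 instead of wrapping.
 * A real fix might also log the unbalanced decrement. */
static uint32_t wire_dec_guarded(wire_stats_t *s)
{
    if (s->mads_on_wire > 0)
        s->mads_on_wire--;
    return s->mads_on_wire;
}

/* Prints the counter in the style of the log line quoted above. */
static void wire_report(const wire_stats_t *s)
{
    printf("%u MADs on wire\n", (unsigned int)s->mads_on_wire);
}
```

Calling wire_dec_unguarded() on a zeroed wire_stats_t and then
wire_report() would show the wrapped value, whereas wire_dec_guarded()
keeps the count at 0. Whether clamping or fixing the unbalanced
decrement at its source is the right remedy depends on where the extra
decrement actually comes from.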

-- Hal




