[openib-general] Re: [PATCH] Opensm - exiting issues

Hal Rosenstock halr at voltaire.com
Mon Nov 7 06:20:56 PST 2005


Hi Yael,

On Mon, 2005-11-07 at 08:25, Yael Kalka wrote:
> Hi Hal,
>
> There was a problem when running opensm with the -o option that caused
> opensm to always exit with a segfault, due to object destruction
> ordering. There is also the known issue with exiting opensm. We've
> done some cleanup of the exit code. The following patch fixes most
> of it.

I applied this part of the patch with some cosmetic changes in
osm_vendor_ibumad.c.

> In the current code we saw that opensm sometimes gets "stuck" on exit
> and causes the machine to get stuck too, so that a reboot is
> needed. The following patch fixes most of it.
> With the patch we did run into rare cases where opensm exits with an
> error, but at least it exits without hanging the machine...

Is there a reliable way to recreate the machine getting "stuck"? What
exactly do you mean by this?

All umad_unregister does is some validation, a table lookup, and then
issue the ioctl to unregister the MAD agent. Not explicitly unregistering
the agent(s) does not cause any harm: when the fd is closed, the
unregistration occurs as part of the cleanup.
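
To make the point concrete, here is a toy model of that behavior. It is
only a sketch and not libibumad or kernel code: every toy_* name, the
struct layout, and the table size are hypothetical, chosen just to show
that an explicit unregister is a validation plus a table lookup, and that
closing the port releases whatever agents are still registered.

```c
/* Toy model only. All toy_* names are hypothetical, not a real API. */
#include <stdbool.h>

#define TOY_MAX_AGENTS 32   /* assumed table size, for illustration */

struct toy_port {
    bool open;                        /* stands in for the open umad fd */
    bool registered[TOY_MAX_AGENTS];  /* per-fd table of MAD agents */
};

/* Open the port with an empty agent table. */
void toy_open(struct toy_port *p)
{
    p->open = true;
    for (int i = 0; i < TOY_MAX_AGENTS; i++)
        p->registered[i] = false;
}

/* Register an agent in the first free slot; return its id, or -1. */
int toy_register(struct toy_port *p)
{
    if (!p->open)
        return -1;
    for (int i = 0; i < TOY_MAX_AGENTS; i++) {
        if (!p->registered[i]) {
            p->registered[i] = true;
            return i;
        }
    }
    return -1;
}

/* Explicit unregister: validation, table lookup, then the "ioctl". */
int toy_unregister(struct toy_port *p, int agent_id)
{
    if (!p->open || agent_id < 0 || agent_id >= TOY_MAX_AGENTS)
        return -1;                    /* validation */
    if (!p->registered[agent_id])
        return -1;                    /* table lookup found no agent */
    p->registered[agent_id] = false;  /* stands in for the unregister ioctl */
    return 0;
}

/* Count the agents currently registered on the port. */
int toy_count(const struct toy_port *p)
{
    int n = 0;
    for (int i = 0; i < TOY_MAX_AGENTS; i++)
        if (p->registered[i])
            n++;
    return n;
}

/* Close the port: release every agent still registered, as the cleanup
 * on fd close would. Returns how many agents were released implicitly. */
int toy_close(struct toy_port *p)
{
    int released = toy_count(p);
    for (int i = 0; i < TOY_MAX_AGENTS; i++)
        p->registered[i] = false;
    p->open = false;
    return released;
}
```

In this model, registering two agents, unregistering only the first, and
then calling toy_close() releases the second one implicitly and leaves the
table empty, which is why skipping the explicit unregister is harmless.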

-- Hal





