[ofa-general] Re: [PATCH 1/2] opensm: avoid LASH use-after-free when switch is deleted from fabric.

Sasha Khapyorsky sashak at voltaire.com
Fri Sep 25 06:52:56 PDT 2009


Hi Jim,

On 14:29 Tue 22 Sep     , Jim Schutt wrote:
> 
> I'm working on another routing engine

That is interesting. What will be the key features of this new routing
engine?

> that also uses osm_switch_t:priv
> to point to data that persists between calls to the routing engine, as
> LASH does.  And like LASH my objects have a pointer to the corresponding
> osm_switch_t.
> 
> Since trying to implement this engine was my first experience with
> opensm, it checks these links before using them by making sure 
> that the osm_switch_t my object references points back to my
> object, because I'm paranoid.  And my engine doesn't overwrite
> pointers it expects to be NULL, because I'm really paranoid.
> 
> So under circumstances where I had two routing engines configured,
> if my engine failed over to LASH because of some problem caused by
> downing a switch in the fabric, then routing reverted back to my 
> engine when the problem cleared up, non-NULL osm_switch_t:priv
> values would keep my engine from working.
> 
> So I came up with this priv_release() business to provide a general
> way for the opensm core to clean up after unexpected behavior
> of a routing engine.

I think that such "debug-only" things are absolutely fine in
development, but don't add much in production run-time
(the exception could be some extremely complex flows, which would be
better avoided for other reasons :)). Also in some cases such extra
validations may hide bugs.
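
For reference, the kind of check being discussed could look roughly like
the sketch below. It is only an illustration: struct mock_switch stands in
for osm_switch_t, and engine_switch, engine_attach(), engine_link_ok() and
mock_priv_release() are hypothetical names, not code from opensm, LASH, or
the patch under review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for osm_switch_t; only the priv slot matters here. */
struct mock_switch {
	void *priv;	/* opaque per-engine data, like osm_switch_t:priv */
};

/* Hypothetical per-switch object owned by a routing engine. */
struct engine_switch {
	struct mock_switch *p_sw;	/* back-pointer to the core's switch */
};

/* Attach only when the slot is empty; never overwrite a non-NULL priv. */
static bool engine_attach(struct engine_switch *esw, struct mock_switch *sw)
{
	assert(esw != NULL && sw != NULL);
	if (sw->priv != NULL)
		return false;	/* slot still holds someone else's data */
	esw->p_sw = sw;
	sw->priv = esw;
	return true;
}

/* Trust the link only when both pointers agree with each other. */
static bool engine_link_ok(const struct engine_switch *esw)
{
	return esw != NULL && esw->p_sw != NULL && esw->p_sw->priv == esw;
}

/*
 * Assumed cleanup hook in the spirit of priv_release(): the core clears
 * the slot so that the next routing engine starts from NULL.
 */
static void mock_priv_release(struct mock_switch *sw)
{
	sw->priv = NULL;
}
```

With checks like these, an engine that finds priv left non-NULL by a
previous engine refuses to attach until something clears the slot, which
is the situation described in the quoted text above.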

> In the event you remove the 'p_sw->priv = NULL' line as the fix
> to the use-after-free issue,

I guess that is the simplest way to fix this for now.

> and I get my routing engine into 
> good enough shape to submit, should I resubmit this patch too,

If it is necessary for your code.

> or should I be less paranoid and remove the extra checks in
> my engine?

If these are only "debug" cases, I would suggest cleaning this up after
the code stabilizes. But let's see then...

Sasha


