[openib-general] [PATCH] Missing check for atomic_dec in ib_post_send_mad

Krishna Kumar krkumar at us.ibm.com
Mon Nov 1 16:38:03 PST 2004


Hi Sean,

I think it is reasonable to have current senders racing with unregister.
The unregister is waiting for all references to drop to zero before
freeing up the resources. It killed the ones waiting for responses
(mad_cancel), killed the ones who are executing in callback handlers,
and finally after dropping the loader's module refcnt, it waits for the
refcnt to drop to zero. These can only be threads which are actively
receiving mad packets and those threads in the process of sending mad
packets while the unregister was going on (and the ones which fail is
the only cause of the problem). Essentially I think the unregister will
hang and not free up the resource.

Thanks,

- KK

On Mon, 1 Nov 2004, Sean Hefty wrote:

> On Mon, 1 Nov 2004 16:12:18 -0800 (PST)
> Krishna Kumar <krkumar at us.ibm.com> wrote:
>
> > I believe the recent changes to catch all atomic_dec races with
> > unregister failed to catch one spot in ib_post_send_mad. This routine
> > increments mad_agent_priv->refcnt, and while unregister can run, if
> > the ib_send_mad() fails, we drop the refcnt without checking if the
> > refcnt has dropped to zero. The unregister would block indefinitely
> > waiting to be woken up. I think the rest of the atomic_dec's looks
> > good though.
>
> I looked at this area of the code, and my thought was that we cannot
> handle a client that tries to send a MAD at the same time that they
> unregister.  So, I think that a simple atomic_dec should be okay.  If a
> client is calling unregister in a separate thread, then they are
> essentially trying to send a MAD after unregistering, in which case our
> data structures have been freed.
>
> - Sean
>
>
> >  			*bad_send_wr = cur_send_wr;
> > -			atomic_dec(&mad_agent_priv->refcount);
> > +			if (atomic_dec_and_test(&mad_agent_priv->refcount))
> > +				wake_up(&mad_agent_priv->wait);
> >  			return ret;
> >  		}
>
>




More information about the general mailing list