[openib-general] Re: [PATCH 3/3] iWARP CM

Thu Mar 16 07:26:40 PST 2006

On Thu, 2006-03-16 at 00:39 -0800, Sean Hefty wrote:
> >On Wed, 2006-03-15 at 16:34 -0800, Sean Hefty wrote:
> >> Tom Tucker wrote:
> >> > +static inline void iwcm_deref_id(struct iwcm_id_private *cm_id_priv)
> >> > +{
> >> > +	if (atomic_dec_and_test(&cm_id_priv->refcount))
> >> > +		kfree(cm_id_priv);
> >> > +}
> >>
> >> I'm wary of code that does this.  This can typically result in a race
> >condition
> >> where the user can receive a callback after destroy returns.
> >
> >Yep. But the alternative is a deadlock in the event thread (the event
> >that removes the last reference is behind the event that blocks trying
> >to destroy the id). This happens because when both peers try to
> >disconnect concurrently.
> 
> The IB CM has to deal with peer disconnects as well.  If you can deadlock
> because a user calls destroy from the event thread, then we need to disallow
> that.  (As a general rule, a user can never call destroy on an object while in a
> callback associated with that object.)  This is the reason why the IB CM and CMA
> serialize all events to a single cm_id and allow returning a non-zero value from
> a callback to signal destruction of the cm_id.

The iWARP CM prevents this from happening by having a state (DESTROYING)
that prevents events from being delivered after iw_destroy_cm_id has
returned. This approach avoids having either the kernel application
thread or the event thread blocked while waiting for the last reference
to be released. 

Unlike the IB side, the iWARP side has orderly shutdown semantics that
can delay the delivery of the CLOSE event for minutes. With this
implementation, life goes on and the object simply stays around until
the last reference is removed. 

This implementation is consistent with the way many other network
objects are managed (e.g. sk_buff).

> 
> >There is an iWARP provider requirement that CLOSE is absolutely the last
> >event...I was wary too, so to test my implementation I ran 6 threads 3
> >IB, 3 iWARP on two dual CPU boxes.
> 
> Based on a code review, I'm fairly certain that this issue exists and needs to
> be fixed.  A client has the potential to receive a callback after destroy has
> returned, which can lead to accessing an invalid context.
> 
Please look at the handling of events in cm_event_handler. If the state
is DESTROYING, events are not queued for delivery. This handles events
that are generated by the provider after iw_destroy_cm_id has returned. 

Then please look at cm_work_handler, the check for DESTROYING state
handles events that are already queued for delivery, but were not yet
delivered at the time iw_destroy_cm_id returned. 

I think these checks avoid the condition you are concerned about.

> - Sean
>