[openib-general] Re: [PATCH 3/3] iWARP CM

Thu Mar 16 00:39:38 PST 2006

>On Wed, 2006-03-15 at 16:34 -0800, Sean Hefty wrote:
>> Tom Tucker wrote:
>> > +static inline void iwcm_deref_id(struct iwcm_id_private *cm_id_priv)
>> > +{
>> > +	if (atomic_dec_and_test(&cm_id_priv->refcount))
>> > +		kfree(cm_id_priv);
>> > +}
>>
>> I'm wary of code that does this.  This can typically result in a race
>> condition where the user can receive a callback after destroy returns.
>
>Yep. But the alternative is a deadlock in the event thread (the event
>that removes the last reference is behind the event that blocks trying
>to destroy the id). This happens when both peers try to disconnect
>concurrently.

The IB CM has to deal with peer disconnects as well.  If you can deadlock
because a user calls destroy from the event thread, then we need to disallow
that.  (As a general rule, a user can never call destroy on an object while in a
callback associated with that object.)  This is the reason why the IB CM and CMA
serialize all events to a single cm_id and allow returning a non-zero value from
a callback to signal destruction of the cm_id.
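
To make the return-value convention concrete, here is a minimal user-space
sketch.  It is not the IB CM or CMA code: every name in it (demo_cm_id,
demo_dispatch, demo_user_handler, and so on) is hypothetical, and a plain
function call stands in for the serialized event thread.  The point is only
that the dispatcher, not the handler, frees the id, and only after the handler
has returned.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event types, for illustration only. */
enum demo_event { DEMO_EVENT_ESTABLISHED, DEMO_EVENT_DISCONNECT };

struct demo_cm_id;
typedef int (*demo_handler)(struct demo_cm_id *id, enum demo_event event);

/* Hypothetical id; not the real struct ib_cm_id or iw_cm_id. */
struct demo_cm_id {
	demo_handler handler;
	void *context;
};

/* Counts ids freed by the dispatcher, so the behavior can be observed. */
static int demo_ids_freed;

static struct demo_cm_id *demo_create_id(demo_handler handler, void *context)
{
	struct demo_cm_id *id = calloc(1, sizeof(*id));

	if (id) {
		id->handler = handler;
		id->context = context;
	}
	return id;
}

/*
 * Deliver one event.  Events for a given id are assumed to arrive here one
 * at a time.  A non-zero return from the handler asks for the id to be
 * destroyed; the free happens here, after the handler has returned.
 * Returns 1 if the id was destroyed, 0 otherwise.
 */
static int demo_dispatch(struct demo_cm_id *id, enum demo_event event)
{
	assert(id && id->handler);
	if (id->handler(id, event)) {
		free(id);
		demo_ids_freed++;
		return 1;
	}
	return 0;
}

/* Example handler: counts events and requests destruction on disconnect. */
static int demo_user_handler(struct demo_cm_id *id, enum demo_event event)
{
	int *events_seen = id->context;

	(*events_seen)++;
	return event == DEMO_EVENT_DISCONNECT;
}
```

With this shape the handler never calls destroy on its own id.  It returns
non-zero, and the caller must not touch the id again once demo_dispatch()
reports that it was destroyed.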

>There is an iWARP provider requirement that CLOSE is absolutely the last
>event...I was wary too, so to test my implementation I ran 6 threads (3
>IB, 3 iWARP) on two dual CPU boxes.

Based on a code review, I'm fairly certain that this issue exists and needs to
be fixed.  A client has the potential to receive a callback after destroy has
returned, which can lead to accessing an invalid context.
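
One common way to close this window is to have destroy drop its own reference
and then block until every other reference is gone, so that the last deref
only wakes the destroyer instead of freeing the object.  The sketch below
illustrates that idea in user space, with a pthread mutex and condition
variable standing in for a kernel completion.  All names (demo_id, demo_ref,
demo_deref, demo_destroy) are hypothetical, and this is not a proposed patch
for the iWARP CM.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical id; not the real struct iwcm_id_private. */
struct demo_id {
	pthread_mutex_t lock;
	pthread_cond_t released;	/* signalled when refcount hits zero */
	int refcount;
};

static struct demo_id *demo_create(void)
{
	struct demo_id *id = calloc(1, sizeof(*id));

	if (!id)
		return NULL;
	pthread_mutex_init(&id->lock, NULL);
	pthread_cond_init(&id->released, NULL);
	id->refcount = 1;	/* reference held by the owner */
	return id;
}

/* Taken by the event path before it invokes a user callback. */
static void demo_ref(struct demo_id *id)
{
	pthread_mutex_lock(&id->lock);
	id->refcount++;
	pthread_mutex_unlock(&id->lock);
}

/* Dropped by the event path after the callback returns.  Never frees. */
static void demo_deref(struct demo_id *id)
{
	pthread_mutex_lock(&id->lock);
	assert(id->refcount > 0);
	if (--id->refcount == 0)
		pthread_cond_signal(&id->released);
	pthread_mutex_unlock(&id->lock);
}

/*
 * Drop the owner's reference, wait until all event-path references are
 * gone, then free.  When this returns, no callback can still be running.
 * It must not be called from a callback on the same id: it would wait
 * forever on the reference held for that callback.
 */
static void demo_destroy(struct demo_id *id)
{
	pthread_mutex_lock(&id->lock);
	assert(id->refcount > 0);
	id->refcount--;
	while (id->refcount > 0)
		pthread_cond_wait(&id->released, &id->lock);
	pthread_mutex_unlock(&id->lock);

	pthread_cond_destroy(&id->released);
	pthread_mutex_destroy(&id->lock);
	free(id);
}
```

The sketch leaves out one piece a real implementation would need: once
destroy has started, the event path must be prevented from taking new
references, for example by checking a state flag under the same lock.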

- Sean