[openib-general] Re: [PATCH 3/3] iWARP CM
Tom Tucker
tom at opengridcomputing.com
Thu Mar 16 12:05:34 PST 2006
On Thu, 2006-03-16 at 09:12 -0800, Sean Hefty wrote:
> Tom Tucker wrote:
> > The iWARP CM prevents this from happening by having a state (DESTROYING)
> > that prevents events from being delivered after iw_destroy_cm_id has
> > returned. This approach avoids having either the kernel application
> > thread or the event thread blocked while waiting for the last reference
> > to be released.
>
> You need to consider destroy being called by a separate thread from the one
> processing events. An event can be generated, queued, and just about to call
> back into the user when the user calls destroy. Picture the event thread at
> the top of the user's callback routine. There's no way to halt the execution
> of the callback at this point. Now let the thread calling destroy execute and
> return to the user. The callback code is still running, but the user is not
> even aware of it at this point.
>
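To make sure I follow the scenario, the interleaving would look something like
this (the thread labels are purely illustrative):

    event thread                          application thread
    ------------                          ------------------
    event generated and queued
    worker dequeues it, is just about
    to invoke the user's event handler
                                          iw_destroy_cm_id() runs and returns
                                          user tears down its own context
    event handler runs, touching state
    the user has already torn down
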
> > Unlike the IB side, the iWARP side has orderly shutdown semantics that
> > can delay the delivery of the CLOSE event for minutes. With this
> > implementation, life goes on and the object simply stays around until
> > the last reference is removed.
>
> Even in IB, there's a CM object that hangs around after the user has called
> destroy and destroy has returned. This is fine; the user is unaware of this object.
>
> > Please look at the handling of events in cm_event_handler. If the state
> > is DESTROYING, events are not queued for delivery. This handles events
> > that are generated by the provider after iw_destroy_cm_id has returned.
>
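For reference, the check being described is the one visible in the last two
hunks of the patch below; stripped of everything around it, it is just:

	/* Ignore scheduled events on a destroyed connection */
	if (cm_id_priv->id.state != IW_CM_STATE_DESTROYING) {
		/* ... dispatch the event to the user's handler ... */
	}
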
> The problem is when the user calls destroy at the same time that an event is
> being generated. If the event gets there first, a callback will run. Destroy
> does not wait for that callback to complete before returning. Hopefully, I've
> explained the situation a little better.
>
You explained it fine the first time...I just missed the subtlety of the
race you were referring to. Here is a patch that I believe closes this
race.
--- iwcm.c.orig	2006-03-16 13:49:22.000000000 -0600
+++ iwcm.c	2006-03-16 13:46:50.000000000 -0600
@@ -53,10 +53,15 @@
 struct iwcm_id_private {
 	struct iw_cm_id id;
+	unsigned long flags;
+	wait_queue_head_t destroy_wait;
+
 	spinlock_t lock;
 	atomic_t refcount;
 };
+#define IWCM_FLAGS_DESTROY_WAIT 1
+
 struct iwcm_work {
 	struct work_struct work;
 	struct iwcm_id_private *cm_id;
@@ -92,6 +97,7 @@
 	spin_lock_init(&iwcm_id_priv->lock);
 	atomic_set(&iwcm_id_priv->refcount, 1);
+	init_waitqueue_head(&iwcm_id_priv->destroy_wait);
 	return &iwcm_id_priv->id;
@@ -222,6 +228,9 @@
 		cm_id->state = IW_CM_STATE_DESTROYING;
 		spin_unlock_irqrestore(&iwcm_id_priv->lock, flags);
 	}
+	wait_event(iwcm_id_priv->destroy_wait,
+		   !test_bit(IWCM_FLAGS_DESTROY_WAIT, &iwcm_id_priv->flags));
+
 	iwcm_deref_id(iwcm_id_priv);
 }
 EXPORT_SYMBOL(iw_destroy_cm_id);
@@ -550,6 +559,7 @@
 	/* Ignore scheduled events on a destroyed connection */
 	if (cm_id_priv->id.state != IW_CM_STATE_DESTROYING) {
+		set_bit(IWCM_FLAGS_DESTROY_WAIT, &cm_id_priv->flags);
 		switch (work->event.event) {
 		case IW_CM_EVENT_CONNECT_REQUEST:
 			cm_conn_req_handler(work);
@@ -569,6 +579,9 @@
 		default:
 			BUG_ON(1);
 		}
+		clear_bit(IWCM_FLAGS_DESTROY_WAIT, &cm_id_priv->flags);
+		if (waitqueue_active(&cm_id_priv->destroy_wait))
+			wake_up(&cm_id_priv->destroy_wait);
 	}
 	kfree(work);
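The idea, in case the two hunks above are hard to read together: the event
thread marks the cm_id busy for the duration of the upcall, and
iw_destroy_cm_id blocks until that mark clears, so destroy can no longer
return while a callback is still running. As a rough sketch of the two paths
with the patch applied (everything not shown in the hunks is elided):

	/* event thread, in the dispatch path shown above */
	if (cm_id_priv->id.state != IW_CM_STATE_DESTROYING) {
		set_bit(IWCM_FLAGS_DESTROY_WAIT, &cm_id_priv->flags);
		/* ... invoke the user's handler for work->event ... */
		clear_bit(IWCM_FLAGS_DESTROY_WAIT, &cm_id_priv->flags);
		if (waitqueue_active(&cm_id_priv->destroy_wait))
			wake_up(&cm_id_priv->destroy_wait);
	}

	/* application thread, at the tail of iw_destroy_cm_id */
	wait_event(iwcm_id_priv->destroy_wait,
		   !test_bit(IWCM_FLAGS_DESTROY_WAIT, &iwcm_id_priv->flags));
	iwcm_deref_id(iwcm_id_priv);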
> - Sean