[openib-general] cm crash

Michael S. Tsirkin mst at mellanox.co.il
Sun May 7 07:37:05 PDT 2006


Hello!
I have observed the crash below. Judging by the SDP messages, it happens
when I call rdma_destroy_cm_id while a handler is still running and the
CMA gets a non-zero return code from the callback.

My analysis of the failure:

cm_process_work does:

        cm_deref_id(cm_id_priv);
        if (ret)
                ib_destroy_cm_id(&cm_id_priv->id);

Assume that another thread calls ib_destroy_cm_id at the same time.
That thread will be waiting here:

        wait_event(cm_id_priv->wait, !atomic_read(&cm_id_priv->refcount));
        while ((work = cm_dequeue_work(cm_id_priv)) != NULL)
                cm_free_work(work);
        kfree(cm_id_priv->compare_data);
        kfree(cm_id_priv->private_data);
        kfree(cm_id_priv);

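Once cm_deref_id drops the reference count to 0, this thread wakes up
and goes on to free the id. Meanwhile, since ret is non-zero,
cm_process_work calls ib_destroy_cm_id as well.
We now have two threads running destroy on the same id!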
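
To make the interleaving concrete, here is a rough userspace sketch that
replays the sequence above in a fixed order. It is not the ib_cm code:
struct model_id, model_destroy_tail and model_replay are hypothetical
names made up for illustration, the two "threads" are simply replayed
sequentially, and the refcount bookkeeping is simplified to a single
reference held by the handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a cm id; NOT the real struct cm_id_private. */
struct model_id {
        int refcount;         /* stands in for cm_id_priv->refcount */
        int destroy_calls;    /* how many times the destroy tail ran */
        bool freed;           /* set by the first "kfree(cm_id_priv)" */
        bool use_after_free;  /* set if destroy runs on a freed id */
};

/* Tail of the destroy path, reached once refcount is seen as 0. */
static void model_destroy_tail(struct model_id *id)
{
        if (id->freed)
                id->use_after_free = true;  /* second destroyer */
        id->freed = true;
        id->destroy_calls++;
}

/* Replays the interleaving described above, in a fixed order. */
static struct model_id model_replay(int callback_ret)
{
        /* Assumption: the running handler holds the only reference. */
        struct model_id id = { .refcount = 1 };

        /* Thread B: calls destroy and blocks, since refcount != 0. */
        bool b_waiting = (id.refcount != 0);

        /* Thread A (cm_process_work): cm_deref_id(cm_id_priv); */
        id.refcount--;
        assert(id.refcount >= 0);

        /* Thread B: woken because refcount reached 0, frees the id. */
        if (b_waiting && id.refcount == 0)
                model_destroy_tail(&id);

        /* Thread A: if (ret) ib_destroy_cm_id(&cm_id_priv->id); */
        if (callback_ret)
                model_destroy_tail(&id);

        return id;
}
```

With a non-zero callback return, model_replay(1) runs the destroy tail
twice and flags the second run as touching a freed id, which would match
the double idr_remove and the slab corruption in the log below. With
model_replay(0) only the waiting thread destroys the id.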



----- Forwarded message from "Michael S. Tsirkin" <mst at mellanox.co.il> -----

Subject: crash
Date: Sun, 7 May 2006 16:37:43 +0300
From: "Michael S. Tsirkin" <mst at mellanox.co.il>
Reply-To: "Michael S. Tsirkin" <mst at mellanox.co.il>

sdp_sock(32769:0): socket is being torn down
idr_remove called for id=1064 which is not allocated.

sdp_sock(32769:0): socket is being torn down
idr_remove called for id=1064 which is not allocated.

Call Trace: <ffffffff802468c6>{idr_remove+278}
<ffffffff8805bd03>{:ib_cm:ib_destroy_cm_id+476}
       <ffffffff80173b26>{cache_free_debugcheck+568}
<ffffffff8805bec5>{:ib_cm:cm_process_work+206}
       <ffffffff8805d640>{:ib_cm:cm_work_handler+2732}
<ffffffff80141ad0>{run_workqueue+167}
       <ffffffff8805cb94>{:ib_cm:cm_work_handler+0}
<ffffffff80141b88>{worker_thread+0}
       <ffffffff80141c87>{worker_thread+255}
<ffffffff8012b68e>{default_wake_function+0}
       <ffffffff8012b68e>{default_wake_function+0}
<ffffffff80141b88>{worker_thread+0}
       <ffffffff80145674>{kthread+210} <ffffffff8010b7f6>{child_rip+8}
       <ffffffff801455a2>{kthread+0} <ffffffff8010b7ee>{child_rip+0}
Slab corruption: start=ffff81017a99ab10, len=512

-- 
What item should I pick to always win in rock, scissors, paper?

----- End forwarded message -----

-- 
MST


