[openib-general] cm crash
Michael S. Tsirkin
mst at mellanox.co.il
Sun May 7 07:37:05 PDT 2006
Hello!
I have observed the crash below. Judging from the SDP messages, it seems to
happen when I call rdma_destroy_cm_id while a handler is running at the same
time and the CMA gets a non-zero return code from the callback.

My analysis of the failure:

cm_process_work does:

	cm_deref_id(cm_id_priv);
	if (ret)
		ib_destroy_cm_id(&cm_id_priv->id);

Assume that another thread has called ib_destroy_cm_id and is now waiting
here:

	wait_event(cm_id_priv->wait, !atomic_read(&cm_id_priv->refcount));
	while ((work = cm_dequeue_work(cm_id_priv)) != NULL)
		cm_free_work(work);
	kfree(cm_id_priv->compare_data);
	kfree(cm_id_priv->private_data);
	kfree(cm_id_priv);

Once the reference count reaches 0, this thread will wake.
We now have two threads running destroy on the same id!
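
The interleaving described above can be replayed in a small single-threaded
user-space model. This is only a sketch, not the ib_cm code: struct model_id
and every model_* function are hypothetical names. It also assumes that
destroy removes the id from the idr before it waits on the reference count,
and it ignores any reference held by the owner of the id. Nothing is really
freed; the model only counts how often each step would run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for cm_id_private; not the kernel structure. */
struct model_id {
	int refcount;          /* models atomic_t refcount */
	bool in_idr;           /* true while the id is registered in the idr */
	int idr_remove_errors; /* idr_remove calls that found the id unallocated */
	int free_count;        /* times the kfree() sequence would have run */
};

/* Models idr_remove(): complain if the id was already removed. */
static void model_idr_remove(struct model_id *id)
{
	if (!id->in_idr)
		id->idr_remove_errors++;
	id->in_idr = false;
}

/* Models the wait_event() condition in destroy. */
static bool model_destroy_may_finish(const struct model_id *id)
{
	return id->refcount == 0;
}

/* Models the tail of destroy: the kfree() calls after wait_event(). */
static void model_destroy_finish(struct model_id *id)
{
	assert(model_destroy_may_finish(id));
	id->free_count++;
}

/* Replays the interleaving from the analysis, one step at a time. */
static struct model_id model_race(void)
{
	/* The running handler holds the only reference in this model. */
	struct model_id id = { .refcount = 1, .in_idr = true };

	/* Thread A: calls destroy, removes the id, then sleeps in wait_event(). */
	model_idr_remove(&id);

	/* Thread B (cm_process_work): cm_deref_id() drops the last reference. */
	id.refcount--;

	/* Thread B: ret != 0, so it calls destroy on the same id. */
	model_idr_remove(&id);     /* id is already gone from the idr */
	model_destroy_finish(&id); /* refcount is 0, so B does not block */

	/* Thread A: wakes because refcount reached 0 and frees again. */
	model_destroy_finish(&id);

	return id;
}
```

In this model the second destroy hits an id that is no longer in the idr,
which corresponds to the "idr_remove called for id=1064 which is not
allocated" message, and the free sequence runs twice.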
----- Forwarded message from "Michael S. Tsirkin" <mst at mellanox.co.il> -----
Subject: crash
Date: Sun, 7 May 2006 16:37:43 +0300
From: "Michael S. Tsirkin" <mst at mellanox.co.il>
Reply-To: "Michael S. Tsirkin" <mst at mellanox.co.il>
sdp_sock(32769:0): socket is being torn down
idr_remove called for id=1064 which is not allocated.
sdp_sock(32769:0): socket is being torn down
idr_remove called for id=1064 which is not allocated.
Call Trace: <ffffffff802468c6>{idr_remove+278}
<ffffffff8805bd03>{:ib_cm:ib_destroy_cm_id+476}
<ffffffff80173b26>{cache_free_debugcheck+568}
<ffffffff8805bec5>{:ib_cm:cm_process_work+206}
<ffffffff8805d640>{:ib_cm:cm_work_handler+2732}
<ffffffff80141ad0>{run_workqueue+167}
<ffffffff8805cb94>{:ib_cm:cm_work_handler+0}
<ffffffff80141b88>{worker_thread+0}
<ffffffff80141c87>{worker_thread+255}
<ffffffff8012b68e>{default_wake_function+0}
<ffffffff8012b68e>{default_wake_function+0}
<ffffffff80141b88>{worker_thread+0}
<ffffffff80145674>{kthread+210} <ffffffff8010b7f6>{child_rip+8}
<ffffffff801455a2>{kthread+0} <ffffffff8010b7ee>{child_rip+0}
Slab corruption: start=ffff81017a99ab10, len=512
--
What item should I pick to always win in rock, scissors, paper?
----- End forwarded message -----
--
MST