[openib-general] RE: cm crash
Sean Hefty
sean.hefty at intel.com
Sun May 7 19:57:43 PDT 2006
>cm_process_work does:
>
>	cm_deref_id(cm_id_priv);
>	if (ret)
>		ib_destroy_cm_id(&cm_id_priv->id);
>
>assume that another thread calls ib_destroy_cm_id.
>Now
>
>	wait_event(cm_id_priv->wait, !atomic_read(&cm_id_priv->refcount));
>	while ((work = cm_dequeue_work(cm_id_priv)) != NULL)
>		cm_free_work(work);
>	kfree(cm_id_priv->compare_data);
>	kfree(cm_id_priv->private_data);
>	kfree(cm_id_priv);
>
>once the reference count reaches 0, this thread will wake.
>We now have two threads running destroy on the same id!
This is a user error: the cm_id is being destroyed twice. A user must not both
call ib_destroy_cm_id() and return non-zero from a callback on that same ID,
since either one alone destroys it. We cannot fix this in the CM. If the thread
calling ib_destroy_cm_id() is delayed, the callback handler returns and cleanup
occurs. The thread calling ib_destroy_cm_id() then references freed memory.
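
One way a user could follow this rule is to decide, exactly once, which path
destroys the ID. The sketch below is only a userspace illustration, not CM
code: every toy_* name is hypothetical, toy_destroy_cm_id() merely counts calls
in place of ib_destroy_cm_id(), and toy_process_work() imitates the
cm_process_work() lines quoted above. It assumes the claim flag lives in memory
the user owns and that outlives the cm_id.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CM ID; it only counts destroy requests. */
struct toy_cm_id {
	int destroy_calls;
};

/* Hypothetical user context that owns the "who destroys" decision. */
struct toy_conn {
	struct toy_cm_id id;
	atomic_flag destroy_claimed;
};

static void toy_conn_init(struct toy_conn *conn)
{
	conn->id.destroy_calls = 0;
	atomic_flag_clear(&conn->destroy_claimed);
}

/* Stand-in for ib_destroy_cm_id(); the real call waits and frees the ID. */
static void toy_destroy_cm_id(struct toy_cm_id *id)
{
	id->destroy_calls++;
}

/* True for exactly one caller, however many paths race to get here. */
static bool toy_claim_destroy(struct toy_conn *conn)
{
	return !atomic_flag_test_and_set(&conn->destroy_claimed);
}

/* Callback side: return non-zero only if this path won the claim. */
static int toy_cm_handler(struct toy_conn *conn)
{
	return toy_claim_destroy(conn) ? 1 : 0;
}

/* Teardown side: destroy only if the callback has not claimed it. */
static void toy_teardown(struct toy_conn *conn)
{
	if (toy_claim_destroy(conn))
		toy_destroy_cm_id(&conn->id);
}

/* Imitates the quoted cm_process_work() tail: destroy if the handler asks. */
static void toy_process_work(struct toy_conn *conn)
{
	if (toy_cm_handler(conn))
		toy_destroy_cm_id(&conn->id);
}
```

Whichever of toy_process_work() and toy_teardown() runs first, the destroy
count ends at one. In real code the path that loses the claim must also stop
touching the ID, since the other path may already have freed it.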
- Sean