[openib-general] CMA oops
mshefty at ichips.intel.com
Wed Aug 30 12:17:58 PDT 2006
Michael S. Tsirkin wrote:
> Apparently, list->prev pointer in CMA id_priv structure is NULL
> which causes a crash in list_del.
> I note that rdma_destroy_id tests outside the mutex lock.
> Could that be the problem?
> The problem is not unfortunately easily reproducible.
I think I see one bug, but it doesn't seem like it's causing the crash that you saw.
It's possible for address resolution to complete at the same time that
rdma_destroy_id() is called. The addr_handler() can then attach the rdma_cm_id
to a device while destroy is running, after destroy has already checked
id_priv->cma_dev. The result is that destroy will not detach from the
device, leaving the rdma_cm_id in the device list after its destruction.
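To make the window concrete, here is a rough userspace sketch that replays the interleaving described above in a fixed order. It is not the CMA code and not a proposed patch: sim_dev, sim_id, sim_lock, the destroying flag and the helper functions are all made-up stand-ins, and the "guarded" variant is just one assumed way such a window could be closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical stand-ins for the CMA structures; not the real kernel types. */
struct sim_dev {
	int attached;          /* number of ids currently on the device list */
};

struct sim_id {
	struct sim_dev *dev;   /* plays the role of id_priv->cma_dev */
	bool destroying;       /* assumed flag: set once destroy has started */
};

static pthread_mutex_t sim_lock = PTHREAD_MUTEX_INITIALIZER;

/* Address resolution completing: attach the id to a device. */
static void sim_addr_handler(struct sim_id *id, struct sim_dev *dev,
			     bool check_state)
{
	assert(dev != NULL);
	pthread_mutex_lock(&sim_lock);
	if (!check_state || !id->destroying) {
		id->dev = dev;
		dev->attached++;
	}
	pthread_mutex_unlock(&sim_lock);
}

/* Detach from the device, checking the pointer under the lock. */
static void sim_detach(struct sim_id *id)
{
	pthread_mutex_lock(&sim_lock);
	if (id->dev) {
		id->dev->attached--;
		id->dev = NULL;
	}
	pthread_mutex_unlock(&sim_lock);
}

/* Racy ordering: returns how many ids remain on the device list. */
static int replay_racy(void)
{
	struct sim_dev dev = { 0 };
	struct sim_id id = { NULL, false };

	bool need_detach = (id.dev != NULL);  /* destroy: unlocked check sees NULL */
	sim_addr_handler(&id, &dev, false);   /* handler attaches in the window */
	if (need_detach)                      /* destroy: acts on the stale result */
		sim_detach(&id);
	return dev.attached;
}

/* Guarded ordering: destroy marks the id first, handler honors the mark. */
static int replay_guarded(void)
{
	struct sim_dev dev = { 0 };
	struct sim_id id = { NULL, false };

	pthread_mutex_lock(&sim_lock);
	id.destroying = true;
	pthread_mutex_unlock(&sim_lock);
	sim_addr_handler(&id, &dev, true);    /* sees destroying, skips the attach */
	sim_detach(&id);                      /* re-checks id.dev under the lock */
	return dev.attached;
}
```

With this ordering, replay_racy() leaves one id on the device list after "destroy" has finished, while replay_guarded() leaves none. The real code may well need a different approach; this only shows why a check made outside the lock can go stale.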
I'm trying to come up with a fix for this, but I'm not convinced it's the
problem that you're seeing.