[openib-general] CMA oops

Michael S. Tsirkin mst at mellanox.co.il
Wed Aug 30 13:06:04 PDT 2006


Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [openib-general] CMA oops
> 
> Michael S. Tsirkin wrote:
> > Apparently, list->prev pointer in CMA id_priv structure is NULL
> > which causes a crash in list_del.
> > 
> > I note that rdma_destroy_id tests id_priv->cma_dev outside the mutex lock.
> > Could that be the problem?
> > Unfortunately, the problem is not easily reproducible.
> 
> I think I see one bug, but it doesn't seem like it's causing the crash that you saw.
> 
> It's possible that address resolution can complete at the same time that 
> rdma_destroy_id() is called.  The addr_handler() will cause the rdma_cm_id to 
> attach to a device while destroy is running, which can come after the check for 
> id_priv->cma_dev is made.  The result is that destroy will not detach from the 
> device, leaving the rdma_cm_id in the device list after its destruction.
> 
> I'm trying to come up with a fix for this, but I'm not convinced it's the 
> problem that you're seeing.

It could be that what you describe leads to memory corruption.
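
To make the interleaving described above concrete, here is a minimal
userspace sketch in C. It is not the CMA code: struct sketch_dev,
struct sketch_id and the sketch_* and stale_entries_* functions are
hypothetical names invented for illustration, and the list helpers are
simplified stand-ins for the kernel's list_head routines. The two threads
are replaced by a fixed ordering of steps so the outcome is deterministic.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;
    entry->prev = NULL;
}

static int list_len(const struct list_head *head)
{
    const struct list_head *p;
    int n = 0;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Hypothetical, heavily reduced models of the device and the id. */
struct sketch_dev { struct list_head id_list; };
struct sketch_id  { struct sketch_dev *cma_dev; struct list_head list; };

/* What the address handler does in this model: attach id to device. */
static void sketch_attach(struct sketch_id *id, struct sketch_dev *dev)
{
    id->cma_dev = dev;
    list_add_tail(&id->list, &dev->id_list);
}

static void sketch_detach(struct sketch_id *id)
{
    list_del(&id->list);
    id->cma_dev = NULL;
}

/* Racy ordering: destroy checks, the handler attaches, destroy continues. */
int stale_entries_racy(void)
{
    struct sketch_dev dev;
    struct sketch_id id = { NULL, { NULL, NULL } };
    int attached;

    list_init(&dev.id_list);
    attached = (id.cma_dev != NULL); /* destroy: unlocked check, not attached */
    sketch_attach(&id, &dev);        /* handler: attach lands after the check */
    if (attached)                    /* destroy: stale result, detach skipped */
        sketch_detach(&id);
    return list_len(&dev.id_list);   /* the id is still on the device list */
}

/* Serialized ordering: check and detach happen as one step after attach. */
int stale_entries_serialized(void)
{
    struct sketch_dev dev;
    struct sketch_id id = { NULL, { NULL, NULL } };

    list_init(&dev.id_list);
    sketch_attach(&id, &dev);
    if (id.cma_dev != NULL)
        sketch_detach(&id);
    return list_len(&dev.id_list);
}
```

In this model the racy ordering leaves one entry on the device list while
the serialized ordering leaves none. If the id were then freed, the device
list would hold a pointer into freed memory, which is one way a later list
operation could end up walking corrupted prev/next pointers.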

-- 
MST



