[Openib-windows] a race in IBAL while CQ destroying

Fabian Tillier ftillier at silverstorm.com
Mon May 15 13:29:13 PDT 2006


Hi Leo,

On 5/15/06, Leonid Keller <leonid at mellanox.co.il> wrote:
>
> Hi Fab,
>
> While running a test that creates and destroys many CQs from several
> threads, I get an assertion 'ref_cnt != 1' from ref_al_obj() in one thread
> while another thread is in the middle of destroy_cq. It looks like a race
> between the asynchronous CQ destroy and other code paths still using the CQ.
> I also see other failures, such as INVALID_CQ_HANDLE, which likewise point
> to use of a released CQ.
>
> The assertion 'ref_cnt != 1' suggests that this is an invalid state, but
> ref_al_obj() doesn't report that to the caller.
>
> The patch below makes ref_al_obj() return the ref_cnt and checks this value
> in __process_comp_cb() in order to skip handling of a CQ that is being
> destroyed.
> I guess there are more functions that need to check the result of
> ref_al_obj().

Thanks, I applied this with a minor change in 349 - the ASSERT will
only fire if the ref_cnt == 1 and the object type is not a CQ.  That
way, we still trap other issues.
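
For readers following along without the patch, here is a rough sketch of
the idea, not the actual IBAL code. The names obj_t, obj_type_t, ref_obj,
deref_obj and process_comp are hypothetical stand-ins, and the sketch
assumes that a count of 1 right after taking a reference means nobody else
held the object, i.e. it is on its way to being destroyed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the IBAL object types; not the real definitions. */
typedef enum { OBJ_TYPE_CQ, OBJ_TYPE_OTHER } obj_type_t;

typedef struct {
    obj_type_t type;
    atomic_int ref_cnt;   /* 0 here models "last reference released" */
} obj_t;

/* Take a reference and return the new count to the caller. */
static int ref_obj(obj_t *obj)
{
    int new_cnt = atomic_fetch_add(&obj->ref_cnt, 1) + 1;

    /* A count of 1 means we referenced an object nobody else holds.
     * Tolerate that for CQs (the caller checks the return value),
     * but keep trapping it for every other object type. */
    assert(new_cnt != 1 || obj->type == OBJ_TYPE_CQ);
    return new_cnt;
}

/* Drop a reference and return the new count. */
static int deref_obj(obj_t *obj)
{
    return atomic_fetch_sub(&obj->ref_cnt, 1) - 1;
}

/* Completion-callback sketch: true if handled, false if skipped. */
static bool process_comp(obj_t *cq)
{
    if (ref_obj(cq) == 1) {
        /* The CQ is being destroyed: undo our reference and skip it. */
        deref_obj(cq);
        return false;
    }

    /* ... the completion would be delivered to the user here ... */

    deref_obj(cq);
    return true;
}
```

With a live CQ (count of 1 or more before the call) process_comp() handles
the completion and leaves the count unchanged; with a CQ whose count has
already dropped to 0 it backs out and returns false. Whether the other
callers of ref_al_obj() need the same check is a separate question that
this sketch does not answer.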

- Fab


