[ewg] Re: Possible process deadlock in RMPP flow

Sean Hefty sean.hefty at intel.com
Mon Oct 19 13:30:47 PDT 2009


>>> Thanks Or. This one is already in OFED 1.4.2 but apparently this is a
>>> different problem. Once I have information whether the patch Roland
>>> posted fixed it I will update the list.
>> Eli, did you find a commit that fixes the problem you reported on?
>>
>> Or.
>>
>>
>Not yet :-(

I can't find anything off in the code for this.  It's odd, since
unregister_mad_agent() does:

        flush_workqueue(port_priv->wq);
        ib_cancel_rmpp_recvs(mad_agent_priv);

and ib_cancel_rmpp_recvs() does:

        spin_lock_irqsave(&agent->lock, flags);
        list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
                cancel_delayed_work(&rmpp_recv->timeout_work);
                cancel_delayed_work(&rmpp_recv->cleanup_work);
        }
        spin_unlock_irqrestore(&agent->lock, flags);

        flush_workqueue(agent->qp_info->port_priv->wq);

which basically just flushes the same work queue a second time.
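
To make the ordering above easier to reason about, here is a small userspace
sketch of it.  It is only a toy model, not kernel code: every toy_* name is
hypothetical, the "work queue" is a plain array, and it assumes that a cancel
only succeeds while the item's timer is still pending, with the flush picking
up whatever was already queued.  The race_idx parameter is an assumption added
purely to simulate a timer firing between the first flush and the cancel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for a toy delayed-work item (not kernel types). */
enum toy_state { TOY_IDLE, TOY_TIMER_PENDING, TOY_QUEUED };

struct toy_delayed_work {
    enum toy_state state;
    int runs;               /* how many times the handler has executed */
};

/* Timer expiry: the item moves from the timer onto the queue. */
static void toy_timer_fire(struct toy_delayed_work *w)
{
    if (w->state == TOY_TIMER_PENDING)
        w->state = TOY_QUEUED;
}

/* Toy cancel: succeeds only while the timer is still pending. */
static bool toy_cancel_delayed_work(struct toy_delayed_work *w)
{
    if (w->state != TOY_TIMER_PENDING)
        return false;
    w->state = TOY_IDLE;
    return true;
}

/* Toy flush: runs every item already queued, returns how many ran. */
static int toy_flush_workqueue(struct toy_delayed_work *works, size_t n)
{
    size_t i;
    int ran = 0;

    for (i = 0; i < n; i++) {
        if (works[i].state == TOY_QUEUED) {
            works[i].runs++;
            works[i].state = TOY_IDLE;
            ran++;
        }
    }
    return ran;
}

/*
 * Same order as the quoted code: flush, cancel each item, flush again.
 * race_idx >= 0 simulates that item's timer firing after the first flush
 * (an assumption for illustration); pass -1 for no race.
 * Returns the number of items that are still not idle afterwards.
 */
static int toy_unregister(struct toy_delayed_work *works, size_t n,
                          int race_idx)
{
    size_t i;
    int busy = 0;

    assert(works != NULL);

    toy_flush_workqueue(works, n);              /* first flush */

    if (race_idx >= 0 && (size_t)race_idx < n)
        toy_timer_fire(&works[race_idx]);       /* simulated race */

    for (i = 0; i < n; i++)
        toy_cancel_delayed_work(&works[i]);     /* may fail if queued */

    toy_flush_workqueue(works, n);              /* second flush */

    for (i = 0; i < n; i++)
        if (works[i].state != TOY_IDLE)
            busy++;
    return busy;
}
```

In this model every item ends up idle whether or not the simulated race
happens: an item whose cancel fails because it was already queued is run by
the second flush.  The model is single-threaded, so it says nothing about
locking or about a flush issued from inside a work handler.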

I haven't been able to reproduce the problem, but I'm running the latest kernel,
and I'm not sure whether that matters in this case.  Does ibnetdiscover just hang
forever at the end of the test when this occurs?  Is there any more information
available?

- Sean 



