[ewg] Re: Possible process deadlock in RMPP flow
Eli Cohen
eli at dev.mellanox.co.il
Tue Oct 20 00:48:59 PDT 2009
On Mon, Oct 19, 2009 at 01:30:47PM -0700, Sean Hefty wrote:
>
> I can't find anything off in the code for this. It's odd, since
> unregister_mad_agent() does:
>
>     flush_workqueue(port_priv->wq);
>     ib_cancel_rmpp_recvs(mad_agent_priv);
>
> and ib_cancel_rmpp_recvs() does:
>
>     spin_lock_irqsave(&agent->lock, flags);
>     list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
>         cancel_delayed_work(&rmpp_recv->timeout_work);
>         cancel_delayed_work(&rmpp_recv->cleanup_work);
>     }
>     spin_unlock_irqrestore(&agent->lock, flags);
>
>     flush_workqueue(agent->qp_info->port_priv->wq);
>
> which basically just flushes the same work queue.
>
> I haven't been able to reproduce the problem, but I'm running the latest kernel
> - not sure that matters in this case. Does ibnetdiscover just hang forever at
> the end of the test when this occurs? Is there any more information available?
>
We are checking whether the problem is a firmware bug; it looks like it is.
Once we verify this, I will send an update.