[ofa-general] [PATCH] ipoib: fix rtnl deadlock

Yossi Etigin yosefe at Voltaire.COM
Mon Aug 4 11:04:50 PDT 2008


Roland Dreier wrote:
>  > ipoib_stop is called with rtnl_lock, and flushes ipoib_workqueue.
>  > the flush operation might wait for mcast_join_task to finish, which
>  > in turn might wait for rtnl_lock.
> 
> when did we introduce this bug?

http://www.openfabrics.org/git/?p=ofed_1_4/linux-2.6.git;a=commit;h=529024117628d0037644a20b4870c61d63cea2a1
 
> 
>  > +		/* Avoid deadlock with ipoib_stop */
>  > +		while (!(ret = rtnl_trylock()) &&
>  > +		       test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
>  > +			yield();
>  > +
>  > +		if (ret) {
>  > +			dev_set_mtu(dev, min(priv->mcast_mtu, priv->admin_mtu));
>  > +			rtnl_unlock();
>  > +		} else
>  > +			ipoib_dbg_mcast(priv, "ignoring mtu setup because device is down\n");
> 
> this is rather horrible looking... is there any way we can avoid the
> loop on trylock?
> 

We can just give up if we can't get the lock, as is done in
drivers/net/cxgb3/cxgb3_main.c.
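
Roughly, the give-up variant would look like the sketch below. This is a
userspace analogy only, not the driver change: a pthread mutex stands in
for the rtnl lock, and fake_rtnl, struct fake_dev, min_uint() and
try_update_mtu() are made-up names for illustration. Note that
pthread_mutex_trylock() returns 0 on success, while rtnl_trylock()
returns nonzero on success.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the rtnl lock; this is not the kernel API. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical device state, loosely mirroring mcast_mtu / admin_mtu. */
struct fake_dev {
	unsigned int mtu;
	unsigned int mcast_mtu;
	unsigned int admin_mtu;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Try exactly once to take the lock. If it is busy, skip the MTU update
 * instead of looping on trylock. Returns true if the MTU was updated.
 */
static bool try_update_mtu(struct fake_dev *dev)
{
	/* 0 means success here, the opposite convention of rtnl_trylock(). */
	if (pthread_mutex_trylock(&fake_rtnl) != 0) {
		fprintf(stderr, "skipping mtu setup: lock is busy\n");
		return false;
	}

	dev->mtu = min_uint(dev->mcast_mtu, dev->admin_mtu);
	pthread_mutex_unlock(&fake_rtnl);
	return true;
}
```

The obvious cost is that the MTU update is simply dropped whenever the
lock happens to be held, whether or not the device is going down.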

Other solutions might get messy, because we have no control over when
the lock is actually taken, so we can't set any flags and such.
The alternatives would be to flush the queue sometime later, or to set
the MTU sometime later on another workqueue.

>  - R.

-- 
--Yossi


