[ofa-general] [PATCH/RFC] IB/mad: Fix lock-lock-timer deadlock in RMPP code (was: [NEW PATCH] IB/mad: Fix possible lock-lock-timer deadlock)

Roland Dreier rdreier at cisco.com
Wed Sep 9 13:42:21 PDT 2009


Holding agent->lock across cancel_delayed_work() (which does
del_timer_sync()) in ib_cancel_rmpp_recvs() leads to lockdep reports of
possible lock-timer deadlocks if a consumer ever does something that
connects agent->lock to a lock taken in IRQ context (cf
http://marc.info/?l=linux-rdma&m=125243699026045).

However, it seems this locking is not necessary here, since the locking
did not prevent the rmpp_list from having an item added immediately
after the lock is dropped -- so there must be sufficient synchronization
protecting the rmpp_list without the locking here.  Therefore, we can
fix the lockdep issue by simply deleting the locking.


Hal/Sean, does this look right to you?
---
 drivers/infiniband/core/mad_rmpp.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 57a3c6f..865c109 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -85,12 +85,10 @@ void ib_cancel_rmpp_recvs(struct ib_mad_agent_private *agent)
 	struct mad_rmpp_recv *rmpp_recv, *temp_rmpp_recv;
 	unsigned long flags;
 
-	spin_lock_irqsave(&agent->lock, flags);
 	list_for_each_entry(rmpp_recv, &agent->rmpp_list, list) {
 		cancel_delayed_work(&rmpp_recv->timeout_work);
 		cancel_delayed_work(&rmpp_recv->cleanup_work);
 	}
-	spin_unlock_irqrestore(&agent->lock, flags);
 
 	flush_workqueue(agent->qp_info->port_priv->wq);
 



More information about the general mailing list