[ofa-general] [PATCH] ib/mad: fix incorrect access to items on local_list

Sean Hefty sean.hefty at intel.com
Fri Nov 30 09:59:50 PST 2007


In cancel_mads(), MADs are moved from the wait_list and local_list
to a cancel_list for processing.  However, the structures on these two
lists are not the same.  The wait_list references struct
ib_mad_send_wr_private, but local_list references struct
ib_mad_local_private.  cancel_mads() treats all items moved to the
cancel_list as struct ib_mad_send_wr_private.  This leads to a system
crash when requests are moved from the local_list to the cancel_list.

Fix this by leaving local_list alone.  All requests on the local_list
have completed and are just awaiting processing by a queued worker thread.

Bug (crash) reported by Dotan Barak <dotanb at dev.mellanox.co.il>.
Problem with local_list access reported by Robert Reynolds
<rreynolds at opengridcomputing.com>.

Signed-off-by: Sean Hefty <sean.hefty at intel.com>
---
This patch is untested.  Dotan, can you see if this fixes the crash that
you were seeing?
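
For anyone following along, below is a rough userspace sketch of the
mix-up described in the changelog.  It is not the code in mad.c: struct
send_wr, struct local_wr, splice_init() and the two helper functions are
hypothetical, simplified stand-ins, and the field layouts are assumptions
chosen only to put the list node at a different offset in each structure.
It shows that a list node carries no type information, so once both lists
are spliced onto one cancel list, code that assumes the send-request
layout resolves a local-completion entry to the wrong address.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's intrusive list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Move all of src to the front of dst and leave src empty. */
static void splice_init(struct list_head *src, struct list_head *dst)
{
	struct list_head *first = src->next, *last = src->prev, *at = dst->next;

	if (first == src)
		return;		/* nothing to move */
	first->prev = dst;
	dst->next = first;
	last->next = at;
	at->prev = last;
	list_init(src);
}

/* Hypothetical layouts: the list node sits at a different offset in each. */
struct send_wr  { long hdr[2]; struct list_head agent_list; int status; };
struct local_wr { struct list_head completion_list; int tag; };

/*
 * Bytes by which a container_of()-style lookup using the send_wr layout
 * would miss the start of a struct local_wr.
 */
static long misread_offset(void)
{
	return (long)offsetof(struct send_wr, agent_list) -
	       (long)offsetof(struct local_wr, completion_list);
}

/* Count entries on the cancel list that are not send requests. */
static int foreign_entries_after_cancel(int splice_local)
{
	struct list_head wait_list, local_list, cancel_list, *pos;
	struct send_wr a, b;
	struct local_wr l;
	int foreign = 0;

	list_init(&wait_list);
	list_init(&local_list);
	list_init(&cancel_list);
	list_add_tail(&a.agent_list, &wait_list);
	list_add_tail(&b.agent_list, &wait_list);
	list_add_tail(&l.completion_list, &local_list);

	splice_init(&wait_list, &cancel_list);
	if (splice_local)	/* the behaviour this patch removes */
		splice_init(&local_list, &cancel_list);

	for (pos = cancel_list.next; pos != &cancel_list; pos = pos->next)
		if (pos == &l.completion_list)
			foreign++;
	return foreign;
}
```

In this sketch foreign_entries_after_cancel(1) models the old behaviour,
where a local completion ends up on the cancel list, and
foreign_entries_after_cancel(0) models the patched behaviour, where only
wait_list is spliced.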

 drivers/infiniband/core/mad.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 91e62c3..7ef2c7c 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2284,8 +2284,6 @@ static void cancel_mads(struct ib_mad_agent_private *mad_agent_priv)
 
 	/* Empty wait list to prevent receives from finding a request */
 	list_splice_init(&mad_agent_priv->wait_list, &cancel_list);
-	/* Empty local completion list as well */
-	list_splice_init(&mad_agent_priv->local_list, &cancel_list);
 	spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
 
 	/* Report all cancelled requests */



