[ofa-general] Two element types on cancel_list in cancel_mads()?

Hal Rosenstock hrosenstock at xsigo.com
Sat Dec 1 06:12:26 PST 2007


On Thu, 2007-11-29 at 16:43 -0800, Sean Hefty wrote:
> > In cancel_mads, elements from two different lists are added to the 
> > cancel_list:  wait_list and local_list.  Subsequent processing of the 
> > cancel_list treats all elements as struct ib_mad_send_wr_private, and 
> > uses the send_buf field of that structure.  But it appears to me that 
> > the items from local_list are actually of type struct 
> > ib_mad_local_private, and hence the reference to send_buf for these 
> > elements is incorrect.  Can you help me understand how this works?
> 
> I was looking at the local_list handling in cancel_mads() and the rest 
> of mad code myself.  Hal knows this part of the code better than I do, 
> maybe he can look here and see if there's a definite problem.  This 
> looks like the cause of the bug Dotan just reported.

Sorry for the slow response. I've been consumed with other matters for
the last couple of days.

I started investigating this and found that the local_list handling in
cancel_mads() was first introduced over 2 years ago by the following
commit:

commit 2c153b934dca08d58e0aafde18a182e0891aa201
Author: Hal Rosenstock <halr at voltaire.com>
Date:   Wed Jul 27 11:45:31 2005 -0700

    [PATCH] IB: Eliminate MAD cache leak associated with local completions
    
    Eliminate MAD cache leak associated with local completions.  Also, when
    canceling MAD, empty local completion list as well.
    
    Signed-off-by: Hal Rosenstock <halr at voltaire.com>
    Cc: Roland Dreier <rolandd at cisco.com>
    Signed-off-by: Andrew Morton <akpm at osdl.org>
    Signed-off-by: Linus Torvalds <torvalds at osdl.org>
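
To make the concern raised at the top of the thread concrete, here is a
minimal userspace sketch. It is not the kernel code: the struct layouts
below are cut-down, hypothetical stand-ins for ib_mad_send_wr_private and
ib_mad_local_private, and the helper names (wr_from_agent_link,
roundtrip_ok, misread_offset) are invented for illustration. It only
shows why recovering the containing struct from a list link is safe when
the link really lives in that struct, and where a walker would end up
looking if the link actually belongs to the other type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical, cut-down layouts; the real structs have more fields. */
struct send_buf { void *mad; int timeout_ms; };

struct send_wr_private {              /* stands in for ib_mad_send_wr_private */
    struct list_head mad_list;
    struct list_head agent_list;      /* link used while on wait_list */
    void *agent;
    struct send_buf send_buf;
};

struct local_private {                /* stands in for ib_mad_local_private */
    struct list_head completion_list; /* link used while on local_list */
    void *mad_priv;
    void *recv_agent;
    struct send_wr_private *mad_send_wr;
};

/* Same idea as the kernel's container_of(): step back from member to struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Well-defined case: the link really is embedded in a send_wr_private. */
struct send_wr_private *wr_from_agent_link(struct list_head *link)
{
    assert(link != NULL);
    return container_of(link, struct send_wr_private, agent_list);
}

/* Returns 1 if container_of() recovers the original object from its link. */
int roundtrip_ok(void)
{
    static struct send_wr_private wr; /* zero-initialized, only addresses used */
    return wr_from_agent_link(&wr.agent_list) == &wr;
}

/*
 * Byte offset, measured from the start of a local_private, at which a
 * walker that assumes send_wr_private would look for send_buf.  The
 * local_private type has no send_buf at all, so whatever sits at this
 * offset (or beyond the end of the object) is not what the walker expects.
 * Nothing is dereferenced here; the offset is computed, not read.
 */
long misread_offset(void)
{
    return (long)offsetof(struct send_wr_private, send_buf)
         - (long)offsetof(struct send_wr_private, agent_list)
         + (long)offsetof(struct local_private, completion_list);
}
```

Calling roundtrip_ok() from a small main() should return 1, and comparing
misread_offset() against sizeof(struct local_private) shows which bytes a
mistyped walk would touch on a given platform. The exact numbers depend
on pointer size and padding, so none are quoted here.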

More later...

-- Hal

> - Sean


