[ofa-general] Both opensm's are in SMINFO_STANDBY and none of them claims master
    Venkatesh Babu 
    venkatesh.babu at 3leafnetworks.com
       
    Tue May 22 17:01:32 PDT 2007
    
    
  
Hal Rosenstock wrote:
>The one I see that might be related is the following:
>
>commit 39798695b4bcc7b145f8910ca56195808d3a7637
>Author: Roland Dreier <rolandd at cisco.com>
>Date:   Mon Nov 13 09:38:07 2006 -0800
>
>    IB/mad: Fix race between cancel and receive completion
>    
>    When ib_cancel_mad() is called, it puts the canceled send on a list
>    and schedules a "flushed" callback from process context.  However,
>    this leaves a window where a receive completion could be processed
>    before the send is fully flushed.
>    
>    This is fine, except that ib_find_send_mad() will find the MAD and
>    return it to the receive processing, which results in the sender
>    getting both a successful receive and a "flushed" send completion for
>    the same request.  Understandably, this confuses the sender, which is
>    expecting only one of these two callbacks, and leads to grief such as
>    a use-after-free in IPoIB.
>    
>    Fix this by changing ib_find_send_mad() to return a send struct only
>    if the status is still successful (and not "flushed").  The search of
>    the send_list already had this check, so this patch just adds the same
>    check to the search of the wait_list.
>    
>    Signed-off-by: Roland Dreier <rolandd at cisco.com>
>
>My search was not exhaustive.
>  
>
  It looks like this may be the fix for the MAD send errors. Do you 
think this race is also the reason neither opensm takes over mastership 
from the other?
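
For reference, the check described in the commit message can be modeled roughly as below. This is only a simplified user-space sketch of my reading of the commit text, not the kernel code: the types `send_wr` and `wc_status` and the functions `search_list()` and `find_send()` are hypothetical stand-ins for the real MAD structures and ib_find_send_mad().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's completion status values. */
enum wc_status { WC_SUCCESS = 0, WC_FLUSH_ERR = 1 };

/* Hypothetical, stripped-down model of a pending MAD send. */
struct send_wr {
    uint64_t tid;           /* transaction ID used to match a response */
    enum wc_status status;  /* WC_FLUSH_ERR once the send was canceled */
    struct send_wr *next;
};

/* Return the send matching tid, but only if it has not been canceled. */
static struct send_wr *search_list(struct send_wr *head, uint64_t tid)
{
    for (struct send_wr *wr = head; wr != NULL; wr = wr->next) {
        if (wr->tid == tid)
            return wr->status == WC_SUCCESS ? wr : NULL;
    }
    return NULL;
}

/*
 * Model of the fixed lookup: the same "still successful" check is
 * applied to the wait list and to the send list, so a canceled send
 * is never handed back to the receive path.
 */
struct send_wr *find_send(struct send_wr *wait_list,
                          struct send_wr *send_list, uint64_t tid)
{
    struct send_wr *wr = search_list(wait_list, tid);
    if (wr != NULL)
        return wr;
    return search_list(send_list, tid);
}
```

With this check in place, a response that arrives for a canceled request finds no matching send, so the sender sees only the "flushed" completion.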
>
>Are they incrementing ? Which node is this ? I think some of them would
>increment on node reboot.
>  
>
  Looks like some counters (symbol errors, link downed) have reached 
their maximum value.
This output was captured on node vortex3l-83, the one that runs opensm.
Do you want the perfquery output before and after some time interval?
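
A before/after comparison could be classified along these lines. This is a rough sketch under my own assumptions: the limits assume the 16-bit symbol error and 8-bit link downed fields of the PortCounters attribute, and `classify_counter()` and the enum are hypothetical names, not part of perfquery.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths: symbol errors 16 bits, link downed 8 bits. */
#define SYMBOL_ERR_MAX  UINT16_MAX
#define LINK_DOWNED_MAX UINT8_MAX

enum counter_state {
    COUNTER_IDLE,          /* value unchanged between the two samples */
    COUNTER_INCREMENTING,  /* value grew between the two samples */
    COUNTER_SATURATED      /* value is pegged at the field maximum */
};

/*
 * Compare two samples of one counter taken some interval apart.
 * A saturated counter cannot show further growth, so it has to be
 * reset before it says anything about current errors.
 */
enum counter_state classify_counter(uint32_t before, uint32_t after,
                                    uint32_t max)
{
    if (after >= max)
        return COUNTER_SATURATED;
    if (after > before)
        return COUNTER_INCREMENTING;
    return COUNTER_IDLE;
}
```

A counter that comes back as saturated in both samples would need to be cleared first and then sampled again to see whether it is still incrementing.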
 VBabu
    
    