[ofa-general] Nodes dropping out of IPoIB mcast group due to a temporary node soft lockup.

Hal Rosenstock hrosenstock at xsigo.com
Thu Apr 24 12:07:03 PDT 2008


On Thu, 2008-04-24 at 09:57 -0700, Ira Weiny wrote:

> > One side comment on the non-OpenSM aspect of this:
> > 
> > Why is the node temporarily unavailable? There is a "contract" that the
> > node makes with the SM that it clearly isn't honoring. Is any
> > investigation going on relative to this aspect of the issue?
> > 
> 
> Yes, we are working on finding the root cause.  I agree that the "contract" is
> not being honored.  This is one of the reasons I was hesitant to implement any
> fix to be submitted. 

I think the two issues can be tackled in parallel.

> I don't think this is truly a bug in the stack.

Any ideas on what it is? If not, would you be willing to try something,
assuming the end node issue is easily reproducible?

> However, I could see this causing issues for people[*] and it might be nice to
> have a "fix".

Sure; both are issues which should be understood better and fixed IMO.

-- Hal

> Ira
> 
> [*] Particularly those who do not have any other connection to nodes other than
> IB.



