[ofa-general] Nodes dropping out of IPoIB mcast group due to a temporary node soft lockup.
Hal Rosenstock
hrosenstock at xsigo.com
Thu Apr 24 12:07:03 PDT 2008
On Thu, 2008-04-24 at 09:57 -0700, Ira Weiny wrote:
> > One side comment on the non-OpenSM aspect of this:
> >
> > Why is the node temporarily unavailable? There is a "contract" that the
> > node makes with the SM that it clearly isn't honoring. Is any
> > investigation going on relative to this aspect of the issue?
> >
>
> Yes, we are working on finding the root cause. I agree that the "contract" is
> not being honored. This is one of the reasons I was hesitant to implement any
> fix to be submitted.
I think the two issues can be tackled in parallel.
> I don't think this is truly a bug in the stack.
Any ideas on what it is? If not, would you be willing to try something,
assuming the end node issue is easily reproducible?
> However, I could see this causing issues for people[*] and it might be nice to
> have a "fix".
Sure; both are issues which should be understood better and fixed IMO.
-- Hal
> Ira
>
> [*] Particularly those who do not have any other connection to nodes other than
> IB.