[PATCH] opensm/opensm/osm_lid_mgr.c: set "send_set" when setting rereg bit (Was: Re: [ofa-general] Nodes dropping out of IPoIB mcast group due to a temporary node soft lockup.)
ogerlitz at voltaire.com
Sun Apr 27 01:47:54 PDT 2008
Ira Weiny wrote:
> I did not get any output with multicast_debug_level!
Why should you? From the node's point of view, nothing has happened.
(The exact parameter name is mcast_debug_level.)
> Here is a patch which fixes the problem. (At least with the partial sub-nets
> configuration I explained before.) I will have to verify this fixes the problem
> I originally reported.
OK, good. Does this problem exist in the released OpenSM? If yes, what
would be the trigger for the SM to "really discover" (i.e., do a PortInfo
SET on) this sub-fabric, and how long would it take to reach this
trigger in the worst case?
The failure configuration you set up to reproduce the problem is very
atypical, I think. Under common topologies (Clos etc.) which don't
have a 1:n blocking nature, failure of such a link would cause the SM to
re-route, which would not (and should not) be noticed by the nodes (I
hope I am not falling into another problem here...).