[PATCH] opensm/opensm/osm_lid_mgr.c: set "send_set" when setting rereg bit (Was: Re: [ofa-general] Nodes dropping out of IPoIB mcast group due to a temporary node soft lockup.)

Or Gerlitz ogerlitz at voltaire.com
Sun Apr 27 01:47:54 PDT 2008


Ira Weiny wrote:
>
> I did not get any output with multicast_debug_level!  
Why should you? From the node's point of view nothing has happened.
(The exact parameter name is mcast_debug_level.)
>
> Here is a patch which fixes the problem.  (At least with the partial sub-nets
> configuration I explained before.)  I will have to verify this fixes the problem
> I originally reported.
OK, good. Does this problem exist in the released OpenSM? If yes, what
would trigger the SM to "really discover" (i.e., do a PortInfo SET on)
this sub-fabric, and how long would it take to reach this trigger in the
worst case?

The failure configuration you set up to reproduce the problem is very
atypical, I think. Under common topologies (Clos etc.) which don't
have a 1:n blocking nature, failure of such a link would cause the SM to
re-route, which would not (and should not) be noticed by the nodes (I
hope I am not falling into another problem here...).

Or.





