[ofa-general] [BUG report / PATCH] fix race in the core multicast management

Tue Sep 25 12:21:31 PDT 2007

On Tue, 2007-09-25 at 21:21 +0200, Sasha Khapyorsky wrote:
> On 06:20 Tue 25 Sep     , Hal Rosenstock wrote:
> > On Tue, 2007-09-25 at 15:00 +0200, Or Gerlitz wrote:
> > > Sean Hefty wrote:
> > > >>> node 1 <-> switch A <-> switch B <-> switch C <-> SA
> > > 
> > > >> The host would only see port up/down events as a result of changes
> > > >> in the link state of the local port or of the port connected to it
> > > >> through the cable.
> > > 
> > > > So, if you brought the link down/up between switches A & B, node 1
> > > > wouldn't receive any events, but it would be removed from the multicast
> > > > group?
> > > 
> > > good catch!
> > > 
> > > Indeed, when the link between switches A and B goes down, from the
> > > viewpoint of the SM the whole sub-fabric across A is lost, and hence
> > > the node is dropped from all the multicast groups it has joined.
> > 
> > No, it is not (dropped from all multicast groups it is joined to). It
> > may be removed from the multicast forwarding tables if there is no route
> > available but it is still a member of the group.
> 
> I cannot see it. In the normal flow OpenSM gets a trap when the switch
> ports disconnect; this triggers a heavy sweep, and the whole A
> sub-fabric is dropped right after the discovery phase (including
> multicast groups; this is done in __osm_drop_mgr_remove_port()).

I was talking "theory"/spec rather than OpenSM. There are a number of
ways to handle this.

> > > However, from the view point of the node, no port down is experienced.
> > > 
> > > When the A-B link goes up, the SM discovers all nodes across A and
> > > probes their ports. Through this process a port active event --might--
> > > be generated by the HCA FW, but I am not sure it's mandatory.
> > > 
> > > The only trigger for ipoib to rejoin multicast groups is delivery of
> > > an event by the hw driver, namely one of: port down/up, lid change,
> > > sm lid change, client re-register. So I think we might have a hole
> > > here if none of these events is generated.
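The rejoin condition described above can be sketched as a small predicate. This is only an illustration of the reasoning, not the ipoib code: the enum names and both helper functions are hypothetical, and the set of triggering events is taken directly from the list quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes; they mirror the events named in the thread,
 * not the identifiers used by any real driver. */
enum port_event {
    EV_PORT_DOWN,
    EV_PORT_UP,
    EV_LID_CHANGE,
    EV_SM_LID_CHANGE,
    EV_CLIENT_REREGISTER,
    EV_OTHER              /* anything else the driver might report */
};

/* Returns true if this single event is one of the listed rejoin triggers. */
static bool event_triggers_rejoin(enum port_event ev)
{
    switch (ev) {
    case EV_PORT_DOWN:
    case EV_PORT_UP:
    case EV_LID_CHANGE:
    case EV_SM_LID_CHANGE:
    case EV_CLIENT_REREGISTER:
        return true;
    default:
        return false;
    }
}

/* Returns true if at least one of the n delivered events triggers a rejoin.
 * With n == 0 (no event delivered to the node) the result is false. */
static bool any_event_triggers_rejoin(const enum port_event *evs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (event_triggers_rejoin(evs[i]))
            return true;
    }
    return false;
}
```

In the A-B link flap scenario, node 1 may see an empty event list, so any_event_triggers_rejoin() returns false and no rejoin is attempted. That is the hole being described.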
> 
> OpenSM will request client reregistration for all ports in the A
> sub-fabric when it is connected back and discovered again.

Other SMs may be capable of dealing with this with less "drastic"
measures than client reregistration.

-- Hal

> Sasha
> 
> > 
> > It doesn't need to rejoin for this case. See above explanation.
> > 
> > -- Hal
> > 
> > > Please note that through this discovery, at least one MAD is sent from
> > > the SM to the node. If we force the SM to set the re-register bit
> > > --each-- time it discovers a node, then the bug is solved.
> > > 
> > > I will test this scheme and let you know what I get (with the voltaire 
> > > SM and mthca driver).
> > > 
> > > Eitan, Michael - any insight on the matter?
> > > 
> > > Or.
> > > 
> > > 
> > > _______________________________________________
> > > general mailing list
> > > general at lists.openfabrics.org
> > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > 
> > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general