[ofa-general] [BUG report / PATCH] fix race in the core multicast management

Tue Sep 25 06:00:12 PDT 2007

Sean Hefty wrote:
>>> node 1 <-> switch A <-> switch B <-> switch C <-> SA

>> The host would only see port up/down events as of changes in the link
>> state in the local port or in the port which is connected to it through
>> the cable.

> So, if you brought the link down/up between switches A & B, node 1
> wouldn't receive any events, but it would be removed from the multicast
> group?

good catch!

Indeed, when the link between switches A and B goes down, from the SM's
point of view the whole sub-fabric across A is lost, and hence the node
is dropped from all the multicast groups it has joined.

However, from the node's point of view, no port down event occurs.

When the A-B link comes back up, the SM discovers all nodes across A and
probes their ports. Through this process a port active event --might-- be
generated by the HCA FW, but I am not sure it's mandatory.

The only trigger for ipoib to rejoin multicast groups is delivery of an
event by the hw driver, namely one of: port down/up, lid change, sm lid
change, client re-register. So I think we might have a hole here if none
of these events is generated.
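
To make the hole concrete, here is a toy model of the rejoin trigger
described above. It is only a sketch, not the ipoib code: the event names
are illustrative labels I made up for this example, not the kernel's
identifiers.

```python
# Toy model of the rejoin trigger; names are hypothetical, not kernel symbols.
REJOIN_EVENTS = frozenset({
    "port_down",
    "port_up",
    "lid_change",
    "sm_lid_change",
    "client_reregister",
})


def should_rejoin(events_seen):
    """Return True if any delivered event is one that triggers a rejoin."""
    return any(ev in REJOIN_EVENTS for ev in events_seen)


# A-B link bounce: node 1's local link never changes, so in the worst case
# the hw driver delivers no event at all.
print(should_rejoin([]))                     # False
print(should_rejoin(["client_reregister"]))  # True
```

With an empty event list nothing ever tells the node to rejoin, which is
exactly the case in question.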

Please note that during this discovery, at least one MAD is sent from
the SM to the node. If we force the SM to set the re-register bit
--each-- time it discovers a node, then the bug is solved.
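
The proposed scheme can be sketched as a small simulation of the A-B link
bounce. This is a toy model under stated assumptions: the class and
function names are hypothetical (they are not OpenSM, Voltaire SM or ipoib
APIs), and it assumes a node always rejoins when it sees the re-register
bit.

```python
# Toy model; class and method names are hypothetical, not real SM/ipoib APIs.
class ToySM:
    def __init__(self, force_reregister):
        self.force_reregister = force_reregister
        self.members = set()  # nodes the SM has in the multicast group

    def lose_subfabric(self, nodes):
        # A-B link down: everything across A is dropped from the group.
        self.members -= set(nodes)

    def discover(self, node):
        # A-B link up: the SM sends at least one MAD to each node it finds.
        # Returns True if that MAD carries the re-register bit.
        return self.force_reregister


def bounce_link(force_reregister):
    """Return True if node1 is a group member again after the A-B bounce."""
    sm = ToySM(force_reregister)
    sm.members.add("node1")       # initial join
    sm.lose_subfabric(["node1"])  # node1 itself sees no port down
    if sm.discover("node1"):      # re-register bit => node rejoins
        sm.members.add("node1")
    return "node1" in sm.members


print(bounce_link(False))  # False: no event, node1 stays out of the group
print(bounce_link(True))   # True: client re-register triggers the rejoin
```

The model ignores the possible port active event from the HCA FW, since
the point is the case where no such event is generated.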

I will test this scheme and let you know what I get (with the voltaire 
SM and mthca driver).

Eitan, Michael - any insight on the matter?

Or.