[ofa-general] [BUG report / PATCH] fix race in the core multicast management
Or Gerlitz
ogerlitz at voltaire.com
Tue Sep 25 06:00:12 PDT 2007
Sean Hefty wrote:
>>> node 1 <-> switch A <-> switch B <-> switch C <-> SA
>> The host would only see port up/down events as of changes in the link
>> state in the local port or in the port which is connected to it through
>> the cable.
> So, if you brought the link down/up between switches A & B, node 1
> wouldn't receive any events, but it would be removed from the multicast
> group?
good catch!
Indeed, when the link between switches A and B goes down, from the point
of view of the SM the whole sub-fabric across A is lost, and hence the
node is dropped from all the multicast groups it has joined.
However, from the point of view of the node, no port down is experienced.
When the A-B link comes back up, the SM rediscovers all nodes across A and
probes their ports. Through this process a port active event --might-- be
generated by the HCA FW, but I am not sure it's mandatory.
The only trigger for ipoib to rejoin multicast groups is delivery of an
event by the hw driver, namely one of: port down/up, lid change, sm lid
change, client re-register. So I think we might have a hole here if none
of these events is generated.
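
To make the hole concrete, here is a minimal C sketch of the scenario. It
is a toy model, not ipoib code: the enum, the struct and all function names
are hypothetical and only mirror the event list above. It assumes that any
of the listed events leads to a successful rejoin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event names mirroring the list in the text;
 * this is NOT the kernel's event enum. */
enum port_event {
    EV_NONE,              /* hw driver delivered no event at all */
    EV_PORT_DOWN_UP,      /* local port went down and came back up */
    EV_LID_CHANGE,
    EV_SM_LID_CHANGE,
    EV_CLIENT_REREGISTER
};

/* Toy view of one node's multicast membership. */
struct node_state {
    bool sm_has_membership;   /* what the SM has recorded */
    bool node_thinks_joined;  /* what the node believes */
};

/* True for the events that, per the text, make ipoib rejoin. */
static bool event_triggers_rejoin(enum port_event ev)
{
    switch (ev) {
    case EV_PORT_DOWN_UP:
    case EV_LID_CHANGE:
    case EV_SM_LID_CHANGE:
    case EV_CLIENT_REREGISTER:
        return true;
    default:
        return false;
    }
}

/* A-B link goes down: the SM drops the node, the node sees nothing. */
static void sm_drops_node(struct node_state *n)
{
    assert(n != NULL);
    n->sm_has_membership = false;
}

/* Node side: a rejoin happens only when a triggering event arrives
 * (assumed here to always succeed). */
static void node_handle_event(struct node_state *n, enum port_event ev)
{
    assert(n != NULL);
    if (event_triggers_rejoin(ev)) {
        n->sm_has_membership = true;
        n->node_thinks_joined = true;
    }
}

/* The hole: the node believes it is joined, the SM disagrees. */
static bool membership_is_stale(const struct node_state *n)
{
    assert(n != NULL);
    return n->node_thinks_joined && !n->sm_has_membership;
}
```

In this model, calling sm_drops_node() and then node_handle_event() with
EV_NONE leaves membership_is_stale() true, which is the case described
above; any of the four listed events clears it.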
Please note that during this discovery, at least one MAD is sent from
the SM to the node. If we require the SM to set the re-register bit
--each-- time it discovers a node, then the bug is solved.
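
The proposed SM policy can be sketched the same way. Again this is only a
toy model under stated assumptions: struct sm_mad, struct node and both
functions are hypothetical, the MAD is reduced to the single re-register
bit, and a rejoin is assumed to succeed whenever the node sees that bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the MAD the SM sends during discovery,
 * reduced to the one bit that matters here. */
struct sm_mad {
    bool reregister_bit;
};

/* Toy view of a node as tracked by this model. */
struct node {
    bool sm_has_membership;  /* SM-side multicast membership record */
    unsigned int rejoins;    /* how many times the node rejoined */
};

/* SM side: the node is rediscovered after its sub-fabric was lost.
 * Its membership is gone; the policy flag decides whether the
 * re-register bit is set in the MAD sent to it. */
static struct sm_mad sm_rediscover(struct node *n, bool reregister_on_discovery)
{
    struct sm_mad mad;

    assert(n != NULL);
    n->sm_has_membership = false;
    mad.reregister_bit = reregister_on_discovery;
    return mad;
}

/* Node side: a set re-register bit is treated as a client re-register
 * event, which makes the node rejoin its multicast groups. */
static void node_receive_mad(struct node *n, const struct sm_mad *mad)
{
    assert(n != NULL && mad != NULL);
    if (mad->reregister_bit) {
        n->sm_has_membership = true;
        n->rejoins++;
    }
}
```

With reregister_on_discovery false the node stays dropped after
rediscovery; with it true, the same MAD that the SM sends anyway is
enough to restore the membership.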
I will test this scheme and let you know what I get (with the voltaire
SM and mthca driver).
Eitan, Michael - any insight on the matter?
Or.