[ofa-general] [PATCH 0/2] IB: Improve recovery from SM change events after takeover

Moni Shoua monis at Voltaire.COM
Sun May 18 05:25:24 PDT 2008


The patches below improve the recovery of the IPoIB driver from
a failure of the SM and takeover by another SM. The purpose is
to minimize the time that two IPoIB hosts remain disconnected
after an SM takeover event.

Here is an example observed in our tests.
One IPoIB host (client) sends a stream of multicast packets to another IPoIB host (server).
An SM takeover event takes place during traffic; as a result, multicast info is flushed
and the hosts need to rejoin. Without the patches there is a chance (in our experience,
a very high one) that the rejoin request is sent to the old SM, and the join completes successfully only after a retry.
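
The failure mode described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not the driver code and not
necessarily what the patches do: the names Port, rejoin_time and
refresh_before_join are hypothetical, the 1.0 second timeout is an arbitrary
placeholder, and the model assumes the host learns the new SM address after a
single timed-out request.

```python
from dataclasses import dataclass


@dataclass
class Port:
    # SM address (LID) the host currently believes is valid (hypothetical field)
    cached_sm_lid: int


def rejoin_time(port, live_sm_lid, refresh_before_join,
                timeout_s=1.0, max_tries=5):
    """Return simulated seconds until a multicast rejoin succeeds, or None.

    timeout_s and max_tries are illustrative values, not driver constants.
    """
    elapsed = 0.0
    if refresh_before_join:
        # Assumed behaviour: pick up the new SM address before rejoining.
        port.cached_sm_lid = live_sm_lid
    for _ in range(max_tries):
        if port.cached_sm_lid == live_sm_lid:
            return elapsed  # request reached the live SM
        # Request went to the old SM: wait out the timeout, then assume
        # the cached address has been corrected before the retry.
        elapsed += timeout_s
        port.cached_sm_lid = live_sm_lid
    return None


stale = rejoin_time(Port(cached_sm_lid=1), live_sm_lid=2,
                    refresh_before_join=False)
fresh = rejoin_time(Port(cached_sm_lid=1), live_sm_lid=2,
                    refresh_before_join=True)
print(stale, fresh)
```

In this model the stale case pays one full timeout before the retry succeeds,
while the refreshed case joins on the first attempt. The absolute numbers are
placeholders and are not meant to reproduce the measurements reported below.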

Our tests for IP multicast and unicast traffic between two hosts show that without the patches
communication is lost for up to 5 seconds, and with the
patches that time decreases to less than a second.




