[ofa-general] ipoib_start_xmit Gratuitous ARP / bonding failover handling not applied on connected mode neighbours?!

Or Gerlitz ogerlitz at voltaire.com
Thu Jan 17 05:47:55 PST 2008


Roland Dreier wrote:
> Good question.  The device test came straight from Moni's patch -- how
> much have you guys tested bonding of IPoIB CM?

The test for neigh->dev != dev is there to handle a possible race where
a fail-over occurs under a high xmit rate: the deletion of the
ipoib_neigh portion of the neighbour caused by the bonding fail-over
has not happened yet, but because of the fail-over the bonding driver is
now xmitting through a device which is not the one that created the
ipoib_neigh.
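
To make the check concrete, here is a minimal user-space sketch of the
staleness test, not the driver code itself. The model_* names are
hypothetical; the only assumptions are that the 20-byte IPoIB hardware
address carries the 16-byte GID at offset 4, and that the ipoib_neigh
records the device that created it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_GID_LEN    16  /* size of an InfiniBand GID */
#define MODEL_HWADDR_LEN 20  /* IPoIB hardware address: 4 bytes + GID */

/* Hypothetical stand-in for a net_device. */
struct model_dev {
    int id;
};

/* Hypothetical stand-in for the ipoib_neigh part of a neighbour. */
struct model_neigh {
    const struct model_dev *dev;   /* device that created this entry */
    uint8_t dgid[MODEL_GID_LEN];   /* GID cached at creation time */
};

/*
 * Return true when the cached entry must not be used for this xmit:
 * either the neighbour's GID changed, or the packet is going out
 * through a device other than the one that created the entry (the
 * fail-over race described above).
 */
static bool model_neigh_is_stale(const struct model_neigh *neigh,
                                 const struct model_dev *xmit_dev,
                                 const uint8_t *ha)
{
    return memcmp(neigh->dgid, ha + 4, MODEL_GID_LEN) != 0 ||
           neigh->dev != xmit_dev;
}
```

In this model an entry created on one slave and then used for an xmit
through the other slave is reported stale even though the GID still
matches, which is the case the device test is meant to catch.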

We have never managed to reproduce a hit on this check... anyway, I will
double check how much testing was done with bonding and connected
mode.

> The GID comparison seems a little trickier to handle -- it seems on a
> neighbour GID change we need to tear down any connection we might have
> in the CM case...

Not really: when there is a hit on the GID comparison, ipoib_neigh_free()
is called, which for a connected mode neighbour will invoke
ipoib_cm_destroy_tx(), which will disconnect etc.
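
The call chain can be sketched in user space as below. This is a
simplified model under stated assumptions, not the kernel functions:
the model_* names are hypothetical, the connection is reduced to a
single flag, and the real teardown does more work than flipping it.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connected mode TX context. */
struct model_cm_tx {
    bool connected;
};

/* Hypothetical stand-in for an ipoib_neigh; cm is NULL in datagram mode. */
struct model_cm_neigh {
    struct model_cm_tx *cm;
};

/* Counts teardowns so the behaviour of the model can be observed. */
static int model_teardowns;

/* Models the disconnect step: mark the connection as gone. */
static void model_cm_destroy_tx(struct model_cm_tx *tx)
{
    tx->connected = false;
    model_teardowns++;
}

/*
 * Models the free path: a neighbour with a connection tears it down
 * before the entry itself is released, so the caller that detected the
 * GID change does not need a separate disconnect step.
 */
static void model_neigh_free(struct model_cm_neigh *neigh)
{
    if (neigh->cm)
        model_cm_destroy_tx(neigh->cm);
    free(neigh);
}
```

Freeing a model neighbour that has a connection clears its connected
flag and bumps the counter; freeing a datagram one leaves the counter
unchanged.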

Or



