[ofa-general] [PATCH 2/4] ipoib: fix loss of connectivity after bonding failover on both sides
Yossi Etigin
yosefe at Voltaire.COM
Wed Jan 7 11:42:22 PST 2009
Roland Dreier wrote:
> > also
> > initialize neigh->dgid.raw to have value to compare with.
>
> I don't see this anywhere in the patch you sent? Here's the whole thing:
> (btw only one "L" in "initialize")
>
That was removed from the patch because Moni Shoua found it had increased the
traffic renewal time in case of SM failover. I forgot to remove it from the
changelog as well.
--
Fix bonding failover in the case where both peers fail over and the gratuitous
arp is lost.
In that case, the ipoib sender side creates an ipoib_neigh and first issues a
path request with the old gid. When skb->dst->neighbour->ha changes due to an
arp refresh, the ipoib_neigh is not added to the path->list of the path of the
new mgid, because the ipoib_neigh already exists. It has no ah either, because
of the sender-side failover. Therefore, it does not get an ah when the path is
resolved.
The solution here is to compare gids even if neigh->ah is invalid.
Signed-off-by: Moni Shoua <monis at voltaire.com>
Signed-off-by: Yossi Etigin <yosefe at voltaire.com>
---
Fix bugzilla 1286.
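
For illustration only, here is a minimal user-space sketch of the check
described in the changelog above. It is not driver code: struct demo_neigh,
the two neigh_is_stale_*() helpers, and the GID_LEN / HA_GID_OFFSET / HA_LEN
constants are hypothetical stand-ins; only the "ha + 4" offset and the
comparison against sizeof(union ib_gid) (assumed here to be 16 bytes) come
from the patch itself. It shows why a neigh without an ah was never detected
as stale before, and is detected after the change.

```c
#include <stddef.h>
#include <string.h>

#define GID_LEN       16  /* assumed size of union ib_gid, in bytes */
#define HA_GID_OFFSET 4   /* the patch compares against ha + 4 */
#define HA_LEN        (HA_GID_OFFSET + GID_LEN)

/* Hypothetical stand-in for the fields of struct ipoib_neigh used here. */
struct demo_neigh {
	unsigned char dgid[GID_LEN]; /* gid the neigh was created with */
	const void *ah;              /* NULL while no address handle exists */
	const void *dev;             /* device the neigh belongs to */
};

/* Old behaviour: the gid/dev comparison is skipped when there is no ah. */
int neigh_is_stale_old(const struct demo_neigh *n,
		       const unsigned char *ha, const void *dev)
{
	if (!n->ah)
		return 0;
	return memcmp(n->dgid, ha + HA_GID_OFFSET, GID_LEN) != 0 ||
	       n->dev != dev;
}

/* New behaviour: compare gid and dev even if the ah is missing. */
int neigh_is_stale_new(const struct demo_neigh *n,
		       const unsigned char *ha, const void *dev)
{
	return memcmp(n->dgid, ha + HA_GID_OFFSET, GID_LEN) != 0 ||
	       n->dev != dev;
}
```

With a neigh that has no ah and a hardware address whose gid has changed,
the old helper reports "not stale" and the new one reports "stale", which
is the case the patch is meant to catch. In the driver, the stale branch
frees the neigh and calls ipoib_path_lookup() again.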
drivers/infiniband/ulp/ipoib/ipoib_main.c | 38 +++++++++++++++---------------
1 file changed, 19 insertions(+), 19 deletions(-)
Index: b/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-12-15 19:53:16.000000000 +0200
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c 2008-12-15 19:53:37.000000000 +0200
@@ -687,26 +687,26 @@ static int ipoib_start_xmit(struct sk_bu
 		neigh = *to_ipoib_neigh(skb->dst->neighbour);
-		if (neigh->ah)
-			if (unlikely((memcmp(&neigh->dgid.raw,
-					    skb->dst->neighbour->ha + 4,
-					    sizeof(union ib_gid))) ||
-					 (neigh->dev != dev))) {
-				spin_lock_irqsave(&priv->lock, flags);
-				/*
-				 * It's safe to call ipoib_put_ah() inside
-				 * priv->lock here, because we know that
-				 * path->ah will always hold one more reference,
-				 * so ipoib_put_ah() will never do more than
-				 * decrement the ref count.
-				 */
+		if (unlikely((memcmp(&neigh->dgid.raw,
+				    skb->dst->neighbour->ha + 4,
+				    sizeof(union ib_gid))) ||
+				 (neigh->dev != dev))) {
+			spin_lock_irqsave(&priv->lock, flags);
+			/*
+			 * It's safe to call ipoib_put_ah() inside
+			 * priv->lock here, because we know that
+			 * path->ah will always hold one more reference,
+			 * so ipoib_put_ah() will never do more than
+			 * decrement the ref count.
+			 */
+			if (neigh->ah)
 				ipoib_put_ah(neigh->ah);
-				list_del(&neigh->list);
-				ipoib_neigh_free(dev, neigh);
-				spin_unlock_irqrestore(&priv->lock, flags);
-				ipoib_path_lookup(skb, dev);
-				return NETDEV_TX_OK;
-			}
+			list_del(&neigh->list);
+			ipoib_neigh_free(dev, neigh);
+			spin_unlock_irqrestore(&priv->lock, flags);
+			ipoib_path_lookup(skb, dev);
+			return NETDEV_TX_OK;
+		}
 		if (ipoib_cm_get(neigh)) {
 			if (ipoib_cm_up(neigh)) {
--
--Yossi