[ofa-general] [PATCH/RFC] IPoIB: Don't drop multicast packets sent before group is joined

Roland Dreier rdreier at cisco.com
Tue Jun 2 15:18:46 PDT 2009


[I applied the patch below from Christoph, but I see now he sent it only
to me and didn't cc the mailing list... so for completeness I'm
including it below in case anyone else has concerns]

From: Christoph Lameter <cl at linux-foundation.org>

The IPoIB layer drops multicast packets if more than 3 are queued
while a multicast send is pending.  The send queue can easily contain
more than 3 packets, and when the queue is processed only the first 3
packets make it through; the rest are dropped.

Because of this, the IPoIB layer is unable to send a high-speed stream
of multicast traffic.  Since the packets are dropped, the sender is
never throttled, and so continuous sending of data leads to continual
packet loss.

To fix this, simply remove the code that drops packets, so the socket
queue will build up and the sender will be throttled if it is
continuously sending packets faster than the IPoIB layer can process
them.  This results in the IPoIB layer sending multicast traffic at
the highest rate allowed by the underlying hardware and makes IPoIB
multicast behave like multicast on Ethernet NICs.  (It also means
there is no risk of unbounded growth of pending packets, since they
are accounted for as part of socket memory.)

We were able to send millions of UDP multicast messages at high rates
after this patch.  The tool for generating multicast traffic can be
found at http://www.gentwo.org/ll.

Signed-off-by: Christoph Lameter <cl at linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd at cisco.com>
---
 drivers/infiniband/ulp/ipoib/ipoib.h           |    1 -
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |    7 +------
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 753a983..ddb5cd7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -79,7 +79,6 @@ enum {
 	IPOIB_NUM_WC		  = 4,
 
 	IPOIB_MAX_PATH_REC_QUEUE  = 3,
-	IPOIB_MAX_MCAST_QUEUE	  = 3,
 
 	IPOIB_FLAG_OPER_UP	  = 0,
 	IPOIB_FLAG_INITIALIZED	  = 1,
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 425e311..e6ea76a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -685,12 +685,7 @@ void ipoib_mcast_send(struct net_device *dev, void *mgid, struct sk_buff *skb)
 	}
 
 	if (!mcast->ah) {
-		if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
-			skb_queue_tail(&mcast->pkt_queue, skb);
-		else {
-			++dev->stats.tx_dropped;
-			dev_kfree_skb_any(skb);
-		}
+		skb_queue_tail(&mcast->pkt_queue, skb);
 
 		if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags))
 			ipoib_dbg_mcast(priv, "no address vector, "
-- 
1.6.0.4



