[ofa-general] [PATCH] IB/ipoib: copy small SKBs in CM mode

Eli Cohen eli at mellanox.co.il
Thu May 22 08:40:15 PDT 2008


From a8ea680caf189ad984aedaa81463ed66e45c4e65 Mon Sep 17 00:00:00 2001
From: Eli Cohen <eli at mellanox.co.il>
Date: Thu, 22 May 2008 16:28:59 +0300
Subject: [PATCH] IB/ipoib: copy small SKBs in CM mode

In connected mode, ipoib's receive flow has significant overhead for
managing SKBs. For each received SKB it usually allocates a new SKB with
as much data as the current one used and moves the unused fragments from
the old SKB to the new one. Looping over all the remaining fragments
costs CPU time.
For small SKBs, this patch instead allocates an SKB just large enough to
hold the received data and copies the data into it. The newly allocated
SKB is passed up the stack, and the old SKB is reposted to the receive ring.

Signed-off-by: Eli Cohen <eli at mellanox.co.il>
---

Running netperf shows a significant improvement with this patch
(bandwidth, Mbps):

with patch:
sender		receiver
313		313

without the patch:
509		134

 drivers/infiniband/ulp/ipoib/ipoib.h    |    1 +
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |   15 +++++++++++++++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index ca126fc..e39bf36 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -97,6 +97,7 @@ enum {
 	IPOIB_MCAST_FLAG_ATTACHED = 3,
 
 	MAX_SEND_CQE		  = 16,
+	SKB_TSHOLD		  = 256,
 };
 
 #define	IPOIB_OP_RECV   (1ul << 31)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index e6f57dd..791bef7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -525,6 +525,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 	u64 mapping[IPOIB_CM_RX_SG];
 	int frags;
 	int has_srq;
+	struct sk_buff *small_skb;
 
 	ipoib_dbg_data(priv, "cm recv completion: id %d, status: %d\n",
 		       wr_id, wc->status);
@@ -579,6 +580,19 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 		}
 	}
 
+	if (wc->byte_len < SKB_TSHOLD) {
+		int dlen = wc->byte_len;
+
+		small_skb = dev_alloc_skb(dlen + 12);
+		if (small_skb) {
+			skb_reserve(small_skb, 12);
+			skb_copy_from_linear_data(skb, small_skb->data, dlen);
+			skb_put(small_skb, dlen);
+			skb = small_skb;
+			goto copied;
+		}
+	}
+
 	frags = PAGE_ALIGN(wc->byte_len - min(wc->byte_len,
 					      (unsigned)IPOIB_CM_HEAD_SIZE)) / PAGE_SIZE;
 
@@ -601,6 +615,7 @@ void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
 
 	skb_put_frags(skb, IPOIB_CM_HEAD_SIZE, wc->byte_len, newskb);
 
+copied:
 	skb->protocol = ((struct ipoib_header *) skb->data)->proto;
 	skb_reset_mac_header(skb);
 	skb_pull(skb, IPOIB_ENCAP_LEN);
-- 
1.5.5.1
