[ofa-general] Re: [PATCH 0/10 REV5] Implement skb batching and support in IPoIB/E1000

Or Gerlitz ogerlitz at voltaire.com
Sun Sep 16 00:57:05 PDT 2007


> Krishna Kumar <krkumar2 at in.ibm.com> 
> date	   Aug 22, 2007 11:28 AM	 
> subject [PATCH 0/10 Rev4] Implement skb batching and support in IPoIB	 
> Issues:
> --------
> I am getting a huge amount of retransmissions for both TCP and TCP No Delay
> cases for IPoIB (which explains the slight degradation for some test cases
> mentioned in previous mail). After a full test run, there were 18500
> retransmissions for every 1 in regular code. But there is 20.7% overall
> improvement in BW even with this huge amount of retransmissions (which implies
> batching could improve results even more if this problem is fixed). Results of
> experiments are:
>        a. With batching set to maximum 2 skbs, I get almost the same number
>           of retransmissions (implies receiver probably is not dropping skbs).
>           ifconfig/netstat on receiver gives no clue (drop/errors, etc).
>        b. Making the IPoIB xmit create single work requests for each skb on
>           blist reduces retrans to same as in regular code.
>        c. Similar retransmission increase is not seen for E1000.


Krishna Kumar wrote:
> Issues:
> --------
> The retransmission problem reported earlier seems to happen when mthca is
> used as the underlying device, but when I tested ehca the retransmissions
> dropped to normal levels (around 2 times the regular code). The performance
> improvement is around 55% for TCP.

Hi,

So with IPoIB/mthca you still see the 1 : 18.5K retransmission rate 
(with no noticeable retransmission increase for E1000) that you were 
reporting in the V4 post?! If that is the case, I think it calls for 
further examination, where help from Mellanox could ease things, I guess.

When you say that with ehca you see "normal level retransmissions - 2 
times the regular code", do you mean a 1 : 2 retransmission ratio 
between no batching and batching?
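Just to put the two device results side by side (a tiny sketch; the 18500 and 2 inflation factors are the ones quoted in the two reports, nothing else is measured):

```python
# Retransmission inflation (batching vs. regular code) quoted so far:
# - mthca (V4 post): ~18500 retransmissions for every 1 in regular code
# - ehca (Rev5 post): "around 2 times the regular code"
mthca_factor = 18500
ehca_factor = 2

# Factor by which the inflation dropped when switching the underlying
# device from mthca to ehca:
improvement = mthca_factor / ehca_factor
print(improvement)  # 9250.0 -- nearly four orders of magnitude apart
```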

I am not sure whether this was mentioned earlier in the threads, but 
clearly it takes two sides to dance here: I think you want to run your 
tests (both without batching and with batching) with something like 
NAPI enabled at the receiver side; 2.6.23-rc5 has NAPI.


> ----------------------------------------------------
> 			TCP
> 			----
Is this with TCP no-delay set or not? Connected or datagram mode? MTU? 
netperf command line? System spec (specifically HCA device ID and FW 
version), etc.?

> Size:32 Procs:1		2728	3544	29.91
> Size:128 Procs:1	11803	13679	15.89
> Size:512 Procs:1	43279	49665	14.75
> Size:4096 Procs:1	147952	101246	-31.56
> Size:16384 Procs:1	149852	141897	-5.30
> 
> Size:32 Procs:4		10562	11349	7.45
> Size:128 Procs:4	41010	40832	-.43
> Size:512 Procs:4	75374	130943	73.72
> Size:4096 Procs:4	167996	368218	119.18
> Size:16384 Procs:4	123176	379524	208.11
> 
> Size:32 Procs:8		21125	21990	4.09
> Size:128 Procs:8	77419	78605	1.53
> Size:512 Procs:8	234678	265047	12.94
> Size:4096 Procs:8	218063	367604	68.57
> Size:16384 Procs:8	184283	370972	101.30
> 
> Average:	1509300 -> 2345115 = 55.38%
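For what it's worth, the table's per-row percentages and the quoted 55.38% average do check out; a minimal sketch to reproduce them (only the bandwidth values copied from the table above are used):

```python
# Sanity-check of the quoted TCP table: per-row percentage change and
# the overall average improvement. Rows are (Size/Procs label,
# regular BW, batching BW), copied verbatim from the table above.
rows = [
    ("32/1", 2728, 3544), ("128/1", 11803, 13679), ("512/1", 43279, 49665),
    ("4096/1", 147952, 101246), ("16384/1", 149852, 141897),
    ("32/4", 10562, 11349), ("128/4", 41010, 40832), ("512/4", 75374, 130943),
    ("4096/4", 167996, 368218), ("16384/4", 123176, 379524),
    ("32/8", 21125, 21990), ("128/8", 77419, 78605), ("512/8", 234678, 265047),
    ("4096/8", 218063, 367604), ("16384/8", 184283, 370972),
]

for label, org, new in rows:
    pct = (new - org) / org * 100
    print(f"Size:{label}\t{org}\t{new}\t{pct:+.2f}%")

total_org = sum(org for _, org, _ in rows)
total_new = sum(new for _, _, new in rows)
print(f"Average: {total_org} -> {total_new} = "
      f"{(total_new - total_org) / total_org * 100:.2f}%")
# -> Average: 1509300 -> 2345115 = 55.38%
```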



