[ofa-general] Re: [PATCH 0/10 REV5] Implement skb batching and support in IPoIB/E1000

Krishna Kumar2 krkumar2 at in.ibm.com
Sun Sep 16 21:35:22 PDT 2007


Hi Or,

> So with ipoib/mthca you still see this 1 : 18.5K retransmission rate
> (with no noticeable retransmission increase for E1000) you were
> reporting at the V4 post?! if this is the case, I think it calls for
> further examination, where help from Mellanox could ease things, I guess.

Today or tomorrow I will run rev5 (which I have not yet run on mthca) on
both ehca and mthca, collect statistics, and send them out. Otherwise,
what you stated is correct as far as rev4 goes. Once I have posted the
latest details, I would appreciate any help from the Mellanox developers.

> By saying that with ehca you see "normal level retransmissions - 2 times
> the regular code" do you mean 1 : 2 retransmission rate between batching
> to no batching?

Correct, for every 1 retransmission in the regular code, I see two
retransmissions in the batching case (which I assume is due to overflow at
the receiver side, as I sometimes batch up to 4K skbs). I will post the
exact numbers in the next post.
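
To make the comparison explicit, here is a minimal sketch of how the ratio
is computed. The function name and the sample counts are hypothetical and
only illustrate the arithmetic; they are not measured values.

```python
from fractions import Fraction

def retrans_ratio(regular_retrans: int, batching_retrans: int) -> Fraction:
    """Return batching retransmissions per one regular-code retransmission."""
    if regular_retrans <= 0:
        raise ValueError("regular_retrans must be positive")
    return Fraction(batching_retrans, regular_retrans)

# Hypothetical counts, for illustration only (not measured data).
ratio = retrans_ratio(regular_retrans=100, batching_retrans=200)
print(f"1 : {ratio}")
```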

> I am not sure this was mentioned over the threads, but clearly two sides
> are needed for the dance here, namely I think you want to do your tests
> (both the no batching and with batching) with something like NAPI
> enabled at the receiver side, 2.6.23-rc5 has NAPI

I was using 2.6.23-rc1 on the receiver (which also has NAPI, but uses the
old API - the same function, ipoib_poll()).

> is this with no delay set or not? connected or datagram mode? mtu?
> netperf command? system spec (specifically hca device id and fw
> version), etc?

This is TCP (without No Delay set), in datagram mode. I didn't change the
MTU from the default (is it 2K?).
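
The MTU and mode can be read back from sysfs rather than guessed. A minimal
sketch, assuming the interface is named ib0 (the name is not given in this
thread) and that the IPoIB driver exposes a "mode" attribute next to the
standard "mtu" one, which may not be the case on every kernel. The function
name is hypothetical.

```python
from pathlib import Path

def read_ipoib_settings(ifname="ib0", sysfs_net="/sys/class/net"):
    """Return the interface's mtu and mode attributes, or None where absent."""
    base = Path(sysfs_net) / ifname
    settings = {}
    for attr in ("mtu", "mode"):
        path = base / attr
        # An attribute may be missing (no such interface, or no mode file).
        settings[attr] = path.read_text().strip() if path.is_file() else None
    return settings

# "ib0" is an assumed interface name; adjust it to the local system.
print(read_ipoib_settings("ib0"))
```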

The command is iperf, with various options for different test buffer
sizes and thread counts.

Regarding the device id etc., this is what dmesg has:

Sep 16 22:49:26 elm3b39 kernel: eHCA Infiniband Device Driver (Rel.: SVNEHCA_0023)
Sep 16 22:49:26 elm3b39 kernel: xics_enable_irq: irq=36868: ibm_int_on returned -3

There are *fw* files for mthca0, but I don't see any for ehca in
/sys/class, so I am not sure (since these are pci-e cards, nothing shows up
in lspci -v). What should I look for?
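
For reference, a minimal sketch of the search described above: it lists
every attribute file whose name contains "fw" under each device directory.
It assumes the devices appear under /sys/class/infiniband; whether ehca
exposes any such file there is exactly the open question, so an empty
result for a device is a possible outcome. The function name is
hypothetical.

```python
from pathlib import Path

def find_fw_attrs(sysfs_ib="/sys/class/infiniband"):
    """Map each device name to the contents of its *fw* attribute files."""
    found = {}
    root = Path(sysfs_ib)
    if not root.is_dir():
        return found
    for dev in sorted(root.iterdir()):
        if not dev.is_dir():
            continue
        attrs = {}
        for path in sorted(dev.glob("*fw*")):
            if not path.is_file():
                continue
            try:
                attrs[path.name] = path.read_text().strip()
            except OSError:
                # Some sysfs attributes are not readable; record that fact.
                attrs[path.name] = None
        found[dev.name] = attrs
    return found

for name, attrs in find_fw_attrs().items():
    print(name, attrs if attrs else "(no *fw* files)")
```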

Thanks,

- KK



