[openib-general] How about ib_send_page() ?

Wed May 18 15:08:27 PDT 2005

On Tue, May 17, 2005 at 07:48:32PM -0700, Roland Dreier wrote:
...
> NAPI isn't really a throughput optimization.  It pretty much only
> helps with small packet workloads.

In general, yes, I agree. NAPI prevents the NIC from saturating
the CPU with interrupts from small packets. It's also one defense
against DoS attacks.
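
To make the interrupt-mitigation point concrete, here is a toy user-space
model in C. It is only a sketch, not driver code: the names (rx_stats,
rx_per_packet_irq, rx_napi_style) are made up for illustration, and it
assumes each burst is fully in the RX ring before polling starts. It counts
interrupts for "one IRQ per packet" versus a NAPI-style "one IRQ, then poll
with a budget until the ring drains".

```c
#include <assert.h>
#include <stddef.h>

/* Counters for one simulated receive run (hypothetical names). */
struct rx_stats {
    unsigned long interrupts;
    unsigned long polls;
    unsigned long packets;
};

/* Model A: the NIC raises one interrupt for every received packet. */
struct rx_stats rx_per_packet_irq(const unsigned *bursts, size_t n)
{
    struct rx_stats s = { 0, 0, 0 };
    size_t i;

    for (i = 0; i < n; i++) {
        s.interrupts += bursts[i];
        s.packets += bursts[i];
    }
    return s;
}

/*
 * Model B: NAPI-style. The first packet of a burst raises one interrupt,
 * RX interrupts are then masked, and the ring is polled with at most
 * `budget` packets per poll. A poll that handles fewer than `budget`
 * packets means the ring is empty, so interrupts are unmasked again.
 */
struct rx_stats rx_napi_style(const unsigned *bursts, size_t n,
                              unsigned budget)
{
    struct rx_stats s = { 0, 0, 0 };
    size_t i;

    assert(budget > 0);

    for (i = 0; i < n; i++) {
        unsigned pending = bursts[i];

        if (pending == 0)
            continue;               /* nothing arrived, no interrupt */

        s.interrupts++;             /* one IRQ, then RX IRQs are masked */
        for (;;) {
            unsigned done = pending < budget ? pending : budget;

            s.polls++;
            pending -= done;
            s.packets += done;
            if (done < budget)
                break;              /* ring drained: unmask RX IRQs */
        }
    }
    return s;
}
```

With bursts of 1, 64 and 200 packets and a budget of 64, model A takes 265
interrupts while model B takes 3 interrupts and 7 polls for the same 265
packets. For large packets at a modest rate the two models converge, which
is why the benefit shows up mainly with small packets.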

But one side effect is that RX descriptor cachelines don't ping-pong as
much between the IO controller and the CPU. NAPI also helps throughput
by improving PCI bus utilization. This isn't an issue for GigE
unless one has a GigE link on a 33MHz/32-bit PCI bus...in that
case RX is obviously heavily favored over TX. (The workaround was
to reduce the RX descriptor ring size.) At least that's what
Jamal Hadi, Robert Olsson, and I concluded in a problem we
were trying to sort out last year.
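
A back-of-the-envelope check of why that bus is the bottleneck, as a small
C sketch (the helper names are hypothetical, and the figures are theoretical
peaks that ignore arbitration, descriptor fetches and other bus overhead):

```c
#include <assert.h>
#include <stdint.h>

/* Theoretical peak of a parallel bus: one transfer of width_bits per clock. */
uint64_t bus_peak_bytes_per_sec(uint64_t clock_hz, unsigned width_bits)
{
    assert(width_bits % 8 == 0);
    return clock_hz * (width_bits / 8);
}

/* Payload-agnostic line rate of a link, counting both directions if duplex. */
uint64_t link_bytes_per_sec(uint64_t bits_per_sec, int full_duplex)
{
    uint64_t one_way = bits_per_sec / 8;

    return full_duplex ? 2 * one_way : one_way;
}
```

bus_peak_bytes_per_sec(33333333, 32) is about 133 MB/s, while
link_bytes_per_sec(1000000000, 1) is 250 MB/s (125 MB/s each way). So even
at its theoretical peak the bus cannot carry full-duplex GigE, and RX and TX
DMA end up competing for it.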

I don't know where IB and 10GigE are WRT such problems.
But I expect people closer to the HW have bus traces and are
aware of similar issues.

> Also, it helps the RX path much
> more than the TX path, which means it doesn't help much in a symmetric
> test like netperf or NPtcp.

Is the netperf TCP_STREAM test symmetric?
I know netperf can run symmetric workloads (e.g. TCP_RR, 1 byte message)
but my understanding was that web server workloads are typically asymmetric
(heavily loading TX). And most of the scripts provided with netperf
try asymmetric parameters.

...
> When I profile a system running IPoIB throughput tests, half or more
> of the CPU time is going to skb_copy_and_csum() and other parts of the
> core kernel's network stack, so our ability to optimize IPoIB is
> somewhat limited.  I've already dealt with all of the easy
> optimization targets from profiling in IPoIB and mthca -- with MSI-X,
> we do zero MMIO reads, and Michael Tsirkin has heavily tuned our locking.

Ok.

thanks,
grant