[openib-general] How about ib_send_page() ?

Jeff Carr jcarr at linuxmachines.com
Wed May 18 14:41:12 PDT 2005


Roland Dreier wrote:

> The most interesting optimization available is implementing the IPoIB
> connected mode draft, although I don't think it's as easy as Vivek
> indicated -- for example, I'm not sure how to deal with having
> different MTUs depending on the destination.

Thank you for that reference. I'll read that now.

> NAPI isn't really a throughput optimization.  It pretty much only
> helps with small packet workloads.  

I'm not so sure. Maybe 2 KB or 4 KB counts as "small" in the IB case.

> Also, it helps the RX path much
> more than the TX path, which means it doesn't help much in a symmetric
> test like netperf or NPtcp.  If you look at the table at the beginning
> of Documentation/networking/NAPI_HOWTO.txt, you can see that for e1000
> with 1024 byte packets, the NIC generates 872K RX interrupts for 1M RX
> packets -- NAPI doesn't kick in totally until the packet size is down
> to 256 bytes.

Yes, I'd guess they ran these tests on a CPU fast enough that by the 
512-byte test it was able to keep up. By 1 KB, the number of packets 
they could send was bound by the 1 Gb link; the CPU was idle enough that 
NAPI stopped being triggered and it was able to handle 80K interrupts/sec.
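
As a rough sanity check on that guess, here is a back-of-the-envelope sketch of the packet and interrupt rates involved. It assumes the link runs at exactly 1 Gb/s, that the "1024 byte" figure is the whole on-wire size (framing overhead is ignored unless passed in), and that the 872K-interrupts-per-1M-packets ratio quoted above holds at line rate. The function names are made up for illustration.

```python
LINK_BPS = 1_000_000_000  # assumed: a 1 Gb/s link running at full line rate


def max_packets_per_sec(packet_bytes, link_bps=LINK_BPS, overhead_bytes=0):
    """Upper bound on packets/sec the link can carry at this packet size.

    overhead_bytes is per-packet framing overhead; it defaults to 0 because
    the thread does not say how the packet size was measured.
    """
    return link_bps / (8 * (packet_bytes + overhead_bytes))


def interrupts_per_sec(packet_bytes, irqs_per_packet, **kwargs):
    """Interrupt rate if the link is saturated at this packet size."""
    return max_packets_per_sec(packet_bytes, **kwargs) * irqs_per_packet


# Ratio quoted from the NAPI_HOWTO table: 872K RX interrupts per 1M packets.
ratio_1024 = 872_000 / 1_000_000

pps = max_packets_per_sec(1024)              # roughly 122K packets/sec
irqs = interrupts_per_sec(1024, ratio_1024)  # roughly 106K interrupts/sec
```

Under those assumptions the wire caps 1 KB traffic at somewhere around a hundred thousand packets per second, so the interrupt rate is bounded by the link rather than by how fast the CPU can drain the ring. That is the same order of magnitude as the figure I guessed at.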

I don't want to just generate noise here. I was just curious about the 
causes of IPoIB performance being CPU-bound.

> It is possible although somewhat ugly to implement NAPI for IPoIB, and
> I actually had hacked up such a patch a while ago.  However it didn't
> end up helping so I never pursued it.

Bummer.

> When I profile a system running IPoIB throughput tests, half or more
> of the CPU time is going to skb_copy_and_csum() and other parts of the
> core kernel's network stack, so our ability to optimize IPoIB is
> somewhat limited.  I've already dealt with all of the easy
> optimization targets from profiling in IPoIB and mthca -- with MSI-X,
> we do zero MMIO reads, and Michael Tsirkin has heavily tuned our locking.

Oh yeah, that would be a problem if IPoIB is generating and verifying a 
software checksum for each packet.
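
To make the cost concrete, here is a minimal sketch of a combined copy-and-checksum loop using the 16-bit one's-complement Internet checksum from RFC 1071. This is not the kernel's code, and copy_and_csum() is a hypothetical name. It only illustrates why a routine of this kind shows up in profiles: every byte of every packet passes through the CPU.

```python
def copy_and_csum(src):
    """Copy src and compute its RFC 1071 Internet checksum in one pass.

    Returns (copy_of_src, checksum). Illustrative only: a real kernel
    routine works on skb fragments and uses word-at-a-time arithmetic.
    """
    dst = bytearray(len(src))
    total = 0

    # Walk the buffer two bytes at a time, copying and summing together.
    for i in range(0, len(src) - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]

    # An odd trailing byte is treated as the high byte of a final word.
    if len(src) % 2:
        dst[-1] = src[-1]
        total += src[-1] << 8

    # Fold the carries back into the low 16 bits (one's-complement sum).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return bytes(dst), (~total) & 0xFFFF


# Example bytes taken from the worked example in RFC 1071.
data = bytes.fromhex("0001f203f4f5f6f7")
copy, csum = copy_and_csum(data)
```

The work grows linearly with payload size, so at a fixed throughput the checksum cost stays about the same no matter how the data is split into packets. That would be consistent with per-packet tricks like NAPI not helping much here.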

IMHO, from my novice view of IB so far, I don't see why IB didn't build 
on top of ethernet more. That's one thing I still don't get.

Jeff


