[openib-general] ip over ib throughput
Talpey, Thomas
Thomas.Talpey at netapp.com
Tue May 9 18:58:09 PDT 2006
At 05:47 PM 5/9/2006, Shirley Ma wrote:
>Thanks for sharing these test results.
>
>The netperf/netserver IPoIB over UD mode test spent most of its time copying data from user to kernel plus checksumming (csum_partial_copy_generic), and it can send at most mtu=2044 bytes per ib_post_send(), which definitely limits its performance compared to RDMA read/write. I would expect NFS/RDMA throughput to be much better than IPoIB over UD.
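For context on that per-send limit, here is a minimal sketch in the
style of the userspace verbs API (libibverbs, not the actual IPoIB
kernel path; the helper name and parameters are hypothetical) of what
posting a large buffer over a UD QP looks like when each work request
can carry at most the 2044-byte IPoIB payload:

#include <infiniband/verbs.h>
#include <stddef.h>
#include <stdint.h>

#define IPOIB_UD_PAYLOAD 2044  /* 2048-byte IB MTU minus 4-byte IPoIB header */

/* Hypothetical helper: post a buffer as a series of UD sends, each
 * carrying at most one MTU's worth of payload. */
static int send_fragmented(struct ibv_qp *qp, struct ibv_mr *mr,
                           struct ibv_ah *ah, uint32_t remote_qpn,
                           uint32_t remote_qkey, char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        size_t chunk = len - off;
        if (chunk > IPOIB_UD_PAYLOAD)
            chunk = IPOIB_UD_PAYLOAD;      /* one WQE per <= 2044 bytes */

        struct ibv_sge sge = {
            .addr   = (uintptr_t)(buf + off),
            .length = (uint32_t)chunk,
            .lkey   = mr->lkey,
        };
        struct ibv_send_wr wr = {
            .sg_list    = &sge,
            .num_sge    = 1,
            .opcode     = IBV_WR_SEND,
            .send_flags = IBV_SEND_SIGNALED,
            .wr = { .ud = { .ah = ah,
                            .remote_qpn = remote_qpn,
                            .remote_qkey = remote_qkey } },
        };
        struct ibv_send_wr *bad_wr;

        if (ibv_post_send(qp, &wr, &bad_wr))
            return -1;
        off += chunk;
    }
    return 0;
}

So a 32KB write needs sixteen posts (and copies/checksums) on IPoIB,
where an RDMA transfer moves the same payload in one operation.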
Actually, I got excellent results in regular cached mode too, which
incurs one data copy from the file page cache to user space. (In
NFS O_DIRECT, the RDMA is targeted at the user pages, bypassing
the cache and yielding zero-copy, zero-touch I/O even though it is
mediated in the kernel by the NFS stack.)
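To make the two modes concrete, here is a minimal client-side sketch
(the mount point, file name, sizes and alignment are assumptions, not
taken from my tests) of what the cached and O_DIRECT reads look like:

#define _GNU_SOURCE            /* for O_DIRECT on Linux */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char *buf;
    /* O_DIRECT wants an aligned buffer; 4KB alignment covers common cases. */
    if (posix_memalign((void **)&buf, 4096, 32 * 1024))
        return 1;

    /* Cached read: data lands in the client page cache first and is
     * then copied once into the user buffer. */
    int fd = open("/mnt/nfs_rdma/testfile", O_RDONLY);
    if (fd >= 0) {
        read(fd, buf, 32 * 1024);
        close(fd);
    }

    /* O_DIRECT read: the page cache is bypassed, so the transfer can
     * target the user pages directly (zero-copy, zero-touch). */
    fd = open("/mnt/nfs_rdma/testfile", O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        read(fd, buf, 32 * 1024);
        close(fd);
    }

    free(buf);
    return 0;
}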
Throughput remains as high as in the direct case (because it's still
not CPU limited), and utilization rises to a number less than you
might expect: 65%. Specifically, the cached i/o test used 79us/32KB,
and the direct i/o used 56us/32KB.
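As a back-of-envelope on those two numbers (treating them as CPU time
per 32KB operation), the extra page-cache copy works out roughly like
this:

#include <stdio.h>

int main(void)
{
    const double cached_us = 79.0;   /* cached i/o: us per 32KB op */
    const double direct_us = 56.0;   /* direct i/o: us per 32KB op */
    const double bytes     = 32.0 * 1024.0;

    double extra_us = cached_us - direct_us;            /* ~23 us per 32KB */
    double ratio    = cached_us / direct_us;            /* ~1.4x CPU cost  */
    double copy_bw  = bytes / (extra_us * 1e-6) / 1e9;  /* ~1.4 GB/s       */

    printf("extra CPU per 32KB: %.0f us (%.2fx), "
           "implied copy+touch rate ~%.1f GB/s\n",
           extra_us, ratio, copy_bw);
    return 0;
}

In other words the single copy adds about 23us per 32KB, roughly a
40% increase in per-op CPU cost over the direct case.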
Of course, the NFS/RDMA copies do not need to compute the checksum,
so they are more efficient than the socket atop IPoIB. But I am not
sure that the payload per WQE is important. We are nowhere near the
op rate of the adapter. I think the more important factor is the interrupt
rate. NFS/RDMA allows the client to take a single interrupt (the server
reply) after all RDMA has occurred. Also, the client uses unsignalled
completion on as many sends as possible. I believe I measured 0.7
interrupts per NFS op in my tests.
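For anyone unfamiliar with unsignalled completion: in the verbs API you
simply leave the signalled flag off most sends and request a completion
only every Nth one, so the send queue generates far fewer completions
(and completion interrupts). A rough sketch, not the actual NFS/RDMA
client code - the batching constant and names are made up:

#include <infiniband/verbs.h>
#include <stdint.h>

#define SIGNAL_EVERY 16        /* made-up batching factor */

static unsigned int send_count;

/* Hypothetical send path: only every SIGNAL_EVERY-th send requests a
 * completion, so the CQ fires far less than once per send. */
static int post_send_unsignalled(struct ibv_qp *qp, struct ibv_mr *mr,
                                 void *buf, uint32_t len)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .wr_id      = (uintptr_t)buf,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_SEND,
        .send_flags = (++send_count % SIGNAL_EVERY == 0)
                          ? IBV_SEND_SIGNALED : 0,
    };
    struct ibv_send_wr *bad_wr;

    return ibv_post_send(qp, &wr, &bad_wr);
}

(This assumes the QP was created with sq_sig_all = 0; the periodic
signalled send is what lets the earlier unsignalled WQEs be retired
before the send queue fills.)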
Well, I have been very pleased with the results so far! We'll have more
detail as we go.
Tom.