[ofa-general] Poor Performance of OpenIB with small packets c.f. Gigabit Ethernet

David Robb DavidRobb at comsci.co.uk
Wed Oct 29 12:52:49 PDT 2008


Some further info that may provide some clues

The transfer rate appears to be very sensitive to the socket recv/send 
buffer size settings.

Using the default buffer sizes, rather than our settings of 128K for send 
and 256K for recv, has increased the OFED 1.3 IPoIB transfer rate to ~4MB/s.
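For clarity, the buffer tuning in question amounts to something like the
following sketch (the helper name and error handling are illustrative, not
our actual code; the sizes are the ones quoted above):

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Sketch: apply our explicit socket buffer sizes to an already-created
 * TCP socket 'fd'.  Note the kernel may round these values up (Linux
 * doubles them to leave room for bookkeeping overhead).  Returns 0 on
 * success, -1 on the first failing setsockopt. */
static int set_buffer_sizes(int fd)
{
    int sndbuf = 128 * 1024;  /* 128K send buffer, as in our settings */
    int rcvbuf = 256 * 1024;  /* 256K receive buffer */

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
        return -1;
    return 0;
}
```

(Reverting to the defaults simply means never making these calls.)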

Leaving the Nagle algorithm enabled, by not setting TCP_NODELAY on the 
socket, further increases the rate to ~7.8MB/s.
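For reference, toggling Nagle comes down to a single option (again a
sketch; the helper name is illustrative):

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Sketch: on = 1 sets TCP_NODELAY (Nagle disabled, segments sent
 * immediately); on = 0 clears it (Nagle enabled, small writes may be
 * coalesced before transmission).  Returns setsockopt's result. */
static int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```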

Looking at our timing logs, it appears that with TCP_NODELAY set, the 
socket send call returns EAGAIN before any significant amount of data has 
been queued to the send buffer.

Also, we are seeing the occasional glitch where our Comms layer stalls 
for ~200ms in an epoll_wait waiting for data to receive. (Replacing the 
epoll_wait with a polling loop confirms that the socket really has no 
data available during this time.)

Could it be that we are depleting the receive work requests and hence 
triggering a 'Receiver Not Ready' at the receiving end? If so, how much 
delay would this cause?

Are any of these values configurable when using IPoIB?

When using Ethernet, we need to set TCP_NODELAY to avoid latency on the 
last part of messages. What effect does this setting have when using 
IPoIB? (It appears to prevent us from filling up the socket send buffer. 
But is it even required when using low-latency InfiniBand?)

I would be very grateful if anyone with deeper knowledge of the 
internals could offer a diagnosis here.

TIA

Dave Robb


David Robb wrote:
> We have a data logging application that exhibits poor performance when 
> operated using TCP/IP sockets and IPoIB.
>
> With small message sizes ~ 64 bytes, the performance values for our 
> application are
>
> OFED 1.2 IPoIB:  2.81 MB/s
> OFED 1.3 IPoIB:  1.37 MB/s
> GB Ethernet:     5.38 MB/s
>
> It is not until the message sizes reach 16K or so that InfiniBand 
> starts to overtake the Ethernet.
>
> Are these values as expected?
>
> What further tests could I run to investigate the problem?
>
> Are there any settings and or device configuration that we can tweak 
> to improve the small message performance?
>
> We are running RHEL Linux with Mellanox HCAs and switches.
> We recently upgraded to OFED 1.3 and have upgraded the HCA firmware to 
> the latest 1.2 version.
>
> Many thanks for any help
>
> Regards
>
> David Robb
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
