[ofw] (no subject)

Alexey Novozhilov alexey at ety.ru
Tue Mar 31 04:53:52 PDT 2015


Greetings, community.

Recently I've noticed packet loss in my IP over IB (IPoIB) network.
The Linux servers have MCX354A-QCBT cards and run Debian 7.6 with
MLNX_OFED_LINUX-2.4-1.0.4; the Windows server has an MCX353A-QCBT card
and runs Windows Server 2012 with WinOF 4.90. All cards are flashed
with firmware 2.33.5000.
The hosts are connected to an InfiniScale IS5023 switch.

Packet loss happens when the tx rate is over 1k p/s and the packet size
is 2002 bytes or less. As a test I ran several ping series between the
Linux and Windows hosts. They show the following:

IPoIB, rate = 10k p/s, over 10k packets are sent. Packet size > 2002.
No loss.
ping -q -i 0.0001 -c 100000 -s 2003 192.168.*
100000 packets transmitted, 100000 received, 0% packet loss, time 5310ms
rtt min/avg/max/mdev = 0.035/0.046/3.765/0.098 ms, ipg/ewma 0.053/0.043 ms

IPoIB, rate = 10k p/s, over 10k packets are sent. Packet size = 2002.
I see loss.
ping -q -i 0.0001 -c 100000 -s 2002 192.168.*
100000 packets transmitted, 98795 received, 1% packet loss, time 18688ms
rtt min/avg/max/mdev = 0.023/0.034/8.920/0.106 ms, ipg/ewma 0.186/0.156 ms

IPoIB, rate = 10k p/s, over 10k packets are sent. Packet size = 2002.
I see loss (745 of 100000 packets, which the summary shows as 0%).
ping -q -i 0.0001 -c 100000 -s 2002 192.168.*
100000 packets transmitted, 99255 received, 0% packet loss, time 13153ms
rtt min/avg/max/mdev = 0.024/0.035/8.636/0.103 ms, ipg/ewma 0.131/0.025 ms

IPoIB, rate = 10k p/s, over 10k packets are sent. Packet size > 2002.
No loss.
ping -q -i 0.0001 -c 100000 -s 2003 192.168.*
100000 packets transmitted, 100000 received, 0% packet loss, time 9278ms
rtt min/avg/max/mdev = 0.074/0.085/9.890/0.108 ms, ipg/ewma 0.092/0.076 ms

IPoIB, rate is 10k p/s, but fewer than 10k packets are sent. No loss.
ping -i 0.0001 192.168.* -q -c 1000
1000 packets transmitted, 1000 received, 0% packet loss, time 45ms
rtt min/avg/max/mdev = 0.031/0.040/1.065/0.086 ms, ipg/ewma 0.045/0.033 ms

IPoIB, rate is 10k p/s, 10k packets are sent. Loss happens again.
ping -i 0.0001 192.168.* -q -c 10000
10000 packets transmitted, 9842 received, 1% packet loss, time 2334ms
rtt min/avg/max/mdev = 0.020/0.038/1.072/0.088 ms, ipg/ewma 0.233/0.023 ms

IPoIB, rate is 10k p/s, over 10k packets are sent. Standard packet size.
Loss detected.
ping -q -i 0.0001 -c 100000 192.168.*
100000 packets transmitted, 98167 received, 1% packet loss, time 25631ms

IPoIB, rate is 1k p/s, 10k packets are sent. Standard packet size. No loss.
ping -i 0.001 -c 10000 -q 192.168.*
10000 packets transmitted, 10000 received, 0% packet loss, time 9998ms
rtt min/avg/max/mdev = 0.021/0.025/0.334/0.010 ms

IPoIB, rate is higher than 1k p/s, 10k packets are sent. Standard packet size.
ping -i 0.0009 -c 10000 -q 192.168.*
10000 packets transmitted, 9865 received, 1% packet loss, time 2126ms
rtt min/avg/max/mdev = 0.021/0.036/1.070/0.088 ms, ipg/ewma 0.212/0.022 ms
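
The ping summaries above report loss only as a whole percentage, so the two runs at size 2002 read as 1% and 0% even though both lost packets. As a rough cross-check, the exact figures can be recomputed from the transmitted/received counts. The sketch below is only illustrative: loss_percent is a hypothetical helper name, not part of ping, and the counts are copied from the Linux-to-Windows runs above.

```python
def loss_percent(transmitted: int, received: int) -> float:
    """Return packet loss as a percentage of transmitted packets."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return 100.0 * (transmitted - received) / transmitted


# (transmitted, received) pairs copied from the lossy runs above.
runs = [
    (100000, 98795),  # -i 0.0001 -s 2002, first run
    (100000, 99255),  # -i 0.0001 -s 2002, second run
    (10000, 9842),    # -i 0.0001, 10k packets, standard size
    (100000, 98167),  # -i 0.0001, 100k packets, standard size
    (10000, 9865),    # -i 0.0009, 10k packets, standard size
]

for sent, got in runs:
    print(f"{sent} sent, {got} received: {loss_percent(sent, got):.3f}% lost")
```

With these counts, the loss in the failing runs lies between roughly 0.7% and 1.8%.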


I also have a twin Linux server. I see no packet loss in tests
between the two Linux hosts:
IPoIB, rate is higher than 1k p/s, 10k packets are sent.
ping -i 0.0009 -c 10000 -q 192.168.*
10000 packets transmitted, 10000 received, 0% packet loss, time 311ms
rtt min/avg/max/mdev = 0.015/0.026/1.398/0.096 ms, ipg/ewma 0.031/0.017 ms
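
The rate labels in these tests are derived from the -i interval, but the summary line also allows the achieved send rate to be estimated as packets transmitted divided by elapsed time. This is a minimal sketch, assuming the "time" field covers the whole run; the function names are hypothetical and the figures are copied from runs in this message.

```python
def nominal_rate_pps(interval_s: float) -> float:
    """Send rate implied by ping's -i interval (seconds between packets)."""
    return 1.0 / interval_s


def achieved_rate_pps(transmitted: int, elapsed_ms: float) -> float:
    """Average send rate estimated from the summary line."""
    return transmitted / (elapsed_ms / 1000.0)


# (label, -i interval in seconds, packets transmitted, elapsed time in ms)
runs = [
    ("Linux-Windows, -i 0.001", 0.001, 10000, 9998),
    ("Linux-Windows, -i 0.0009", 0.0009, 10000, 2126),
    ("Linux-Linux, -i 0.0009", 0.0009, 10000, 311),
]

for label, interval, sent, ms in runs:
    print(f"{label}: nominal {nominal_rate_pps(interval):.0f} p/s, "
          f"achieved {achieved_rate_pps(sent, ms):.0f} p/s")
```

For the figures quoted here, the achieved rate in the -i 0.0009 runs comes out well above the nominal value of about 1.1k p/s, so the rate labels are best read as approximate.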


All of these servers are also connected to the same 1G Ethernet switch.
Running the same test over Ethernet, I see no packet loss at all,
regardless of OS:
Ethernet, rate > 1k p/s, 10k packets are sent, standard packet size.
ping -i 0.0009 -c 10000 -q 192.168.**
10000 packets transmitted, 10000 received, 0% packet loss, time 750ms
rtt min/avg/max/mdev = 0.039/0.065/0.632/0.013 ms, ipg/ewma 0.075/0.066 ms
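
To compare runs such as the IPoIB and Ethernet ones side by side, the summary lines can be parsed into numbers. The following is a sketch only, assuming the summary format shown in this message (other ping versions may print it differently); parse_summary and its field names are hypothetical.

```python
import re

# Matches the first summary line as it appears in the output quoted above.
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, "
    r"(\d+)% packet loss, time (\d+)ms"
)


def parse_summary(line: str) -> dict:
    """Extract counts and elapsed time from a ping summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a ping summary line: {line!r}")
    sent, received, reported_pct, elapsed_ms = map(int, match.groups())
    return {
        "transmitted": sent,
        "received": received,
        "lost": sent - received,
        "reported_loss_pct": reported_pct,
        "time_ms": elapsed_ms,
    }


# Summary lines copied from the -i 0.0009 runs in this message.
samples = {
    "IPoIB": "10000 packets transmitted, 9865 received, 1% packet loss, time 2126ms",
    "Ethernet": "10000 packets transmitted, 10000 received, 0% packet loss, time 750ms",
}

for name, line in samples.items():
    print(name, parse_summary(line))
```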

Please feel free to share ideas about this issue.