[ofw] Unstable operation of WinOFED 4.40.0 with HCA fw 2.30.3200 on Win 2012 Server

nicholas ferguso nicholasferguson at wingarch.com
Mon Dec 30 07:06:17 PST 2013


Have you researched ICMP policies?

 

From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Alexey Novozhilov
Sent: Monday, December 30, 2013 12:29 AM
To: ofw at lists.openfabrics.org
Subject: [ofw] Unstable operation of WinOFED 4.40.0 with HCA fw 2.30.3200 on
Win 2012 Server

 

Hi,

 

I've encountered an issue of unstable operation of WinOFED 4.40.0 with HCA
firmware 2.30.3200 on Windows 2012 Server.

I've spent the last week going mad trying to get it working well.

 

I have a system of two Mellanox InfiniScale IV IS5023 switches, three hosts
running Windows 2012 Server / Windows 2012 Server R2, and two hosts running
Ubuntu Linux 12.04 LTS. All hosts are equipped with Mellanox ConnectX-3 VPI
network cards (MCX354A-QCBT). Each host is plugged into both switches; the
switches are connected to each other and form a shared fabric. All hosts are
based on SuperMicro X9DRW-7TPF mainboards with Intel Xeon E5-2667 v2 CPUs
and DDR3-1866 memory.

WinOFED 4.40.0 is installed on Win Server 2012, WinOFED 4.55 on Win Server
2012R2. Both Linux hosts are routers with the
MLNX_OFED_LINUX-2.0-3.0.0-ubuntu12.04-x86_64 package installed; their IB
interfaces are joined into an active-backup bond using ifenslave. The
following modules are loaded on the Linux routers: mlx4_core, mlx4_ib,
ib_umad, ib_mad, ib_ipoib, ib_uverbs. All HCAs are burned with firmware
2.30.3200.

As a test I use L3 ICMP ping. Linux-to-Linux communication is fine: a flood
ping across the InfiniBand network gives amazing results:
rtt min/avg/max/mdev = 0.011/0.013/1.682/0.002 ms. But things look different
when Windows is involved.

A flood ping from a Linux host to W2012 gives almost equally good latency
numbers (rtt min/avg/max/mdev = 0.022/0.024/2.492/0.021 ms), but the packet
loss rate is always about 1-2%. At the same time, ibping shows no packet
loss at all and ibdiagnet on Linux shows no warnings or errors, so I
conclude that IB itself works well and the issue lies above L2.

So I decided to try Win2012R2 with WinOFED 4.55. It resolved the packet
loss but increased latency: rtt min/avg/max/mdev =
0.093/0.102/17.550/0.101 ms. Digging into this, I found that another system
of mine has no such issues, and the difference is the HCA firmware version:
there, Win2012 with OFED 4.40.0 works fine with firmware 2.11.500. But
downgrading to fw 2.11.500 on my servers didn't help; even with identical
firmware and software versions I still see packet loss.
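
To put the three ping summaries quoted above side by side, here is a rough
sketch that parses the "rtt min/avg/max/mdev" lines and prints each average
relative to the Linux-to-Linux baseline. The labels are descriptive only
(the third one assumes the ping was again sent from a Linux host), and the
helper name parse_rtt is illustrative.

```python
import re

# Matches the summary line printed by Linux ping, values in milliseconds.
RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_rtt(summary):
    """Return (min, avg, max, mdev) in milliseconds from a ping summary line."""
    match = RTT_RE.search(summary)
    if match is None:
        raise ValueError("no rtt summary found in: %r" % summary)
    return tuple(float(group) for group in match.groups())


# Summary lines copied from this message; the labels are assumptions.
results = {
    "Linux -> Linux": "rtt min/avg/max/mdev = 0.011/0.013/1.682/0.002 ms",
    "Linux -> Win2012 (WinOFED 4.40.0)":
        "rtt min/avg/max/mdev = 0.022/0.024/2.492/0.021 ms",
    "Linux -> Win2012R2 (WinOFED 4.55)":
        "rtt min/avg/max/mdev = 0.093/0.102/17.550/0.101 ms",
}

baseline_avg = parse_rtt(results["Linux -> Linux"])[1]
for label, line in results.items():
    rtt_min, rtt_avg, rtt_max, rtt_mdev = parse_rtt(line)
    print("%-36s avg %.3f ms (%.1fx baseline), max %.3f ms"
          % (label, rtt_avg, rtt_avg / baseline_avg, rtt_max))
```

The summary line does not carry the loss percentage, so the 1-2% packet loss
has to be read from the "packets transmitted" line of the same ping output.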

Still, I want it all together: low latency, no packet loss, and the latest
software and firmware versions.

 

I'm running out of ideas, so any comments and advice are appreciated.

 

--

Regards,

Alexey
