[ofa-general] RE: msi-x seems to cause large performance variation for connectx / datagram mode
Sagi Rotem
Sagir at mellanox.co.il
Wed Jun 25 07:16:17 PDT 2008
Do you get the same variance with the irqdaemon off?
How are the interrupts spread across the cores in both cases?
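One way to look at the interrupt spread is to compare the per-CPU counters in /proc/interrupts before and after a run. Below is a minimal Python sketch, assuming the HCA's interrupt lines contain the string "mlx4" in the device-name column. The function name irq_spread and the default match string are illustrative assumptions, not something taken from these machines.

```python
def irq_spread(proc_interrupts_text, match="mlx4"):
    """Return {irq_label: [count_cpu0, count_cpu1, ...]} for lines containing `match`.

    `proc_interrupts_text` is the full text of /proc/interrupts.
    `match` is an assumed substring of the device name; adjust it to
    whatever name the driver actually registers on the system under test.
    """
    lines = proc_interrupts_text.splitlines()
    if not lines:
        return {}
    # The first row is the header: CPU0 CPU1 ...
    ncpu = len(lines[0].split())
    spread = {}
    for line in lines[1:]:
        if match not in line:
            continue
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break  # reached the interrupt-type / device-name columns
            counts.append(int(field))
        spread[label.strip()] = counts
    return spread
```

To use it, read the file once before and once after the netperf run, for example irq_spread(open("/proc/interrupts").read()), and subtract the two sets of counters per CPU.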
Sagi
-----Original Message-----
From: Or Gerlitz [mailto:ogerlitz at voltaire.com]
Sent: Wednesday, June 25, 2008 4:56 PM
To: Roland Dreier; Eli Cohen; Sagi Rotem
Cc: general at lists.openfabrics.org
Subject: msi-x seems to cause large performance variation for connectx /
datagram mode
While running a synthetic network benchmark (netperf) with a 64K message size
on two nodes, each with two HCAs: Arbel & ConnectX (where the idea was to
have an apples-to-apples comparison), I saw a nice advantage with
connectx, but there is a fairly large variation in the performance it
gives.
The system spec is: RH5 x86_64 2.6.18-8.el5 SMP, 4GB RAM, Intel 1.6GHz,
four CPUs (or two CPUs with two cores each, I don't know)
- I used datagram mode, where connectx has checksum and LSO offloads
- the code used is not the mainline kernel but rather
git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel
commit 39e1dc833f98e5134f91fcf7f33df402adf4bc0c
Digging a little more, I see now that if I disable msi-x, the
performance settles at a fixed value of about 620MB/s with connectx, which
is the low end of the range that msi-x gives me...
HCA                 FW      MB/s      msi_x   driver
======================================================
Arbel-Arbel         4.8.2   450       0       ib_mthca
Arbel-Arbel         4.8.2   450       1       ib_mthca
ConnectX-ConnectX   2.3     620-850   1       mlx4_core
ConnectX-ConnectX   2.3     620       0       mlx4_core
Any idea if there's something in the system settings that can explain
this?
I have pasted below some lspci info and netperf results, it might help.
Note that it's FW 2.3 and not 2.5, on which I see other issues...
Or.
# modprobe mlx4_core msi_x=0
# lspci -v | grep -A 15 634a
03:00.0 InfiniBand: Mellanox Technologies Unknown device 634a (rev a0)
Subsystem: Mellanox Technologies Unknown device 634a
Flags: bus master, fast devsel, latency 0, IRQ 169
Memory at c8300000 (64-bit, non-prefetchable) [size=1M]
Memory at c9800000 (64-bit, prefetchable) [size=8M]
Memory at c8200000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable- Mask- TabSize=256
Capabilities: [60] Express Endpoint IRQ 0
# modprobe mlx4_core msi_x=1
# lspci -v | grep -A 10 634a
03:00.0 InfiniBand: Mellanox Technologies Unknown device 634a (rev a0)
Subsystem: Mellanox Technologies Unknown device 634a
Flags: bus master, fast devsel, latency 0, IRQ 169
Memory at c8300000 (64-bit, non-prefetchable) [size=1M]
Memory at c9800000 (64-bit, prefetchable) [size=8M]
Memory at c8200000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [40] Power Management version 3
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Mask- TabSize=256
Capabilities: [60] Express Endpoint IRQ 0
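The two lspci dumps above differ only in the MSI-X capability line (Enable- versus Enable+). As a minimal sketch, that check could be scripted as follows. The function name msix_enabled is hypothetical, and the code only assumes the "MSI-X: Enable+/-" text format shown in the output above.

```python
import re


def msix_enabled(lspci_text):
    """Return True for 'MSI-X: Enable+', False for 'MSI-X: Enable-',
    or None when the text has no MSI-X capability line.

    Only the first MSI-X line is examined, so pass the output for a
    single device (for example the grep'ed block shown above).
    """
    m = re.search(r"MSI-X: Enable([+-])", lspci_text)
    if m is None:
        return None
    return m.group(1) == "+"
```

Feeding it the output of the lspci command after each modprobe confirms whether the msi_x module parameter actually took effect before a benchmark run is started.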
- Connectx / Connectx msi_x = 1
# netperf -H 192.168.2.152 -D 1, -l 600 -fM -t TCP_STREAM -- -m 64000
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.152 (192.168.2.152) port 0 AF_INET : demo
Interim result: 648.10 MBytes/s over 1.02 seconds
Interim result: 630.23 MBytes/s over 1.03 seconds
Interim result: 629.91 MBytes/s over 1.00 seconds
Interim result: 630.24 MBytes/s over 1.00 seconds
Interim result: 630.08 MBytes/s over 1.00 seconds
Interim result: 630.40 MBytes/s over 1.00 seconds
Interim result: 630.19 MBytes/s over 1.00 seconds
Interim result: 629.88 MBytes/s over 1.00 seconds
Interim result: 630.18 MBytes/s over 1.00 seconds
Interim result: 630.17 MBytes/s over 1.00 seconds
Interim result: 646.35 MBytes/s over 1.00 seconds
Interim result: 659.88 MBytes/s over 1.00 seconds
Interim result: 662.61 MBytes/s over 1.00 seconds
Interim result: 663.06 MBytes/s over 1.00 seconds
Interim result: 662.32 MBytes/s over 1.00 seconds
Interim result: 661.93 MBytes/s over 1.00 seconds
Interim result: 662.78 MBytes/s over 1.00 seconds
Interim result: 662.59 MBytes/s over 1.00 seconds
Interim result: 664.89 MBytes/s over 1.00 seconds
Interim result: 865.67 MBytes/s over 1.00 seconds
Interim result: 726.10 MBytes/s over 1.19 seconds
Interim result: 623.57 MBytes/s over 1.16 seconds
Interim result: 623.10 MBytes/s over 1.00 seconds
Interim result: 622.99 MBytes/s over 1.00 seconds
Interim result: 623.47 MBytes/s over 1.00 seconds
Interim result: 622.95 MBytes/s over 1.00 seconds
Interim result: 623.52 MBytes/s over 1.00 seconds
Interim result: 623.26 MBytes/s over 1.00 seconds
Interim result: 623.06 MBytes/s over 1.00 seconds
Interim result: 623.40 MBytes/s over 1.00 seconds
Interim result: 825.80 MBytes/s over 1.00 seconds
Interim result: 862.10 MBytes/s over 1.00 seconds
Interim result: 862.41 MBytes/s over 1.00 seconds
Interim result: 862.98 MBytes/s over 1.00 seconds
Interim result: 862.83 MBytes/s over 1.00 seconds
Interim result: 862.66 MBytes/s over 1.00 seconds
Interim result: 862.08 MBytes/s over 1.00 seconds
Interim result: 861.96 MBytes/s over 1.00 seconds
Interim result: 861.52 MBytes/s over 1.00 seconds
Interim result: 861.32 MBytes/s over 1.00 seconds
Interim result: 650.12 MBytes/s over 1.32 seconds
Interim result: 623.67 MBytes/s over 1.04 seconds
Interim result: 623.60 MBytes/s over 1.00 seconds
Interim result: 623.16 MBytes/s over 1.00 seconds
Interim result: 623.38 MBytes/s over 1.00 seconds
Interim result: 623.34 MBytes/s over 1.00 seconds
Interim result: 622.93 MBytes/s over 1.00 seconds
Interim result: 622.67 MBytes/s over 1.00 seconds
Interim result: 623.09 MBytes/s over 1.00 seconds
Interim result: 674.60 MBytes/s over 1.00 seconds
Interim result: 862.36 MBytes/s over 1.00 seconds
Interim result: 862.32 MBytes/s over 1.00 seconds
Interim result: 862.60 MBytes/s over 1.00 seconds
Interim result: 862.77 MBytes/s over 1.00 seconds
Interim result: 862.42 MBytes/s over 1.00 seconds
Interim result: 862.97 MBytes/s over 1.00 seconds
Interim result: 862.56 MBytes/s over 1.00 seconds
Interim result: 863.27 MBytes/s over 1.00 seconds
Interim result: 862.66 MBytes/s over 1.00 seconds
Interim result: 794.91 MBytes/s over 1.09 seconds
Interim result: 622.52 MBytes/s over 1.28 seconds
Interim result: 622.43 MBytes/s over 1.00 seconds
Interim result: 622.68 MBytes/s over 1.00 seconds
Interim result: 622.51 MBytes/s over 1.00 seconds
Interim result: 622.34 MBytes/s over 1.00 seconds
Interim result: 622.11 MBytes/s over 1.00 seconds
Interim result: 622.11 MBytes/s over 1.00 seconds
Interim result: 621.71 MBytes/s over 1.00 seconds
Interim result: 621.85 MBytes/s over 1.00 seconds
Interim result: 761.27 MBytes/s over 1.00 seconds
Interim result: 861.58 MBytes/s over 1.00 seconds
Interim result: 861.76 MBytes/s over 1.00 seconds
Interim result: 861.10 MBytes/s over 1.00 seconds
Interim result: 862.02 MBytes/s over 1.00 seconds
Interim result: 861.63 MBytes/s over 1.00 seconds
- Arbel/Arbel msi_x = 0
# netperf -H 192.168.1.152 -D 1, -l 600 -fM -t TCP_STREAM -- -m 64000
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.152 (192.168.1.152) port 0 AF_INET : demo
Interim result: 464.73 MBytes/s over 1.00 seconds
Interim result: 462.77 MBytes/s over 1.00 seconds
Interim result: 463.02 MBytes/s over 1.00 seconds
Interim result: 463.07 MBytes/s over 1.00 seconds
Interim result: 462.93 MBytes/s over 1.00 seconds
Interim result: 462.85 MBytes/s over 1.00 seconds
Interim result: 462.83 MBytes/s over 1.00 seconds
Interim result: 463.21 MBytes/s over 1.00 seconds
Interim result: 464.18 MBytes/s over 1.00 seconds
Interim result: 464.03 MBytes/s over 1.00 seconds
Interim result: 464.59 MBytes/s over 1.00 seconds
Interim result: 464.19 MBytes/s over 1.00 seconds
Interim result: 464.56 MBytes/s over 1.00 seconds
Interim result: 464.50 MBytes/s over 1.00 seconds
Interim result: 464.45 MBytes/s over 1.00 seconds
Interim result: 464.44 MBytes/s over 1.00 seconds
Interim result: 464.38 MBytes/s over 1.00 seconds
Interim result: 464.22 MBytes/s over 1.00 seconds
Interim result: 464.30 MBytes/s over 1.00 seconds
Interim result: 463.98 MBytes/s over 1.00 seconds
Interim result: 464.34 MBytes/s over 1.00 seconds
Interim result: 463.79 MBytes/s over 1.00 seconds
Interim result: 464.23 MBytes/s over 1.00 seconds
Interim result: 464.40 MBytes/s over 1.00 seconds
Interim result: 464.18 MBytes/s over 1.00 seconds
Interim result: 464.78 MBytes/s over 1.00 seconds
Interim result: 464.59 MBytes/s over 1.00 seconds
Interim result: 464.71 MBytes/s over 1.00 seconds
Interim result: 464.40 MBytes/s over 1.00 seconds
Interim result: 463.94 MBytes/s over 1.00 seconds
Interim result: 464.40 MBytes/s over 1.00 seconds
Interim result: 464.39 MBytes/s over 1.00 seconds
Interim result: 464.65 MBytes/s over 1.00 seconds
Interim result: 464.49 MBytes/s over 1.00 seconds
Interim result: 464.14 MBytes/s over 1.00 seconds
Interim result: 464.45 MBytes/s over 1.00 seconds
Interim result: 464.66 MBytes/s over 1.00 seconds
Interim result: 464.64 MBytes/s over 1.00 seconds
Interim result: 464.23 MBytes/s over 1.00 seconds
Interim result: 464.20 MBytes/s over 1.00 seconds
Interim result: 464.16 MBytes/s over 1.00 seconds
Interim result: 463.59 MBytes/s over 1.00 seconds
Interim result: 464.33 MBytes/s over 1.00 seconds
Interim result: 464.07 MBytes/s over 1.00 seconds
Interim result: 463.66 MBytes/s over 1.00 seconds
- Connectx / Connectx msi_x = 0
# netperf -H 192.168.2.152 -D 1, -l 600 -fM -t TCP_STREAM -- -m 64000
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.152 (192.168.2.152) port 0 AF_INET : demo
Interim result: 627.10 MBytes/s over 1.00 seconds
Interim result: 626.37 MBytes/s over 1.00 seconds
Interim result: 626.76 MBytes/s over 1.00 seconds
Interim result: 626.71 MBytes/s over 1.00 seconds
Interim result: 626.54 MBytes/s over 1.00 seconds
Interim result: 626.86 MBytes/s over 1.00 seconds
Interim result: 626.68 MBytes/s over 1.00 seconds
Interim result: 626.63 MBytes/s over 1.00 seconds
Interim result: 627.74 MBytes/s over 1.00 seconds
Interim result: 628.28 MBytes/s over 1.00 seconds
Interim result: 627.92 MBytes/s over 1.00 seconds
Interim result: 627.79 MBytes/s over 1.00 seconds
Interim result: 628.00 MBytes/s over 1.00 seconds
Interim result: 627.68 MBytes/s over 1.00 seconds
Interim result: 627.32 MBytes/s over 1.00 seconds
Interim result: 627.51 MBytes/s over 1.00 seconds
Interim result: 627.72 MBytes/s over 1.00 seconds
Interim result: 627.67 MBytes/s over 1.00 seconds
Interim result: 627.39 MBytes/s over 1.00 seconds
Interim result: 627.24 MBytes/s over 1.00 seconds
Interim result: 627.06 MBytes/s over 1.00 seconds
Interim result: 627.00 MBytes/s over 1.00 seconds
Interim result: 627.24 MBytes/s over 1.00 seconds
Interim result: 627.87 MBytes/s over 1.00 seconds
Interim result: 622.81 MBytes/s over 1.01 seconds
Interim result: 623.41 MBytes/s over 1.00 seconds
Interim result: 623.24 MBytes/s over 1.00 seconds
Interim result: 623.51 MBytes/s over 1.00 seconds
Interim result: 623.01 MBytes/s over 1.00 seconds
Interim result: 623.16 MBytes/s over 1.00 seconds
Interim result: 622.87 MBytes/s over 1.00 seconds
Interim result: 622.58 MBytes/s over 1.00 seconds
Interim result: 623.25 MBytes/s over 1.00 seconds
Interim result: 622.91 MBytes/s over 1.00 seconds
Interim result: 622.96 MBytes/s over 1.00 seconds
Interim result: 622.93 MBytes/s over 1.00 seconds
Interim result: 622.85 MBytes/s over 1.00 seconds
Interim result: 623.21 MBytes/s over 1.00 seconds
Interim result: 622.83 MBytes/s over 1.00 seconds
Interim result: 622.61 MBytes/s over 1.00 seconds
Interim result: 622.99 MBytes/s over 1.00 seconds
Interim result: 622.83 MBytes/s over 1.00 seconds
Interim result: 622.68 MBytes/s over 1.00 seconds
Interim result: 622.67 MBytes/s over 1.00 seconds
Interim result: 622.59 MBytes/s over 1.00 seconds
Interim result: 623.04 MBytes/s over 1.00 seconds
Interim result: 623.28 MBytes/s over 1.00 seconds
Interim result: 622.82 MBytes/s over 1.00 seconds
Interim result: 622.69 MBytes/s over 1.00 seconds
Interim result: 622.62 MBytes/s over 1.00 seconds
Interim result: 622.47 MBytes/s over 1.00 seconds
Interim result: 622.47 MBytes/s over 1.00 seconds
Interim result: 623.14 MBytes/s over 1.00 seconds
Interim result: 622.68 MBytes/s over 1.00 seconds
Interim result: 622.61 MBytes/s over 1.00 seconds
Interim result: 622.59 MBytes/s over 1.00 seconds
- Arbel/Arbel msi_x = 1
# netperf -H 192.168.1.152 -D 1, -l 600 -fM -t TCP_STREAM -- -m 64000
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.152 (192.168.1.152) port 0 AF_INET : demo
Interim result: 457.39 MBytes/s over 1.00 seconds
Interim result: 456.53 MBytes/s over 1.00 seconds
Interim result: 456.67 MBytes/s over 1.00 seconds
Interim result: 456.79 MBytes/s over 1.00 seconds
Interim result: 456.52 MBytes/s over 1.00 seconds
Interim result: 456.26 MBytes/s over 1.00 seconds
Interim result: 456.78 MBytes/s over 1.00 seconds
Interim result: 456.49 MBytes/s over 1.00 seconds
Interim result: 456.33 MBytes/s over 1.00 seconds
Interim result: 456.32 MBytes/s over 1.00 seconds
Interim result: 456.75 MBytes/s over 1.00 seconds
Interim result: 456.75 MBytes/s over 1.00 seconds
Interim result: 457.30 MBytes/s over 1.00 seconds
Interim result: 456.40 MBytes/s over 1.00 seconds
Interim result: 456.60 MBytes/s over 1.00 seconds
Interim result: 456.39 MBytes/s over 1.00 seconds
Interim result: 456.53 MBytes/s over 1.00 seconds
Interim result: 456.32 MBytes/s over 1.00 seconds
Interim result: 456.13 MBytes/s over 1.00 seconds
Interim result: 456.58 MBytes/s over 1.00 seconds
Interim result: 456.71 MBytes/s over 1.00 seconds
Interim result: 456.39 MBytes/s over 1.00 seconds
Interim result: 457.01 MBytes/s over 1.00 seconds
Interim result: 456.41 MBytes/s over 1.00 seconds
Interim result: 445.10 MBytes/s over 1.03 seconds
Interim result: 444.25 MBytes/s over 1.00 seconds
Interim result: 445.65 MBytes/s over 1.00 seconds
Interim result: 446.40 MBytes/s over 1.00 seconds
Interim result: 445.32 MBytes/s over 1.00 seconds
Interim result: 445.59 MBytes/s over 1.00 seconds
Interim result: 445.71 MBytes/s over 1.00 seconds
Interim result: 445.84 MBytes/s over 1.00 seconds
Interim result: 445.74 MBytes/s over 1.00 seconds
Interim result: 445.70 MBytes/s over 1.00 seconds
Interim result: 445.22 MBytes/s over 1.00 seconds
Interim result: 445.22 MBytes/s over 1.00 seconds
Interim result: 445.44 MBytes/s over 1.00 seconds
Interim result: 445.72 MBytes/s over 1.00 seconds
Interim result: 445.80 MBytes/s over 1.00 seconds
Interim result: 445.90 MBytes/s over 1.00 seconds
Interim result: 445.87 MBytes/s over 1.00 seconds
Interim result: 445.89 MBytes/s over 1.00 seconds
Interim result: 445.33 MBytes/s over 1.00 seconds
Interim result: 445.36 MBytes/s over 1.00 seconds
Interim result: 445.83 MBytes/s over 1.00 seconds
Interim result: 445.66 MBytes/s over 1.00 seconds
Interim result: 445.62 MBytes/s over 1.00 seconds
Interim result: 445.90 MBytes/s over 1.00 seconds
Interim result: 445.84 MBytes/s over 1.00 seconds
Interim result: 445.51 MBytes/s over 1.00 seconds
Interim result: 446.18 MBytes/s over 1.00 seconds
Interim result: 446.01 MBytes/s over 1.00 seconds
Interim result: 445.87 MBytes/s over 1.00 seconds
Interim result: 445.56 MBytes/s over 1.00 seconds
Interim result: 446.26 MBytes/s over 1.00 seconds
Interim result: 445.56 MBytes/s over 1.00 seconds
Interim result: 445.89 MBytes/s over 1.00 seconds
Interim result: 445.73 MBytes/s over 1.00 seconds
Interim result: 445.56 MBytes/s over 1.00 seconds
Interim result: 445.98 MBytes/s over 1.00 seconds
Interim result: 446.15 MBytes/s over 1.00 seconds
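To put a number on the variation in the four runs above, the interim results can be reduced to min / max / mean. Here is a minimal Python sketch that parses the "Interim result: N MBytes/s" text printed by netperf with -D and -fM. The function name summarize and the spread_pct metric (max relative to min, in percent) are my own illustrative choices, not part of netperf.

```python
import re

# Tolerates arbitrary whitespace (including line breaks) between tokens,
# so it also works on output that a mail client has re-wrapped.
INTERIM = re.compile(r"Interim\s+result:\s+([\d.]+)\s+MBytes/s")


def summarize(netperf_output):
    """Return count, min, max, mean and relative spread of the interim rates."""
    rates = [float(x) for x in INTERIM.findall(netperf_output)]
    if not rates:
        raise ValueError("no interim results found")
    lo, hi = min(rates), max(rates)
    return {
        "n": len(rates),
        "min": lo,
        "max": hi,
        "mean": sum(rates) / len(rates),
        # How far the best interval is above the worst one, in percent.
        "spread_pct": 100.0 * (hi - lo) / lo,
    }
```

Running it separately on each of the four logs gives one summary per HCA / msi_x combination, which makes the runs easier to compare than scanning the raw output.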