[ofa-general] Multicast Performance
Marcel Heinz
marcel.heinz at informatik.tu-chemnitz.de
Fri May 23 08:26:41 PDT 2008
Hi,
I have ported an application to use InfiniBand multicast directly via
libibverbs. I am seeing very low multicast throughput, only
~250 MByte/s, although we are using 4x DDR components. To rule out any
effects of the application, I've created a small benchmark (well, it's
only a hack). It just tries to keep the send/recv queues filled with work
requests and polls the CQ in an endless loop. In server mode, it joins
to/creates the multicast group as a FullMember, attaches the QP to the
group and receives any packets. The client joins as a SendOnlyNonMember
and sends datagrams of full MTU size to the group.
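For reference, the relevant verbs calls look roughly like this (a minimal
sketch, not the actual benchmark code: resource setup, the SA join and all
error handling are omitted, and the helper names and the Q_Key value are
made up):

#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

#define MCAST_QKEY  0x11111111   /* assumed Q_Key of the group (made up)         */
#define MCAST_QPN   0xffffff     /* IB multicast packets target QPN 0xffffff     */

/* Server side: bind the UD QP to the group.  The SA join (FullMember /
 * SendOnlyNonMember) must already have been done, e.g. via libibumad or
 * librdmacm; it is not shown here. */
static int attach_to_group(struct ibv_qp *qp,
                           const union ibv_gid *mgid, uint16_t mlid)
{
    return ibv_attach_mcast(qp, mgid, mlid);
}

/* Server side: keep the RQ filled.  UD receive buffers must leave room for
 * the 40-byte GRH that precedes every multicast payload. */
static int post_mcast_recv(struct ibv_qp *qp, struct ibv_mr *mr,
                           void *buf, uint32_t len, uint64_t wr_id)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t) buf,
        .length = len,                 /* MTU + 40 bytes for the GRH */
        .lkey   = mr->lkey,
    };
    struct ibv_recv_wr wr = { .wr_id = wr_id, .sg_list = &sge, .num_sge = 1 };
    struct ibv_recv_wr *bad;

    return ibv_post_recv(qp, &wr, &bad);
}

/* Client side: an address handle pointing at the group's MLID/MGID ... */
static struct ibv_ah *create_mcast_ah(struct ibv_pd *pd, uint8_t port,
                                      uint16_t mlid, const union ibv_gid *mgid)
{
    struct ibv_ah_attr attr;

    memset(&attr, 0, sizeof attr);
    attr.dlid      = mlid;
    attr.port_num  = port;
    attr.is_global = 1;                /* multicast always carries a GRH */
    attr.grh.dgid  = *mgid;
    return ibv_create_ah(pd, &attr);
}

/* ... and full-MTU datagrams posted against that AH, with the multicast QPN
 * and the group's Q_Key in the UD part of the work request. */
static int post_mcast_send(struct ibv_qp *qp, struct ibv_ah *ah,
                           struct ibv_mr *mr, void *buf, uint32_t len,
                           uint64_t wr_id)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t) buf,
        .length = len,                 /* one full-MTU payload */
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .wr_id      = wr_id,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_SEND,
        .send_flags = IBV_SEND_SIGNALED,
    };
    struct ibv_send_wr *bad;

    wr.wr.ud.ah          = ah;
    wr.wr.ud.remote_qpn  = MCAST_QPN;
    wr.wr.ud.remote_qkey = MCAST_QKEY;
    return ibv_post_send(qp, &wr, &bad);
}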
The test setup is as follows:
Host A <---> Switch <---> Host B
We use Mellanox InfiniHost III Lx HCAs (MT25204), a Flextronics
F-X430046 24-port switch, OFED 1.3, and a "vanilla" 2.6.23.9 Linux kernel.
The results are:
Host A          Host B       Throughput (MByte/s)
client          server         262
client          2x server      146
client+server   server         944
client+server   ---            946

As a reference, unicast ib_send_bw (in UD mode): 1146
I don't see any reason why it should become _faster_ when I additionally
start a server on the same host as the client. OTOH, the 944 MByte/s
sounds relatively sane compared to the unicast performance, given the
additional overhead of having to copy the data locally.
These ~260 MByte/s are relatively close to the 2 GBit/s effective throughput
of a 1x SDR connection. However, the created group has rate 6 (20 GBit/s),
and /sys/class/infiniband/mthca0/ports/1/rate showed 20 Gb/sec
during the whole test.
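As a rough sanity check of those numbers (assuming the usual 8b/10b line
encoding, i.e. 80% of the signalling rate is usable as data rate):

\[
\begin{aligned}
\text{1x SDR:} &\quad 2.5\ \text{GBit/s} \times \tfrac{8}{10} = 2\ \text{GBit/s} = 250\ \text{MByte/s} \\
\text{4x DDR:} &\quad 20\ \text{GBit/s} \times \tfrac{8}{10} = 16\ \text{GBit/s} = 2000\ \text{MByte/s}
\end{aligned}
\]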
The error counters of all ports show nothing abnormal. Only the
RcvSwRelayErrors counter of the switch port facing the host running the
client is increasing very fast, but this seems to be normal for
multicast traffic, as the switch does not relay those packets back to
their source.
We were also able to test on another cluster with 6 nodes (also with MT25204
HCAs; I don't know the OFED version or the switch type) and got the following
results:
(1s = one server instance, 1c = one client instance on that host)

Host1   Host2   Host3   Host4   Host5   Host6   Throughput (MByte/s)
1s      1s      1c                              255.15
1s      1s      1s      1c                      255.22
1s      1s      1s      1s      1c              255.22
1s      1s      1s      1s      1s      1c      255.22
1s1c    1s      1s                              738.64
1s1c    1s      1s      1s                      695.08
1s1c    1s      1s      1s      1s              565.14
1s1c    1s      1s      1s      1s      1s      451.90
As long as no host runs both the client and a server, it at least
behaves like multicast: the throughput is independent of the number of
receivers. When the client and a server run on the same host, performance
decreases as the number of servers increases, which is totally
surprising to me.
Another test I did was to run an ib_send_bw (UD) benchmark while the
multicast benchmark was running between A and B. I got ~260 MByte/s for
the multicast and also ~260 MByte/s for ib_send_bw.
Does anyone have an idea of what is going on here, or a hint as to what I
should check?
Regards,
Marcel