[ofa-general] debian/ofed-1.4 - mpi global communication performance

Markus Uhlmann mossy.boulders at gmail.com
Fri Feb 6 04:34:10 PST 2009


Hi all,

we have been struggling with the performance of a Supermicro
(quad-core Xeon) / QLogic (9024-FC) system running Debian, kernel
2.6.24-x86_64, and OFED-1.4 (from http://www.openfabrics.org/).
There are 8 nodes attached to the switch.

What happens is that the performance of MPI global communication is
extremely low (roughly a factor of 10, when 16 processes on only
2 nodes communicate). This figure comes from a comparison with a
*similar* system (Dell/Cisco).

Some tests which we have performed:

* local memory bandwidth test (the "stream" benchmark on an 8-way
  node returns >8 GB/s)

* firmware: since the HCAs are on-board Supermicro (board_id:
  SM_2001000001; firmware-version: 1.2.0), I don't know how or where
  to check whether it is up to date (a small verbs-based check is
  sketched after this list)

* OpenIB low-level communication tests seem okay (see output from
  ib_write_lat and ib_write_bw below)

* However, I see errors of type "RcvSwRelayErrors" when running
  "ibcheckerrors". Is this normal?

* MPI benchmarks reveal slow all-to-all communication (see output
  below for the "osu_alltoall" test,
  https://mvapich.cse.ohio-state.edu/svn/mpi-benchmarks/branches/OMB-3.1/osu_alltoall.c,
  compiled with openmpi-1.3 and the Intel 11.0 compiler); a minimal
  reproducer is sketched after this list.
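
For the firmware item above, here is a minimal sketch of reading the
firmware string through libibverbs (it is the same field that
"ibv_devinfo" reports, so the tool alone may be enough); this is an
illustration, not part of our test setup:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Print the firmware version of every HCA found, via
 * ibv_query_device() -- the same string "ibv_devinfo" shows. */
int main(void)
{
    struct ibv_device **list;
    int num, i;

    list = ibv_get_device_list(&num);
    if (!list || num == 0) {
        fprintf(stderr, "no IB devices found\n");
        return 1;
    }
    for (i = 0; i < num; ++i) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        struct ibv_device_attr attr;

        if (ctx && !ibv_query_device(ctx, &attr))
            printf("%s: firmware %s\n",
                   ibv_get_device_name(list[i]), attr.fw_ver);
        if (ctx)
            ibv_close_device(ctx);
    }
    ibv_free_device_list(list);
    return 0;
}

Compile with e.g. "gcc fw.c -libverbs". The board_id string itself
can also be read from /sys/class/infiniband/*/board_id.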
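
And the minimal reproducer mentioned in the last item: a bare
MPI_Alltoall timing loop, loosely modelled on what osu_alltoall
measures (the iteration count and message sizes here are arbitrary
choices, not the OSU defaults):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Time MPI_Alltoall over a range of message sizes and report the
 * average per-call latency in microseconds on rank 0. */
int main(int argc, char **argv)
{
    int rank, nprocs, i, iters = 100;
    size_t size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (size = 1; size <= (1 << 20); size *= 2) {
        /* each rank sends "size" bytes to every other rank */
        char *sendbuf = malloc(size * nprocs);
        char *recvbuf = malloc(size * nprocs);
        double t0, t;

        memset(sendbuf, 1, size * nprocs);
        MPI_Barrier(MPI_COMM_WORLD);

        t0 = MPI_Wtime();
        for (i = 0; i < iters; ++i)
            MPI_Alltoall(sendbuf, (int)size, MPI_CHAR,
                         recvbuf, (int)size, MPI_CHAR, MPI_COMM_WORLD);
        t = (MPI_Wtime() - t0) / iters * 1e6;

        if (rank == 0)
            printf("%8lu %12.2f\n", (unsigned long)size, t);

        free(sendbuf);
        free(recvbuf);
    }
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and run with 16 ranks on 2 nodes, this should
show whether the slowdown appears without the OSU harness.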


Some questions I have:

1) Do I have to configure the switch?
   So far I have not attempted to install the "OFED+" etc. software
   that came with the QLogic hardware. Is there any chance that it
   would be compatible with OFED-1.4? Or is it even installable under
   Debian (without too much tweaking)?

2) Is it okay for this system to run "opensm" on one of the nodes?
   NOTE: the version is "OpenSM 3.2.5_20081207"
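
On question 2: one way to check that the opensm instance is actually
managing the fabric is to verify on each node that the port is ACTIVE
and has a nonzero SM LID. A minimal sketch, assuming the first device
and port 1:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Query port 1 of the first HCA: state should be ACTIVE and sm_lid
 * nonzero once a subnet manager has configured the fabric.  Width
 * and speed are the raw verbs encodings (width 2 = 4X, speed 2 =
 * DDR). */
int main(void)
{
    struct ibv_device **list;
    struct ibv_context *ctx;
    struct ibv_port_attr port;
    int num;

    list = ibv_get_device_list(&num);
    if (!list || num == 0) {
        fprintf(stderr, "no IB devices found\n");
        return 1;
    }
    ctx = ibv_open_device(list[0]);
    if (ctx && !ibv_query_port(ctx, 1, &port))
        printf("state=%s sm_lid=%u width=%u speed=%u\n",
               port.state == IBV_PORT_ACTIVE ? "ACTIVE" : "not active",
               (unsigned)port.sm_lid, (unsigned)port.active_width,
               (unsigned)port.active_speed);
    if (ctx)
        ibv_close_device(ctx);
    ibv_free_device_list(list);
    return 0;
}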

Any other leads or things I should test?

Thanks in advance,

MU

==============================================================
------------------------------------------------------------------
                    RDMA_Write Latency Test
Inline data is used up to 400 bytes message
Connection type : RC
Mtu : 2048
------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
      2        1000           3.10          22.88             3.15
      4        1000           3.13           6.29             3.16
      8        1000           3.14           6.24             3.18
     16        1000           3.17           6.25             3.21
     32        1000           3.25           7.60             3.38
     64        1000           3.32           6.43             3.45
    128        1000           3.48           6.40             3.57
    256        1000           3.77           6.63             3.82
    512        1000           4.71           8.44             4.76
   1024        1000           5.58           7.53             5.63
   2048        1000           7.38           8.17             7.51
   4096        1000           8.64           9.04             8.77
   8192        1000          11.41          11.81            11.57
  16384        1000          16.55          17.27            16.71
  32768        1000          26.81          28.12            27.01
  65536        1000          47.41          49.43            47.62
 131072        1000          89.86          91.98            90.81
 262144        1000         174.25         176.34           175.35
 524288        1000         343.03         344.79           343.51
1048576        1000         679.04         680.57           679.72
2097152        1000        1350.88        1352.80          1351.75
4194304        1000        2693.31        2696.13          2694.50
8388608        1000        5380.45        5383.29          5381.62
------------------------------------------------------------------
------------------------------------------------------------------
                    RDMA_Write BW Test
Number of qp's running 1
Connection type : RC
Each Qp will post up to 100 messages each time
Mtu : 2048
------------------------------------------------------------------
 #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
      2        5000               2.51                  2.51
      4        5000               5.03                  5.03
      8        5000              10.09                 10.09
     16        5000              19.71                 19.70
     32        5000              39.23                 39.22
     64        5000              77.91                 77.84
    128        5000             146.67                146.53
    256        5000             223.14                222.82
    512        5000             640.09                639.80
   1024        5000            1106.72               1106.22
   2048        5000            1271.61               1270.87
   4096        5000            1379.58               1379.44
   8192        5000            1446.01               1445.95
  16384        5000            1477.11               1477.09
  32768        5000            1498.18               1498.17
  65536        5000            1507.23               1507.22
 131072        5000            1511.83               1511.82
 262144        5000            1487.64               1487.62
 524288        5000            1485.76               1485.75
1048576        5000            1487.13               1486.54
2097152        5000            1487.95               1487.95
4194304        5000            1488.11               1488.10
8388608        5000            1488.22               1488.22
------------------------------------------------------------------
***************OUR-SYSTEM /supermicro-qlogic:********************
# OSU MPI All-to-All Personalized Exchange Latency Test v3.1.1
# Size            Latency (us)
1                         7.87
2                         7.80
4                         7.77
8                         7.78
16                        7.81
32                        9.00
64                        9.00
128                      10.15
256                      11.75
512                      15.55
1024                     23.54
2048                     40.57
4096                    107.12
8192                    187.28
16384                   343.61
32768                   602.17
65536                  1135.20
131072                 3086.28
262144                 9086.50
524288                18713.30
1048576               37378.61
------------------------------------------------------------------
**************REFERENCE_SYSTEM / dell-cisco:***********************
# OSU MPI All-to-All Personalized Exchange Latency Test v3.1.1
# Size            Latency (us)
1                        16.14
2                        15.93
4                        16.25
8                        16.60
16                       25.83
32                       28.66
64                       33.57
128                      40.94
256                      56.20
512                      91.24
1024                    156.13
2048                    373.17
4096                    696.95
8192                   1464.89
16384                  1367.96
32768                  2499.21
65536                  5686.46
131072                11065.98
262144                23922.69
524288                49294.71
1048576              101290.67
==============================================================