[ofa-general] Performance penalty of OFED 1.1 versus IBGD 1.8.2
Sayantan Sur
surs at cse.ohio-state.edu
Wed Feb 28 08:40:40 PST 2007
Hi Roland,
* On Feb 2, Pavel Shamis (Pasha) <pasha at dev.mellanox.co.il> wrote :
> Roland Fehrenbacher wrote:
> > >>>>> "Pavel" == Pavel Shamis (Pasha) <pasha at dev.mellanox.co.il> writes:
> >
> > Pavel> Hi Roland,
> >
> > >> >> I'm migrating from IBGD 1.8.2 (kernel 2.6.15.7) to OFED 1.1,
> > >> >> and saw some unpleasant performance drops when using OFED 1.1
> > >> >> (kernel 2.6.20.1 with included IB drivers). The main drop is
> > >> >> in throughput as measured by the OSU MPI bandwidth benchmark.
> > >> >> However, the latency for large packet sizes is also worse
> > >> >> (see results below). I tried with and without "options
> > >> >> ib_mthca msi_x=1" (using IBGD, disabling msi_x makes a
> > >> >> significant performance difference of approx. 10%). The IB
> > >> >> card is a Mellanox MHGS18-XT (PCIe/DDR Firmware 1.2.0)
> > >> >> running on an Opteron with nForce4 2200 Professional chipset.
> > >> >>
> > >> >> Does anybody have an explanation or even better a solution
> > >> >> to this issue?
> >
> > Pavel> Please try to add the following mvapich parameter:
> > Pavel> VIADEV_DEFAULT_MTU=MTU2048
> >
> > >> Thanks for the suggestion. Unfortunately, it didn't improve the
> > >> simple bandwidth results. Bi-directional bandwidth increased by
> > >> 3% though. Any more ideas?
> >
> > Pavel> 3% is a good start :-) Please also try to add this one:
> > Pavel> VIADEV_MAX_RDMA_SIZE=4194304
> >
> > This brought another 2% in bi-directional bandwidth, but still nothing
> > in uni-directional bandwidth.
> >
> > mvapich version is 0.9.8
> 0.9.8 was not distributed (and tested) with OFED 1.1 :-(
> Please try to use the mvapich package distributed with OFED 1.1.
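For reference, the VIADEV_* tunables suggested above are environment
variables; with MVAPICH 0.9.x they can be passed on the mpirun_rsh
command line. A minimal sketch, with the node names and the path to
osu_bw as placeholders for your setup:

    # run the OSU bandwidth test across two nodes with both tunables set
    mpirun_rsh -np 2 node1 node2 \
        VIADEV_DEFAULT_MTU=MTU2048 \
        VIADEV_MAX_RDMA_SIZE=4194304 \
        ./osu_bw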
MVAPICH-0.9.8 was tested by the MVAPICH team on OFED 1.1. It is being
used on several production clusters with OFED 1.1.
I ran the bandwidth test on our Opteron nodes (AMD Opteron 254, 2.8
GHz) with Mellanox dual-port DDR cards. I see a peak bandwidth of
1402 MillionBytes/sec as reported by the OSU bandwidth test. On the same
machines, I ran ib_rdma_bw (from the perftest module of OFED-1.1), which
reports performance numbers at the lower, verbs (Gen2) level. The peak
bandwidth reported by ib_rdma_bw is 1338.09 MegaBytes/sec (1338.09 * 1.048
= 1402 MillionBytes/sec). So the lower-level numbers match up with what is
reported by MPI.
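As an aside, ib_rdma_bw reports MegaBytes/sec (2^20 bytes/sec) while the
OSU test reports MillionBytes/sec (10^6 bytes/sec); a quick way to check
the conversion from a shell, assuming bc is installed:

    # 1 MegaByte/sec (2^20 bytes/sec) = 1.048576 MillionBytes/sec (10^6 bytes/sec)
    echo "1338.09 * 1.048576" | bc -l   # prints ~1403, matching the ~1402 from osu_bw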
I'm wondering what your lower-level ib_rdma_bw numbers look like. Do
they match up with what the OSU bandwidth test reports? If they do, then
the issue likely lies somewhere other than in MPI.
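In case it helps, the usual way to run ib_rdma_bw is server/client style
across the same pair of nodes used for the MPI test ("node1" below is a
placeholder hostname):

    # on the first node (acts as the server):
    ib_rdma_bw
    # on the second node, pointing at the first:
    ib_rdma_bw node1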
We also have an MVAPICH-0.9.9 beta version out; you could give that a try
too, if you want. We will be making the full release soon.
Thanks,
Sayantan.
--
http://www.cse.ohio-state.edu/~surs