[openfabrics-ewg] MVAPICH on PCI-X fails with [0] Abort: Couldn't modify SRQ limit
Sayantan Sur
surs at cse.ohio-state.edu
Fri May 5 14:33:52 PDT 2006
Hello,
> I believe we would need to get MVAPICH changed so that when it compiles for
> PCI_X it compiles it with SRQ support and the mvapich.make script needs to be
> fixed so that it correctly identifies the IB card as PCI_X.
I upgraded the firmware on the PCI-X cards (MT23108) to the 3.4 version.
With this upgrade, MVAPICH can run using the SRQ limit event. However,
on modifying a Gen2 level performance test (perftest/send_lat.c) to use
SRQ (instead of send/receive) I saw degraded performance of SRQ on PCI-X
cards. Here is the data I collected:
# Msg     SRQ      SR
     2   10.65    6.13
     4   10.68    6.11
     8   10.69    6.13
    16   10.76    6.19
    32   10.72    6.22
    64   10.83    6.36
   128   11.01    6.55
   256   11.50    7.06
   512   12.48    7.96
  1024   13.76    9.27
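One way to read the numbers above is to compare the two columns row by row. The short Python sketch below does only that arithmetic on the values copied from the table; the helper name gaps_and_ratios is illustrative and not part of perftest or MVAPICH, and the units are whatever the modified send_lat.c reported (they are not stated here).

```python
# Values copied from the table above: (message size, SRQ, SR).
rows = [
    (2, 10.65, 6.13),
    (4, 10.68, 6.11),
    (8, 10.69, 6.13),
    (16, 10.76, 6.19),
    (32, 10.72, 6.22),
    (64, 10.83, 6.36),
    (128, 11.01, 6.55),
    (256, 11.50, 7.06),
    (512, 12.48, 7.96),
    (1024, 13.76, 9.27),
]


def gaps_and_ratios(rows):
    """Return (size, SRQ - SR, SRQ / SR) per row, rounded to 2 decimals."""
    return [
        (size, round(srq - sr, 2), round(srq / sr, 2))
        for size, srq, sr in rows
    ]


results = gaps_and_ratios(rows)

# Print one line per message size: absolute gap and relative slowdown.
for size, gap, ratio in results:
    print(f"{size:>5}  gap={gap:.2f}  ratio={ratio:.2f}")
```

Across all ten message sizes the gap stays between 4.44 and 4.57, so in this data the SRQ penalty looks like a roughly fixed per-message cost rather than one that grows with message size.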
Given this degraded small-message performance, I am not sure that running
MVAPICH with SRQ on PCI-X based clusters is the best choice. The current
configuration for PCI-X clusters uses the adaptive RDMA path, which is
already quite scalable.
Thanks,
Sayantan.
--
http://www.cse.ohio-state.edu/~surs