[openib-general] ia64 perf and FMR

Roland Dreier roland at topspin.com
Mon Apr 4 07:06:53 PDT 2005


    Grant> Before FMR support was committed, netperf was reporting
    Grant> around 1720 Mb/s (215 MB/s) for IPoIB with msi_x=1 and
    Grant> netserver pinned to the CPU that wasn't taking
    Grant> interrupts. After FMR was committed, netperf is reporting
    Grant> about 3500 Mb/s (437 MB/s) for IPoIB. CPU was saturated on
    Grant> the send side in all cases.

    Grant> I've a vague idea what "Fast Memory Registration" is but
    Grant> not a good understanding.  Can someone point me at a decent
    Grant> explanation of FMR?

A memory region (MR) is a memory translation mapping in the HCA's
context.  Usually, we create MRs via a firmware command, which is
prohibitively expensive to do in the data path.  However, it is
possible for the driver to write directly into the HCA's context,
bypassing the firmware.  This is very cheap, just some posted writes,
and so we can do it in the data path.  For example, for AIO, SDP uses
this to map a bunch of random userspace pages into something virtually
contiguous in the HCA's memory map, so that it can be used for RDMA.
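
To make the mapping idea concrete, here is a toy userspace model in C. It is only a sketch under stated assumptions: struct toy_fmr and the toy_fmr_* functions are invented for illustration and are not the kernel FMR interface or anything the mthca driver does. A plain array write stands in for the posted writes into the HCA's context, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                       /* assumed 4 KB pages */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)
#define TOY_MAX_PAGES  8

/* Hypothetical stand-in for one mapping held in the HCA's context. */
struct toy_fmr {
    uint64_t iova;                     /* start of the contiguous range the HCA sees */
    uint64_t page_addr[TOY_MAX_PAGES]; /* scattered page addresses, in mapping order */
    size_t   npages;                   /* 0 means "nothing mapped" */
};

/* Map scattered pages at iova. Only table writes, no firmware-style command. */
static int toy_fmr_map(struct toy_fmr *fmr, const uint64_t *pages,
                       size_t npages, uint64_t iova)
{
    size_t i;

    if (npages == 0 || npages > TOY_MAX_PAGES)
        return -1;
    if (iova & (TOY_PAGE_SIZE - 1))
        return -1;
    for (i = 0; i < npages; i++)
        if (pages[i] & (TOY_PAGE_SIZE - 1))
            return -1;                 /* every page must be page aligned */

    for (i = 0; i < npages; i++)
        fmr->page_addr[i] = pages[i];
    fmr->iova = iova;
    fmr->npages = npages;
    return 0;
}

/* Translate an address in the contiguous range back to the page behind it. */
static int toy_fmr_translate(const struct toy_fmr *fmr, uint64_t addr,
                             uint64_t *out)
{
    uint64_t off, idx;

    if (addr < fmr->iova)
        return -1;
    off = addr - fmr->iova;
    idx = off >> TOY_PAGE_SHIFT;
    if (idx >= fmr->npages)
        return -1;
    *out = fmr->page_addr[idx] + (off & (TOY_PAGE_SIZE - 1));
    return 0;
}

/* Unmap is equally cheap in this model: invalidate the table. */
static void toy_fmr_unmap(struct toy_fmr *fmr)
{
    fmr->npages = 0;
}
```

With pages at 0x7000, 0x2000 and 0x9000 mapped at 0x100000 in this model, an access to 0x101010 lands in the second page at 0x2010, even though the three pages are nowhere near each other.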

However, this shouldn't affect IPoIB in the least, since a) it doesn't
do any dynamic memory registration and b) it doesn't call any FMR
functions anyway.

    Grant> I'd like to understand the 2X in performance.  Maybe we are
    Grant> doing 1/2 as much DMA mapping in one of the bug fixes?

    Grant> And I'm suspicious of the IPoIB numbers since SDP is also
    Grant> seeing a bit over 3500 Mb/s and sending CPU is also
    Grant> saturated. I was hoping SDP would be 40-60% faster than TCP
    Grant> (ipoib). Maybe I'm just not configuring libsdp.conf
    Grant> correctly for netperf and maybe the IPoIB numbers are
    Grant> correct.  I've "rmmod ib_sdp" on both boxes, unloaded and
    Grant> reloaded all the other IB drivers, and "unset LD_PRELOAD".
    Grant> Is unloading ib_sdp sufficient to be sure SDP isn't used?

This is really odd.  I don't see how FMRs could directly change IPoIB
performance, since IPoIB isn't using FMRs, even indirectly.  If SDP is
not loaded, then I don't see how it could be used, but the fact that
you get the same number for SDP and IPoIB makes me suspect that the
IPoIB number is really an SDP number.

    Grant> I also reviewed all the "__attribute__ ((packed))" uses in
    Grant> include/ib_mad.h and include/ib_smi.h. It looks safe to me
    Grant> to remove them since every field is "naturally" aligned
    Grant> from the start of its respective structure. I also checked
    Grant> nested cases. However, while it worked fine, removing all
    Grant> use from the two files didn't matter for netperf
    Grant> TCP_STREAM.

Yeah, none of that code is in the data path, so I wouldn't expect it
to make a difference one way or another.

The one that might make a difference is struct mthca_eqe in
mthca_eq.c.  Unfortunately, simply removing the packed attribute will
break things on 64-bit archs unless the structure is written slightly
differently.  It shouldn't be that difficult, so I should have
something for you to test in a day or two.
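
As an illustration of how dropping packed can change a layout, here is a small C sketch. The demo_* structs and their fields are hypothetical and are not the real struct mthca_eqe, and the split version is just one possible way to write such a structure differently, not necessarily what the driver will end up using. It assumes a compiler with GCC-style attributes and C11 _Static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 64-bit field at byte offset 4. */
struct demo_packed {
    uint8_t  type;
    uint8_t  reserved[3];
    uint64_t addr;           /* offset 4 only because of packed */
} __attribute__((packed));

/* Same fields without packed: on typical 64-bit ABIs the compiler
 * pads before addr so that it starts on an 8-byte boundary. */
struct demo_natural {
    uint8_t  type;
    uint8_t  reserved[3];
    uint64_t addr;
};

/* Written differently: the 64-bit value is held as two 32-bit words
 * in memory order, so no padding is needed and packed can go. */
struct demo_split {
    uint8_t  type;
    uint8_t  reserved[3];
    uint32_t addr[2];
};

_Static_assert(offsetof(struct demo_packed, addr) == 4, "packed offset");
_Static_assert(sizeof(struct demo_packed) == 12, "packed size");
_Static_assert(offsetof(struct demo_split, addr) == 4, "split offset");
_Static_assert(sizeof(struct demo_split) == sizeof(struct demo_packed),
               "split layout matches packed layout");

/* Reassemble the 64-bit value from the two words without an
 * unaligned 64-bit load. */
static uint64_t demo_split_addr(const struct demo_split *e)
{
    uint64_t v;

    memcpy(&v, e->addr, sizeof(v));
    return v;
}
```

The static assertions hold regardless of word size, while offsetof(struct demo_natural, addr) depends on the ABI, which is the kind of difference that breaks a layout the hardware defines.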

 - R.


