[openib-general] Re: ia64 perf and FMR

Michael S. Tsirkin mst at mellanox.co.il
Sun Apr 3 10:35:48 PDT 2005


Quoting r. Grant Grundler <iod00d at hp.com>:
> Subject: ia64 perf and FMR
> 
> Hi,
> Just wanted to share initial perf results (and surprise)
> that I'm getting on the HP ZX1/IA64 boxes.
> 
> Before FMR support was committed, netperf was reporting around
> 1720 Mb/s (215 MB/s) for IPoIB with msi_x=1 and netserver pinned
> to the CPU that wasn't taking interrupts. After FMR was committed,
> netperf is reporting about 3500 Mb/s (437 MB/s) for IPoIB. CPU was
> saturated on the send side in all cases.
> 
> I've a vague idea what "Fast Memory Registration" is but not a good
> understanding.  Can someone point me at a decent explanation of FMR?
> 
> I'd like to understand the 2X in performance.
> Maybe we are doing 1/2 as much DMA mapping in one of
> the bug fixes?
> 
> And I'm suspicious of the IPoIB numbers since SDP is also seeing
> a bit over 3500 Mb/s and sending CPU is also saturated. I was hoping
> SDP would be 40-60% faster than TCP (ipoib). Maybe I'm just not
> configuring libsdp.conf correctly for netperf and maybe the IPoIB
> numbers are correct.  I've "rmmod ib_sdp" on both boxes, unloaded
> and reloaded all the other IB drivers, and "unset LD_PRELOAD".
> Is unloading ib_sdp sufficient to be sure SDP isn't used?
> 
> (I do get "module in use" when netserver is running with LD_PRELOAD
> pointing at libsdp.so)
> 
> 
> I also reviewed all the "__attribute__ ((packed))" uses in
> include/ib_mad.h and include/ib_smi.h. It looks safe to me
> to remove them since every field is "naturally" aligned from
> the start of its respective structure. I also checked
> nested cases. However, while it worked fine, removing all use
> from the two files didn't matter for netperf TCP_STREAM.
> 
> I didn't realize other files also use "packed" and will
> have to revisit the issue. I'm mostly worried some
> new use will not be well aligned and cause the compiler
> to insert padding. That will be a PITA to debug.
> What we need is a compiler warning to tell us when/where
> padding is inserted in a structure with a similar __attribute__.
> 
> Reminder: not pinning the netserver thread to the other CPU
> costs around 25% performance. I think that's true for any single
> threaded networking perf test that saturates the CPU.
> 
> thanks,
> grant
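
On the padding worry in the quoted text: a minimal sketch of one way to
catch inserted padding at build time, assuming a C11 compiler with the
GCC-style packed attribute. The struct names and fields below are
hypothetical and are not copied from ib_mad.h or ib_smi.h. The idea is
to compare the natural layout against a packed twin and fail the build
if they differ. GCC also documents a -Wpadded option that warns when
padding is inserted into a structure, which may be closer to the
warning asked for.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical header: every field is naturally aligned. */
struct example_hdr {
	uint8_t  base_version;
	uint8_t  mgmt_class;
	uint8_t  class_version;
	uint8_t  method;
	uint16_t status;
	uint16_t class_specific;
	uint64_t tid;
};

/* Same fields, packed: the reference size with no padding at all. */
struct example_hdr_packed {
	uint8_t  base_version;
	uint8_t  mgmt_class;
	uint8_t  class_version;
	uint8_t  method;
	uint16_t status;
	uint16_t class_specific;
	uint64_t tid;
} __attribute__((packed));

/* Build fails here if the compiler had to pad the unpacked layout. */
static_assert(sizeof(struct example_hdr) == sizeof(struct example_hdr_packed),
	      "padding inserted in struct example_hdr");
static_assert(offsetof(struct example_hdr, tid) ==
	      offsetof(struct example_hdr_packed, tid),
	      "tid moved by padding in struct example_hdr");

/* Hypothetical badly ordered struct, to show what gets detected. */
struct example_bad {
	uint8_t  flag;
	uint32_t qpn;	/* typically forces 3 bytes of padding before it */
};

struct example_bad_packed {
	uint8_t  flag;
	uint32_t qpn;
} __attribute__((packed));

/* Bytes of padding the compiler added to each unpacked layout. */
static size_t example_hdr_padding(void)
{
	return sizeof(struct example_hdr) - sizeof(struct example_hdr_packed);
}

static size_t example_bad_padding(void)
{
	return sizeof(struct example_bad) - sizeof(struct example_bad_packed);
}
```

Adding the same pair of static_assert lines for struct example_bad would
stop the build on common ABIs, which is the failure one would want to
see before "packed" is removed from a real header.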

Can you try with hide DDR? This will disable FMRs for tavor.


-- 
MST - Michael S. Tsirkin


