[ofa-general] Expected RDMA performance
Michael Krause
krause at cup.hp.com
Mon Oct 22 12:57:59 PDT 2007
At 02:06 AM 10/22/2007, Koen Segers wrote:
>On Fri, 2007-10-19 at 09:09 -0700, Michael Krause wrote:
> > At 08:20 AM 10/19/2007, Peter Kjellstrom wrote:
> > > On Thursday 18 October 2007, Chuck Hartley wrote:
> > > ...
> > > > 8388608 5000 1342.12 1342.12
> > > > ------------------------------------------------------------------
> > > >
> > > > Is this typical RDMA performance?
> > >
> > > It's close to what I've seen on similar hw. ~1400 is what you can
> > > push through the 8x pci-e of the intel 5000 chipset (confirmed by
> > > trying 4x pci-e which has shown ~700).
> > >
> > > > What is the maximum theoretical BW for
> > > > DDR IB - 1525MB/sec?
> > >
> > > No, it's 20 Gbps on the wire and 8/10 encoded so 16 Gbps effective
> > > which is 2000 MB/s (10-base) and 1907 MiB/s (2-base).
> >
> > There is also IB protocol overhead combined with driver / device
> > control traffic overhead (consumes device as well as PCI resources /
> > bandwidth), end-to-end control traffic which is also a function of
> > how the application is constructed. In general, hitting about 80-85%
> > of the theoretical maximum is possible.
>
>
>I'm very interested in this result. Can you elaborate this a bit more?
In what regard?
>Has anyone documented the ib traffic control mechanism?
- Driver-to-device interactions consume resources and contend for local
  I/O bandwidth and local device processing.
- Application-to-device interactions have similar impacts.
- ULP exchanges, such as SEND operations used to communicate protection
  keys, addresses, etc.
- Host OS / application execution to generate work, schedule and process
  work, and deal with any interrupt or polling mechanisms. This can add
  anywhere from zero to significant delay, resulting in burst-style
  traffic patterns. Also, many workloads are dominated by small
  transactions, so their efficiency will be significantly lower than
  that of a workload dominated by large transactions.
- And so forth.
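
To illustrate the small versus large transaction point, here is a toy
model, not a description of any real HCA. It assumes each operation pays
a fixed cost before its data moves at the peak rate, with no pipelining
or overlap between operations. The function name and the 2 microsecond
overhead are made up for illustration; only the 2000 MB/s peak comes from
the figures quoted in this thread.

```python
def effective_bandwidth(msg_bytes, peak_bytes_per_s, per_op_overhead_s):
    """Throughput when every operation pays a fixed cost plus wire time.

    Simplifying assumption: operations are strictly serialized, so the
    fixed cost is never hidden behind other transfers.
    """
    wire_time = msg_bytes / peak_bytes_per_s
    return msg_bytes / (per_op_overhead_s + wire_time)


PEAK = 2000e6     # bytes/s, DDR 4x data rate quoted in this thread
OVERHEAD = 2e-6   # seconds per operation, hypothetical value

for size in (256, 4096, 65536, 8388608):
    bw = effective_bandwidth(size, PEAK, OVERHEAD)
    print(f"{size:>8} bytes: {bw / 1e6:8.1f} MB/s ({bw / PEAK:6.1%} of peak)")
```

Under these assumptions the small transfers spend most of their time in
the fixed cost, while the largest transfer approaches the peak. Real
devices overlap work, so treat the shape of the curve, not the numbers,
as the takeaway.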
There are many variables, and mileage will vary as a result. Some
workloads will do quite well while others will not. This is why a good
range of benchmarks is required to evaluate whether a given solution is
reasonable for the targeted workloads or problem space. Anyone can
contrive something that does outstandingly at one thing while completely
biting it in other areas.
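
For reference, the arithmetic behind the figures quoted earlier in this
thread can be sketched as follows. The 20 Gbps signalling rate, the 8/10
encoding, the 80-85% range, and the measured 1342.12 all come from the
thread. Whether the benchmark reports 10-base MB/s or 2-base MiB/s is an
assumption left open here, so the measured value is compared against
both.

```python
SIGNAL_RATE_BPS = 20e9   # DDR 4x signalling rate, bits/s (from the thread)
ENCODING = 8 / 10        # 8 data bits carried per 10 bits on the wire
MEASURED = 1342.12       # benchmark result from the thread; unit uncertain

# Data rate in bytes per second after removing the encoding overhead.
data_rate = SIGNAL_RATE_BPS * ENCODING / 8

peak_mb = data_rate / 1e6       # 10-base megabytes per second
peak_mib = data_rate / 2**20    # 2-base mebibytes per second

print(f"theoretical peak: {peak_mb:.0f} MB/s = {peak_mib:.0f} MiB/s")
print(f"80-85% of peak:   {0.80 * peak_mb:.0f}-{0.85 * peak_mb:.0f} MB/s")
print(f"measured / peak if MB/s:  {MEASURED / peak_mb:.1%}")
print(f"measured / peak if MiB/s: {MEASURED / peak_mib:.1%}")
```

Either way the measured figure lands below the 80-85% range, which fits
the earlier observation in the thread that the host's 8x pci-e slot tops
out around 1400.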
Mike
>Regards,
>
>Koen Segers
> >
> > > On our system (with a different HCA) we see quite a difference with
> > > snoop-filter off (bios option). With snoop off (our) application
> > > performance goes up (not very surprising) but IB performance goes
> > > down (latency 0.4us worse and bw ~1400->1200).
> >
> > Mike