[openib-general] Question about the IPoIB bandwidth performance ?
Talpey, Thomas
Thomas.Talpey at netapp.com
Mon Jun 5 10:08:17 PDT 2006
>Who said anything about Ethernet emulation? Hal said he is running
>straight Netperf over IB, not Ethernet emulation. I don't think that any IB
>HCAs today support offloaded checksum and large send. You are comparing
>apples and oranges.
I consider IPoIB to be Ethernet emulation.
As for apples and oranges, my point exactly.
Tom.
At 12:53 PM 6/5/2006, Bernard King-Smith wrote:
>> Thomas Talpey said:
>> At 11:38 AM 6/5/2006, hbchen wrote:
>> >Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
>> >utilization is still very low.
>> >>> IPoIB=420MB/sec
>> >>> bandwidth utilization= 420/1024 = 41.01%
>>
>>
>> Helen, have you measured the CPU utilizations during these runs?
>> Perhaps you are out of CPU.
>>
>> Outrageous opinion follows.
>>
>> Frankly, an IB HCA running Ethernet emulation is approximately the
>> world's worst 10GbE adapter (not to put too fine of a point on it :-) )
>> There is no hardware checksumming, nor large-send offloading, both
>> of which force overhead onto software. And, as you just discovered
>> it isn't even 10Gb!
>>
>> In general, network emulation layers are always going to perform more
>> poorly than native implementations. But this is only a generality learned
>> from years of experience with them.
>>
>> Tom.
>
>Hold on here....
>
>Who said anything about Ethernet emulation? Hal said he is running
>straight Netperf over IB, not Ethernet emulation. I don't think that any IB
>HCAs today support offloaded checksum and large send. You are comparing
>apples and oranges. The only appropriate comparison is to use the IBM HCA
>compared to the mthca adapters. I think Hal's point is actually comparing
>"any" IB adapter against GigE and Myrinet. Both the mthca and IBM HCA's
>should get similar IPoIB performance using identical OpenIB stacks.
>
>
>Bernie King-Smith
>IBM Corporation
>Server Group
>Cluster System Performance
>wombat2 at us.ibm.com (845)433-8483
>Tie. 293-8483 or wombat2 on NOTES
>
>"We are not responsible for the world we are born into, only for the world
>we leave when we die.
>So we have to accept what has gone before us and work to change the only
>thing we can,
>-- The Future." William Shatner
>
>
>
> From: openib-general-request at openib.org
> Sent by: openib-general-bounces at openib.org
> Date: 06/05/2006 12:11 PM
> To: openib-general at openib.org
> Subject: openib-general Digest, Vol 24, Issue 22
> Please respond to: openib-general at openib.org
>
>Today's Topics:
>
> 1. Re: Question about the IPoIB bandwidth performance ? (hbchen)
> 2. Re: [PATCH] osm: trivial missing header files fix (Hal Rosenstock)
> 3. Re: [PATCH] osm: trivial missing cast in osmt_service call for memcmp (Hal Rosenstock)
> 4. Re: Question about the IPoIB bandwidth performance ? (Bernard King-Smith)
> 5. Re: Re: [PATCH]Repost: IPoIB skb panic (Shirley Ma)
> 6. Re: [PATCHv2 1/2] resend: mthca support for max_map_per_fmr device attribute (Roland Dreier)
> 7. Re: Question about the IPoIB bandwidth performance ? (Talpey, Thomas)
> 8. Re: Question about the IPoIB bandwidth performance ? (hbchen)
>
>----- Message from "hbchen" <hbchen at lanl.gov> on Mon, 05 Jun 2006 09:38:24
>-0600 -----
>
> To: "Hal Rosenstock" <halr at voltaire.com>
>
> cc: "OPENIB" <openib-general at openib.org>
>
> Subject: Re: [openib-general] Question about the IPoIB bandwidth
> performance ?
>
>
>Hal Rosenstock wrote:
> On Mon, 2006-06-05 at 11:12, hbchen wrote:
>
> Hi,
> I have a question about the IPoIB bandwidth performance.
> I did netperf testing using Single GiGE, Myrinet D card,
> Myrinet 10G
> ethernet card,
> and Voltaire Infiniband 4X HCA400Ex (PCI-Express interface).
>
>
> NIC (jumbo enabled)                 Line bandwidth (LB)    IP-over-NIC bandwidth  Utilization (IPoNIC/LB)  Notes
> ----------------------------------  ---------------------  ---------------------  -----------------------  -----
> Single Gigabit NIC                  1Gb/sec = 125MB/sec    120MB/sec              96%                      PCI-X interface
> Myrinet D card                      250MB/sec              240-245MB/sec          96%-98%                  PCI-X interface
> Myrinet 10G Ethernet (PCI-Express)  10Gb/sec = 1280MB/sec  980MB/sec              76.6%                    My testing using Linux 2.6.14.6
>                                                            1225MB/sec             95.7%                    Data from Myrinet website
> IB HCA 4X (PCI-Express)             10Gb/sec = 1280MB/sec  420MB/sec              32.8%                    My testing using Linux 2.6.14.6
>                                                            474MB/sec              37%                      Best from OpenIB mailing list (2.6.12-rc5 patch 1)
>
> Why is the bandwidth utilization of IPoIB so low compared to the
> other NICs?
>
>
> One thing to note is that the max utilization of 10G IB (4x) is 8G
> due
> to the signalling being included in this rate (unlike ethernet whose
> rate represents the data rate and does not include the signalling
> overhead).
>
>Hal,
>Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
>utilization is still very low.
>>> IPoIB=420MB/sec
>>> bandwidth utilization= 420/1024 = 41.01%
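The two utilization figures in this exchange (32.8% and 41.01%) differ only in the denominator. A minimal sketch of that arithmetic follows, assuming the thread's own conversion of 10Gb/sec to 1280MB/sec and assuming the signalling overhead Hal mentions is the 8b/10b line encoding, which leaves 8 of every 10 signalled bits for data. The function and constant names are made up for illustration.

```python
# Figures taken from the thread; the names below are illustrative only.
SIGNALLING_MB_S = 1280          # thread's value for the 10Gb/sec IB 4X signalling rate
DATA_MB_S = SIGNALLING_MB_S * 8 // 10   # assumed 8b/10b encoding: 1024 MB/sec of data
MEASURED_IPOIB_MB_S = 420       # netperf result reported for IPoIB


def utilization(measured_mb_s, line_mb_s):
    """Return measured throughput as a percentage of the given line rate."""
    return 100.0 * measured_mb_s / line_mb_s


print(f"vs signalling rate: {utilization(MEASURED_IPOIB_MB_S, SIGNALLING_MB_S):.2f}%")
print(f"vs data rate:       {utilization(MEASURED_IPOIB_MB_S, DATA_MB_S):.2f}%")
```

Either way the result stays far below the 96% reported for the Gigabit NIC (utilization(120, 125)), which is the point being made here.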
>
>
>HB
>
>
>
>
> -- Hal
>
>
> There must be a lot of room to improve the IPoIB software to
> reach 75%+
> bandwidth utilization.
>
>
> HB Chen
> Los Alamos National Lab
> hbchen at labl.gov
>
> _______________________________________________
> openib-general mailing list
> openib-general at openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit
> http://openib.org/mailman/listinfo/openib-general
>
>
>
>
>
>----- Message from "Hal Rosenstock" <halr at voltaire.com> on 05 Jun 2006
>11:34:50 -0400 -----
>
> To: "Eitan Zahavi" <eitan at mellanox.co.il>
>
> cc: "OPENIB" <openib-general at openib.org>
>
> Subject: [openib-general] Re: [PATCH] osm: trivial missing header files
> fix
>
>
>On Mon, 2006-06-05 at 08:51, Eitan Zahavi wrote:
>> Hi Hal
>>
>> Cleaning up compilation warnings, I found there were missing includes in
>> various sources.
>>
>> Eitan
>>
>> Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>
>
>Thanks. Applied to trunk only.
>
>-- Hal
>
>
>
>----- Message from "Hal Rosenstock" <halr at voltaire.com> on 05 Jun 2006
>11:45:28 -0400 -----
>
> To: "Eitan Zahavi" <eitan at mellanox.co.il>
>
> cc: "OPENIB" <openib-general at openib.org>
>
> Subject: [openib-general] Re: [PATCH] osm: trivial missing cast in
> osmt_service call for memcmp
>
>
>Hi Eitan,
>
>On Mon, 2006-06-05 at 08:59, Eitan Zahavi wrote:
>> Hi Hal
>>
>> Last one of my cleaning up compilation warnings I found a missing
>> cast in osmtest service name compare.
>>
>> Eitan
>>
>> Signed-off-by: Eitan Zahavi <eitan at mellanox.co.il>
>
>Thanks. Applied to trunk only.
>
>-- Hal
>
>
>
>----- Message from "Bernard King-Smith" <wombat2 at us.ibm.com> on Mon, 5 Jun
>2006 11:54:42 -0400 -----
>
> To: openib-general at openib.org
>
> Subject: Re: [openib-general] Question about the IPoIB bandwidth
> performance ?
>
>
>Hal Rosenstock wrote:
>
>> On Mon, 2006-06-05 at 11:12, hbchen wrote:
>> > Hi,
>> > I have a question about the IPoIB bandwidth performance.
>> > I did netperf testing using Single GiGE, Myrinet D card, Myrinet 10G
>> > ethernet card,
>> > and Voltaire Infiniband 4X HCA400Ex (PCI-Express interface).
>> >
>> >
>> > NIC (jumbo enabled)                 Line bandwidth (LB)    IP-over-NIC bandwidth  Utilization (IPoNIC/LB)  Notes
>> > ----------------------------------  ---------------------  ---------------------  -----------------------  -----
>> > Single Gigabit NIC                  1Gb/sec = 125MB/sec    120MB/sec              96%                      PCI-X interface
>> > Myrinet D card                      250MB/sec              240-245MB/sec          96%-98%                  PCI-X interface
>> > Myrinet 10G Ethernet (PCI-Express)  10Gb/sec = 1280MB/sec  980MB/sec              76.6%                    My testing using Linux 2.6.14.6
>> >                                                            1225MB/sec             95.7%                    Data from Myrinet website
>> > IB HCA 4X (PCI-Express)             10Gb/sec = 1280MB/sec  420MB/sec              32.8%                    My testing using Linux 2.6.14.6
>> >                                                            474MB/sec              37%                      Best from OpenIB mailing list (2.6.12-rc5 patch 1)
>> >
>> > Why is the bandwidth utilization of IPoIB so low compared to the other
>> > NICs?
>>
>> One thing to note is that the max utilization of 10G IB (4x) is 8G due
>> to the signalling being included in this rate (unlike ethernet whose
>> rate represents the data rate and does not include the signalling
>> overhead).
>>
>> -- Hal
>>
>
>You also have larger IP packets when you use GigE ( especially in large
>send/offload ) and Myrinet. I think Myrinet uses a 60K MTU and for GigE,
>without large send you get a 9000 MTU. With large send you get a 64K buffer
>to the adapter so fragmentation to 1500/9000 IP packets is offloaded in the
>adapter.
>
>Currently with IPoIB using UD mode, you have to generate lots of 2K
>packets. With serialized IPoIB drivers you end up bottlenecking on a single
>CPU. There is an IPoIB-CM IETF spec out which should significantly improve
>IPoIB performance if implemented.
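To put the packet-size argument in numbers, here is a rough sketch of how many packets per second the host has to handle at the measured 420MB/sec for each of the sizes mentioned above. It assumes 1 MB = 2**20 bytes and ignores header overhead; the function name and the exact byte counts (2048 for "2K", 65536 for "64K") are illustrative assumptions, not measurements.

```python
def packets_per_second(throughput_mb_s, packet_bytes):
    """Packets per second needed to sustain a throughput (1 MB = 2**20 bytes)."""
    return throughput_mb_s * 2**20 / packet_bytes


# Unit sizes named in the message above; byte counts are assumed round figures.
SIZES = [
    ("IPoIB UD, 2K packets", 2048),
    ("GigE jumbo frames, 9000 MTU", 9000),
    ("large send, 64K buffer", 65536),
]

for label, size in SIZES:
    rate = packets_per_second(420, size)
    print(f"{label:30s} {rate:12.0f} units/sec handed to the adapter")
```

Under these assumptions the 2K case needs 32 times as many per-packet operations as a 64K large-send buffer, and if the driver is serialized, all of that work lands on one CPU.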
>
>> > There must be a lot of room to improve the IPoIB software to reach 75%+
>> > bandwidth utilization.
>> >
>> >
>> > HB Chen
>> > Los Alamos National Lab
>> > hbchen at labl.gov
>> >
>
>
>
>
>
>
>----- Message from "Shirley Ma" <xma at us.ibm.com> on Mon, 5 Jun 2006
>09:02:36 -0700 -----
>
> To: "Michael S. Tsirkin" <mst at mellanox.co.il>
>
> cc: "Roland Dreier" <rdreier at cisco.com>, mashirle at us.ibm.com,
> openib-general at openib.org
>
> Subject: [openib-general] Re: Re: [PATCH]Repost: IPoIB skb panic
>
>
>Michael,
>
>I will apply this patch. This patch would reduce the race, not address the
>problem.
>
>Thanks
>Shirley Ma
>IBM Linux Technology Center
>15300 SW Koll Parkway
>Beaverton, OR 97006-6063
>Phone(Fax): (503) 578-7638
>----- Message from "Roland Dreier" <rdreier at cisco.com> on Mon, 05 Jun 2006
>09:01:14 -0700 -----
>
> To: "Or Gerlitz" <ogerlitz at voltaire.com>
>
> cc: openib-general at openib.org
>
> Subject: [openib-general] Re: [PATCHv2 1/2] resend: mthca support for
> max_map_per_fmr device attribute
>
>
> > Yes, it makes sense, but the check should be
> >
> > if (!(dev->mthca_flags & MTHCA_FLAG_SINAI_OPT))
> >
> > instead of
> >
> > if (dev->mthca_flags & MTHCA_FLAG_SINAI_OPT)
>
>Yep, you're right, I got it backwards.
>
> > also, what about the other patch which changes fmr_pool.c to query the
> > device, have you got(reviewed/accepted) it? i have modified it to
> > allocate the device attr struct on the heap as you have asked.
>
>It looks fine. I was just reviewing everything together.
>
> - R.
>
>
>----- Message from "Talpey, Thomas" <Thomas.Talpey at netapp.com> on Mon, 05
>Jun 2006 11:52:03 -0400 -----
>
> To: "hbchen" <hbchen at lanl.gov>
>
> cc: openib-general at openib.org
>
> Subject: Re: [openib-general] Question about the IPoIB bandwidth
> performance ?
>
>
>At 11:38 AM 6/5/2006, hbchen wrote:
>>Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
>utilization is still very low.
>>>> IPoIB=420MB/sec
>>>> bandwidth utilization= 420/1024 = 41.01%
>
>
>Helen, have you measured the CPU utilizations during these runs?
>Perhaps you are out of CPU.
>
>Outrageous opinion follows.
>
>Frankly, an IB HCA running Ethernet emulation is approximately the
>world's worst 10GbE adapter (not to put too fine of a point on it :-) )
>There is no hardware checksumming, nor large-send offloading, both
>of which force overhead onto software. And, as you just discovered
>it isn't even 10Gb!
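For a sense of the work that falls to software when the adapter does not checksum, here is a plain sketch of the 16-bit ones'-complement Internet checksum (as described in RFC 1071) that TCP and UDP rely on. It is an illustration only: the function name is made up, and real kernels use heavily optimized routines rather than a byte loop like this. The point is that every payload byte has to be touched by the CPU.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Example words from RFC 1071: 0001 f203 f4f5 f6f7
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

A receiver can verify a packet by running the same function over the data with the checksum appended and checking that the result is zero.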
>
>In general, network emulation layers are always going to perform more
>poorly than native implementations. But this is only a generality learned
>from years of experience with them.
>
>Tom.
>
>
>
>----- Message from "hbchen" <hbchen at lanl.gov> on Mon, 05 Jun 2006 10:11:30
>-0600 -----
>
> To: "Talpey, Thomas" <Thomas.Talpey at netapp.com>
>
> cc: openib-general at openib.org
>
> Subject: Re: [openib-general] Question about the IPoIB bandwidth
> performance ?
>
>
>Talpey, Thomas wrote:
> At 11:38 AM 6/5/2006, hbchen wrote:
>
> Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB
> bandwidth utilization is still very low.
>
> IPoIB=420MB/sec
> bandwidth utilization= 420/1024 = 41.01%
>
>
>
> Helen, have you measured the CPU utilizations during these runs?
> Perhaps you are out of CPU.
>
>
>Tom,
>I am HB Chen from LANL, not Helen Chen from SNL.
>I didn't run out of CPU. CPU utilization is about 70-80%.
>
> Outrageous opinion follows.
>
> Frankly, an IB HCA running Ethernet emulation is approximately the
> world's worst 10GbE adapter (not to put too fine of a point on it :-)
> )
>
>IP over Myrinet (Ethernet emulation) can reach up to 96%-98% bandwidth
>utilization, so why not IPoIB?
>
>HB Chen
>hbchen at lanl.gov
> There is no hardware checksumming, nor large-send offloading, both
> of which force overhead onto software. And, as you just discovered
> it isn't even 10Gb!
>
> In general, network emulation layers are always going to perform more
> poorly than native implementations. But this is only a generality
> learned
> from years of experience with them.
>
> Tom.
>
>