[openib-general] Question about the IPoIB bandwidth performance ?

Bernard King-Smith wombat2 at us.ibm.com
Mon Jun 5 09:53:02 PDT 2006


> Thomas Talpey said:
> At 11:38 AM 6/5/2006, hbchen wrote:
> >Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
> >utilization is still very low.
> >>> IPoIB=420MB/sec
> >>> bandwidth utilization= 420/1024 = 41.01%
>
>
> Helen, have you measured the CPU utilizations during these runs?
> Perhaps you are out of CPU.
>
> Outrageous opinion follows.
>
> Frankly, an IB HCA running Ethernet emulation is approximately the
> world's worst 10GbE adapter (not to put too fine of a point on it :-) )
> There is no hardware checksumming, nor large-send offloading, both
> of which force overhead onto software. And, as you just discovered
> it isn't even 10Gb!
>
> In general, network emulation layers are always going to perform more
> poorly than native implementations. But this is only a generality learned
> from years of experience with them
>
> Tom.

Hold on here....

Who said anything about Ethernet emulation? Hal said he is running
straight Netperf over IB, not Ethernet emulation. I don't think that any IB
HCAs today support offloaded checksum and large send. You are comparing
apples and oranges. The only appropriate comparison is the IBM HCA
against the mthca adapters. I think Hal's point is actually comparing
"any" IB adapter against GigE and Myrinet. Both the mthca and IBM HCAs
should get similar IPoIB performance using identical OpenIB stacks.


Bernie King-Smith
IBM Corporation
Server Group
Cluster System Performance
wombat2 at us.ibm.com    (845)433-8483
Tie. 293-8483 or wombat2 on NOTES

"We are not responsible for the world we are born into, only for the world
we leave when we die.
So we have to accept what has gone before us and work to change the only
thing we can,
-- The Future." William Shatner


                                                                           
openib-general-request at openib.org
Sent by: openib-general-bounces at openib.org
06/05/2006 12:11 PM
Please respond to openib-general at openib.org

     To: openib-general at openib.org
     cc:
Subject: openib-general Digest, Vol 24, Issue 22
                                                                           




Today's Topics:

   1. Re: Question about the IPoIB bandwidth performance ? (hbchen)
   2. Re: [PATCH] osm: trivial missing header files fix (Hal Rosenstock)
   3. Re: [PATCH] osm: trivial missing cast in osmt_service call
      for memcmp (Hal Rosenstock)
   4. Re: Question about the IPoIB bandwidth performance ?
      (Bernard King-Smith)
   5. Re: Re: [PATCH]Repost: IPoIB skb panic (Shirley Ma)
   6. Re: [PATCHv2 1/2] resend: mthca support for max_map_per_fmr
      device attribute (Roland Dreier)
   7. Re: Question about the IPoIB bandwidth performance ?
      (Talpey, Thomas)
   8. Re: Question about the IPoIB bandwidth performance ? (hbchen)

----- Message from "hbchen" <hbchen at lanl.gov> on Mon, 05 Jun 2006 09:38:24 -0600 -----
      To: "Hal Rosenstock" <halr at voltaire.com>
      cc: "OPENIB" <openib-general at openib.org>
 Subject: Re: [openib-general] Question about the IPoIB bandwidth performance ?
                                                                           

Hal Rosenstock wrote:
      On Mon, 2006-06-05 at 11:12, hbchen wrote:

            Hi,
            I have a question about the IPoIB bandwidth performance.
            I did netperf testing using Single GiGE, Myrinet D card,
            Myrinet 10G
            ethernet card,
            and Voltaire Infiniband 4X HCA400Ex (PCI-Express interface).


            NIC (Jumbo enabled)     Line bandwidth (LB)   IPoverNIC bandwidth   Utilization (IPoNIC/LB)   Notes
            ----------------------  --------------------  --------------------  ------------------------  -----
            Single Gigabit NIC      1Gb/sec=125MB/sec     120MB/sec             96%                       PCI-X interface
            Myrinet D card          250MB/sec             240~245MB/sec         96% ~ 98%                 PCI-X interface
            Myrinet 10G Ethernet    10Gb/sec=1280MB/sec   980MB/sec             76.6%                     My testing using Linux 2.6.14.6
              (PCI-Express)                               1225MB/sec            95.7%                     Data from Myrinet website
            IB HCA4X (PCI-Express)  10Gb/sec=1280MB/sec   420MB/sec             32.8%                     My testing using Linux 2.6.14.6
                                                          474MB/sec             37%                       The best from OpenIB mailing list (2.6.12-rc5 patch 1)

            Why is the bandwidth utilization of IPoIB so low compared to
            the other NICs?


      One thing to note is that the max utilization of 10G IB (4x) is 8G
      due to the signalling being included in this rate (unlike ethernet,
      whose rate represents the data rate and does not include the
      signalling overhead).

Hal,
Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
utilization is still very low.
>> IPoIB=420MB/sec
>> bandwidth utilization= 420/1024 = 41.01%


HB
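
The two utilization figures quoted in this thread (32.8% and 41.01%) differ only in the denominator. A minimal Python sketch of that arithmetic follows. It assumes the thread's own convention that 1 Gb/sec is counted as 128 MB/sec (so 10 Gb/sec = 1280 MB/sec), and it assumes the 8 Gb/sec figure comes from 8b/10b link encoding on a 10 Gb/sec 4X link. The names `MB_PER_GBIT` and `utilization` are made up for this illustration.

```python
# Assumption carried over from the table in this thread:
# 1 Gb/sec is counted as 128 MB/sec, so 10 Gb/sec = 1280 MB/sec.
MB_PER_GBIT = 128


def utilization(measured_mb_s, line_gbit_s, encoding_efficiency=1.0):
    """Return measured throughput as a percentage of usable line bandwidth.

    encoding_efficiency is the fraction of the signalling rate that carries
    data (1.0 means the quoted rate is already the data rate).
    """
    usable_mb_s = line_gbit_s * MB_PER_GBIT * encoding_efficiency
    return 100.0 * measured_mb_s / usable_mb_s


# Measured IPoIB throughput from the netperf run: 420 MB/sec.
raw = utilization(420, 10)                                # vs. 10 Gb/sec signalling rate
data = utilization(420, 10, encoding_efficiency=8 / 10)   # vs. 8 Gb/sec data rate

print(f"against 10 Gb/sec signalling rate: {raw:.1f}%")
print(f"against  8 Gb/sec data rate:       {data:.1f}%")
```

Either way the link is well under half used, which is the point being made above.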




      -- Hal


            There must be a lot of room to improve the IPoIB software to
            reach 75%+ bandwidth utilization.


            HB Chen
            Los Alamos National Lab
            hbchen at lanl.gov

            _______________________________________________
            openib-general mailing list
            openib-general at openib.org
            http://openib.org/mailman/listinfo/openib-general

            To unsubscribe, please visit
            http://openib.org/mailman/listinfo/openib-general





----- Message from "Hal Rosenstock" <halr at voltaire.com> on 05 Jun 2006 11:34:50 -0400 -----
       To: "Eitan Zahavi" <eitan at mellanox.co.il>
       cc: "OPENIB" <openib-general at openib.org>
  Subject: [openib-general] Re: [PATCH] osm: trivial missing header files fix
                                                                           

On Mon, 2006-06-05 at 08:51, Eitan Zahavi wrote:
> Hi Hal
>
> Cleaning up compilation warnings I found there missing includes in
> various sources.
>
> Eitan
>
> Signed-off-by:  Eitan Zahavi <eitan at mellanox.co.il>

Thanks. Applied to trunk only.

-- Hal



----- Message from "Hal Rosenstock" <halr at voltaire.com> on 05 Jun 2006 11:45:28 -0400 -----
      To: "Eitan Zahavi" <eitan at mellanox.co.il>
      cc: "OPENIB" <openib-general at openib.org>
 Subject: [openib-general] Re: [PATCH] osm: trivial missing cast in osmt_service call for memcmp
                                                                           

Hi Eitan,

On Mon, 2006-06-05 at 08:59, Eitan Zahavi wrote:
> Hi Hal
>
> Last one of my cleaning up compilation warnings I found a missing
> cast in osmtest service name compare.
>
> Eitan
>
> Signed-off-by:  Eitan Zahavi <eitan at mellanox.co.il>

Thanks. Applied to trunk only.

-- Hal



----- Message from "Bernard King-Smith" <wombat2 at us.ibm.com> on Mon, 5 Jun 2006 11:54:42 -0400 -----
      To: openib-general at openib.org
 Subject: Re: [openib-general] Question about the IPoIB bandwidth performance ?
                                                                           

Hal Rosenstock wrote:

> On Mon, 2006-06-05 at 11:12, hbchen wrote:
> > Hi,
> > I have a question about the IPoIB bandwidth performance.
> > I did netperf testing using Single GiGE, Myrinet D card, Myrinet 10G
> > ethernet card,
> > and Voltaire Infiniband 4X HCA400Ex (PCI-Express interface).
> >
> >
> > NIC (Jumbo enabled)     Line bandwidth (LB)   IPoverNIC bandwidth   Utilization (IPoNIC/LB)   Notes
> > ----------------------  --------------------  --------------------  ------------------------  -----
> > Single Gigabit NIC      1Gb/sec=125MB/sec     120MB/sec             96%                       PCI-X interface
> > Myrinet D card          250MB/sec             240~245MB/sec         96% ~ 98%                 PCI-X interface
> > Myrinet 10G Ethernet    10Gb/sec=1280MB/sec   980MB/sec             76.6%                     My testing using Linux 2.6.14.6
> >   (PCI-Express)                               1225MB/sec            95.7%                     Data from Myrinet website
> > IB HCA4X (PCI-Express)  10Gb/sec=1280MB/sec   420MB/sec             32.8%                     My testing using Linux 2.6.14.6
> >                                               474MB/sec             37%                       The best from OpenIB mailing list (2.6.12-rc5 patch 1)
> >
> > Why is the bandwidth utilization of IPoIB so low compared to the other
> > NICs?
>
> One thing to note is that the max utilization of 10G IB (4x) is 8G due
> to the signalling being included in this rate (unlike ethernet whose
> rate represents the data rate and does not include the signalling
> overhead).
>
> -- Hal
>

You also have larger IP packets when you use GigE (especially with large
send offload) and Myrinet. I think Myrinet uses a 60K MTU, and for GigE
without large send you get a 9000 MTU. With large send you hand a 64K buffer
to the adapter, so fragmentation into 1500/9000 IP packets is offloaded to
the adapter.

Currently with IPoIB using UD mode, you have to generate lots of 2K
packets. With serialized IPoIB drivers you end up bottlenecking on a single
CPU. There is an IPoIB-CM IETF spec out which should significantly improve
IPoIB performance if implemented.
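
To put rough numbers on the packet-size argument, here is a small Python sketch that counts how many packets the host has to build for one 64K send at each of the MTUs mentioned above. It is only an illustration: it ignores per-packet header bytes, treats "2K" and "60K" as exactly 2048 and 61440 bytes, and the helper name `packets_needed` is made up for this example.

```python
def packets_needed(payload_bytes, mtu_bytes):
    """Number of MTU-sized packets needed to carry payload_bytes.

    Integer ceiling division; header overhead is deliberately ignored.
    """
    return -(-payload_bytes // mtu_bytes)


PAYLOAD = 64 * 1024  # one 64K buffer, the large-send size mentioned above

# MTU values taken from the discussion; exact byte counts are assumptions.
cases = [
    ("IPoIB UD (2K)", 2048),
    ("GigE standard (1500)", 1500),
    ("GigE jumbo (9000)", 9000),
    ("Myrinet (60K)", 60 * 1024),
    ("64K large send", 64 * 1024),
]

for name, mtu in cases:
    print(f"{name:22s} {packets_needed(PAYLOAD, mtu):3d} packets per 64K send")
```

With large send the host hands the adapter a single 64K buffer and the splitting happens in hardware, whereas with 2K UD packets the driver itself has to produce every one of them on the CPU.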

> > There must be a lot of room to improve the IPoIB software to reach 75%+
> > bandwidth utilization.
> >
> >
> > HB Chen
> > Los Alamos National Lab
> > hbchen at lanl.gov
> >






----- Message from "Shirley Ma" <xma at us.ibm.com> on Mon, 5 Jun 2006 09:02:36 -0700 -----
      To: "Michael S. Tsirkin" <mst at mellanox.co.il>
      cc: "Roland Dreier" <rdreier at cisco.com>, mashirle at us.ibm.com,
          openib-general at openib.org
 Subject: [openib-general] Re: Re: [PATCH]Repost: IPoIB skb panic
                                                                           

Michael,

I will apply this patch. This patch would reduce the race, not address the
problem.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

----- Message from "Roland Dreier" <rdreier at cisco.com> on Mon, 05 Jun 2006 09:01:14 -0700 -----
      To: "Or Gerlitz" <ogerlitz at voltaire.com>
      cc: openib-general at openib.org
 Subject: [openib-general] Re: [PATCHv2 1/2] resend: mthca support for
          max_map_per_fmr device attribute
                                                                           

 > Yes, it makes sense, but the check should be
 >
 >           if (!(dev->mthca_flags & MTHCA_FLAG_SINAI_OPT))
 >
 > instead of
 >
 >           if (dev->mthca_flags & MTHCA_FLAG_SINAI_OPT)

Yep, you're right, I got it backwards.

 > also, what about the other patch which changes fmr_pool.c to query the
 > device, have you got(reviewed/accepted) it? i have modified it to
 > allocate the device attr struct on the heap as you have asked.

It looks fine.  I was just reviewing everything together.

 - R.


----- Message from "Talpey, Thomas" <Thomas.Talpey at netapp.com> on Mon, 05 Jun 2006 11:52:03 -0400 -----
      To: "hbchen" <hbchen at lanl.gov>
      cc: openib-general at openib.org
 Subject: Re: [openib-general] Question about the IPoIB bandwidth performance ?
                                                                           

At 11:38 AM 6/5/2006, hbchen wrote:
>Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB bandwidth
>utilization is still very low.
>>> IPoIB=420MB/sec
>>> bandwidth utilization= 420/1024 = 41.01%


Helen, have you measured the CPU utilizations during these runs?
Perhaps you are out of CPU.

Outrageous opinion follows.

Frankly, an IB HCA running Ethernet emulation is approximately the
world's worst 10GbE adapter (not to put too fine of a point on it :-) )
There is no hardware checksumming, nor large-send offloading, both
of which force overhead onto software. And, as you just discovered
it isn't even 10Gb!

In general, network emulation layers are always going to perform more
poorly than native implementations. But this is only a generality learned
from years of experience with them.

Tom.



----- Message from "hbchen" <hbchen at lanl.gov> on Mon, 05 Jun 2006 10:11:30 -0600 -----
      To: "Talpey, Thomas" <Thomas.Talpey at netapp.com>
      cc: openib-general at openib.org
 Subject: Re: [openib-general] Question about the IPoIB bandwidth performance ?
                                                                           

Talpey, Thomas wrote:
      At 11:38 AM 6/5/2006, hbchen wrote:

            Even with this IB-4X = 8Gb/sec = 1024 MB/sec the IPoIB
            bandwidth utilization is still very low.

                        IPoIB=420MB/sec
                        bandwidth utilization= 420/1024 = 41.01%



      Helen, have you measured the CPU utilizations during these runs?
      Perhaps you are out of CPU.


Tom,
I am HB Chen from LANL not the Helen Chen from SNL.
I didn't run out of CPU.  It is about 70-80 % of CPU utilization.

      Outrageous opinion follows.

      Frankly, an IB HCA running Ethernet emulation is approximately the
      world's worst 10GbE adapter (not to put too fine of a point on it :-) )

IP over Myrinet (Ethernet emulation) can reach up to 96%-98% bandwidth
utilization, so why not IPoIB?

HB Chen
hbchen at lanl.gov
      There is no hardware checksumming, nor large-send offloading, both
      of which force overhead onto software. And, as you just discovered
      it isn't even 10Gb!

      In general, network emulation layers are always going to perform more
      poorly than native implementations. But this is only a generality
      learned from years of experience with them.

      Tom.






