[openib-general] Re: Speeding up IPoIB.

James Lentini jlentini at netapp.com
Mon Apr 24 08:07:00 PDT 2006



On Fri, 21 Apr 2006, Bernard King-Smith wrote:

> Grant Grundler wrote:
> > Grant> My guess is it's an easier problem to fix SDP than reducing TCP/IP
> > Grant> cache/CPU foot print. I realize only a subset of apps can (or will
> > Grant> try to) use SDP because of setup/config issues.  I still believe SDP
> > Grant> is useful to a majority of apps without having to recompile them.
> >
> > I agree that reducing any protocol footprint is a very challenging job;
> > however, going to a larger MTU drops the overhead much faster. If IB
> > supported a 60K MTU, then the TCP/IP overhead would be 1/30 of what we
> > measure today. Traversing the TCP/IP stack once for a 60K packet is much
> > cheaper than traversing it 30 times with 2000-byte packets for the same
> > amount of data transmitted.
> 
> Grant> I agree that's effective for workloads which send large messages.
> Grant> And that's typical for storage workloads.
> Grant> But the world is not just an NFS server. ;)
> 
> However, NFS is not the only large data transfer workload I come
> across. If IB wants to achieve high volumes, there needs to be some
> kind of commercial workload that works well on it besides the large
> applications that can afford to port to SDP or uDAPL. HPC is a niche
> in which it is hard to sustain a viable business (though everyone can
> point to a couple of companies, most do not last long term, 15 years
> or more).

Running NFS over IPoIB isn't the only option. As an alternative, NFS can
be run directly over an RDMA network using the RPC/RDMA transport protocol.
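[On a Linux client, an NFS/RDMA mount looks roughly like the sketch below. The server name and export path are hypothetical examples; 20049 is the port commonly used for NFS over RDMA, and kernel support for the RPC-over-RDMA transport is assumed:]

```shell
# Load the RPC-over-RDMA client transport, then mount with the rdma option.
modprobe xprtrdma
mount -t nfs -o rdma,port=20049 server:/export /mnt/nfs
```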

> However, in clustering the areas that benefit from large block transfer
> efficiency include:
> 
> File Serving (NFS, GPFS, XFS, etc.)
> Application backup
> Parallel databases
> Database upload/update flows
> Web server graphics
> Web server MP3s
> Web server streaming video
> Local workstation backup
> Collaboration software
> Local mail replication
> 
> My concern is that if IB does not support these operations as well as
> Ethernet, then it is a hard sell into commercial accounts/workloads for IB.
> 
> Bernie King-Smith
> IBM Corporation
> Server Group
> Cluster System Performance
> wombat2 at us.ibm.com    (845)433-8483
> Tie. 293-8483 or wombat2 on NOTES
> 
> "We are not responsible for the world we are born into, only for the world
> we leave when we die.
> So we have to accept what has gone before us and work to change the only
> thing we can,
> -- The Future." William Shatner
> 
> 
> From: Grant Grundler <iod00d at hp.com>
> Date: 04/21/2006 04:10 PM
> To: Bernard King-Smith/Poughkeepsie/IBM at IBMUS
> Cc: Grant Grundler <iod00d at hp.com>, openib-general at openib.org,
>     Roland Dreier <rdreier at cisco.com>
> Subject: Re: Speeding up IPoIB.
> 
> On Thu, Apr 20, 2006 at 09:03:29PM -0400, Bernard King-Smith wrote:
> > Grant> My guess is it's an easier problem to fix SDP than reducing TCP/IP
> > Grant> cache/CPU foot print. I realize only a subset of apps can (or will
> > Grant> try to) use SDP because of setup/config issues.  I still believe SDP
> > Grant> is useful to a majority of apps without having to recompile them.
> >
> > I agree that reducing any protocol footprint is a very challenging job;
> > however, going to a larger MTU drops the overhead much faster. If IB
> > supported a 60K MTU, then the TCP/IP overhead would be 1/30 of what we
> > measure today. Traversing the TCP/IP stack once for a 60K packet is much
> > cheaper than traversing it 30 times with 2000-byte packets for the same
> > amount of data transmitted.
> 
> I agree that's effective for workloads which send large messages.
> And that's typical for storage workloads.
> But the world is not just an NFS server. ;)
> 
> > Grant> I'm not competent to disagree in detail.
> > Grant> Fabian Tillier and Caitlin Bestler can (and have) addressed this.
> >
> > I would be very interested in any pointers to their work.
> 
> They have posted to this forum recently on this topic.
> The archives are here in case you want to look them up:
>              http://www.openib.org/contact.html
> 
> > This goes back to systems where the system is busy doing nothing,
> > generally when waiting for memory, a cache line miss, or I/O to disks.
> > This is where hyperthreading has shown some speedups for benchmarks
> > that were previously totally CPU limited; with hyperthreading there
> > is a gain.
> 
> While there are workloads that benefit, I don't buy the hyperthreading
> argument in general. Co-workers have demonstrated several "normal"
> workloads that don't benefit and are faster with hyperthreading
> disabled.
> 
> > The unused cycles are "wait" cycles when something can run if it
> > can get in quickly. You can't get a whole TCP stack in during the wait,
> > but small parts of the stack or driver could fit in the other thread.
> > Yes, I do benchmarking and was skeptical at first.
> 
> ok.
> 
> thanks,
> grant
> 
> 
> _______________________________________________
> openib-general mailing list
> openib-general at openib.org
> http://openib.org/mailman/listinfo/openib-general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 
