[openib-general] Re: Speeding up IPoIB.
James Lentini
jlentini at netapp.com
Mon Apr 24 08:07:00 PDT 2006
On Fri, 21 Apr 2006, Bernard King-Smith wrote:
> Grant Grundler wrote:
> > Grant> My guess is it's an easier problem to fix SDP than reducing TCP/IP
> > Grant> cache/CPU footprint. I realize only a subset of apps can (or will
> > Grant> try to) use SDP because of setup/config issues. I still believe SDP
> > Grant> is useful to a majority of apps without having to recompile them.
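
The "without having to recompile" part deserves a concrete illustration. The
usual transparent route is the libsdp preload library, which intercepts an
unmodified binary's stream sockets at run time (LD_PRELOAD=libsdp.so). Below
is a minimal sketch of the kind of code that benefits; the peer address and
port are made up, and the assumption is simply that libsdp is installed on a
host with an IB path to the peer:

/* An ordinary sockets client; nothing here is SDP-specific, which is
 * the point. Run as-is it uses TCP; run under the libsdp preload
 * (assumed installed), e.g. LD_PRELOAD=libsdp.so ./client, and the
 * same unmodified binary's stream socket can be redirected onto SDP. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
        struct sockaddr_in addr;
        int fd = socket(AF_INET, SOCK_STREAM, 0);   /* plain TCP as written */

        if (fd < 0) {
                perror("socket");
                return 1;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(5001);                       /* made-up port */
        inet_pton(AF_INET, "192.168.0.1", &addr.sin_addr); /* made-up peer */

        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                perror("connect");
                return 1;
        }
        write(fd, "hello", 5);
        close(fd);
        return 0;
}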
> >
> > I agree that reducing any protocol footprint is a very challenging job;
> > however, going to a larger MTU drops the overhead much faster. If IB
> > supported a 60K MTU, then the TCP/IP overhead would be 1/30 of what we
> > measure today. Traversing the TCP/IP stack once for a 60K packet costs
> > much less than traversing it 30 times with 2000-byte packets for the same
> > amount of data transmitted.
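
To put rough numbers on the 1/30 figure: the per-packet cost through the
stack is roughly fixed, so what matters is how many times a given transfer
has to cross it. A back-of-the-envelope sketch (the byte counts below just
mirror the 2000-byte and 60K figures above):

/* Count stack traversals for one transfer at two MTUs. Only the ratio
 * of the two counts is meaningful; no per-packet cost is modeled. */
#include <stdio.h>

int main(void)
{
        const long transfer  = 60000;   /* bytes to move */
        const long mtu_small = 2000;    /* roughly today's IPoIB packet size */
        const long mtu_large = 60000;   /* hypothetical 60K MTU */

        long small_pkts = (transfer + mtu_small - 1) / mtu_small;  /* 30 */
        long large_pkts = (transfer + mtu_large - 1) / mtu_large;  /* 1 */

        printf("stack traversals: %ld vs. %ld (ratio %.0f)\n",
               small_pkts, large_pkts, (double)small_pkts / large_pkts);
        return 0;
}

The same payload moves either way; only the per-packet fixed costs scale
down with the larger MTU.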
>
> Grant> I agree that's effective for workloads which send large messages.
> Grant> And that's typical for storage workloads.
> Grant> But the world is not just an NFS server. ;)
>
> However, NFS is not the only large data transfer workload I come
> across. If IB wants to achieve high volumes, there needs to be some
> kind of commercial workload that works well on it besides the large
> applications that can afford to port to SDP or uDAPL. HPC is a niche
> in which it is hard to sustain a viable business (though everyone can
> point to a couple of companies, most do not last long term, 15 years
> or more).
Running NFS on IPoIB isn't the only option. As an alternative, NFS can
be run directly on an RDMA network using the RPC-RDMA protocol.
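On the client side, the goal is for this to show up as little more than a
mount option once the RPC-RDMA transport is in place, along the lines of
"mount -o rdma,port=20049 server:/export /mnt"; treat that syntax as
illustrative rather than a committed interface.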
> However, in clustering the areas that benefit from large block transfer
> efficiency include:
>
> File serving (NFS, GPFS, XFS, etc.)
> Application backup
> Parallel databases
> Database upload/update flows
> Web server graphics
> Web server MP3s
> Web server streaming video
> Local workstation backup
> Collaboration software
> Local mail replication
>
> My concern is that if IB does not support these operations as well as
> Ethernet does, then IB is a hard sell into commercial accounts/workloads.
>
> Bernie King-Smith
> IBM Corporation
> Server Group
> Cluster System Performance
> wombat2 at us.ibm.com (845)433-8483
> Tie. 293-8483 or wombat2 on NOTES
>
> "We are not responsible for the world we are born into, only for the world
> we leave when we die.
> So we have to accept what has gone before us and work to change the only
> thing we can,
> -- The Future." William Shatner
>
>
>
> From: Grant Grundler <iod00d at hp.com>
> Date: 04/21/2006 04:10 PM
> To: Bernard King-Smith/Poughkeepsie/IBM at IBMUS
> cc: Grant Grundler <iod00d at hp.com>, openib-general at openib.org,
>     Roland Dreier <rdreier at cisco.com>
> Subject: Re: Speeding up IPoIB.
>
> On Thu, Apr 20, 2006 at 09:03:29PM -0400, Bernard King-Smith wrote:
> > Grant> My guess is it's an easier problem to fix SDP than reducing TCP/IP
> > Grant> cache/CPU footprint. I realize only a subset of apps can (or will
> > Grant> try to) use SDP because of setup/config issues. I still believe SDP
> > Grant> is useful to a majority of apps without having to recompile them.
> >
> > I agree that reducing any protocol footprint is a very challenging job;
> > however, going to a larger MTU drops the overhead much faster. If IB
> > supported a 60K MTU, then the TCP/IP overhead would be 1/30 of what we
> > measure today. Traversing the TCP/IP stack once for a 60K packet costs
> > much less than traversing it 30 times with 2000-byte packets for the same
> > amount of data transmitted.
>
> I agree that's effective for workloads which send large messages.
> And that's typical for storage workloads.
> But the world is not just an NFS server. ;)
>
> > Grant> I'm not competent to disagree in detail.
> > Grant> Fabian Tillier and Caitlin Bestler can (and have) addressed this.
> >
> > I would be very interested in any pointers to their work.
>
> They have posted to this forum recently on this topic.
> The archives are here in case you want to look them up:
> http://www.openib.org/contact.html
>
> > This goes back to systems where the system is busy doing nothing,
> > generally while waiting on memory, a cache line miss, or disk I/O. This
> > is where hyperthreading has shown some speedups: benchmarks that were
> > previously totally CPU limited see a gain with hyperthreading.
>
> While there are workloads that benefit, I don't buy the hyperthreading
> argument in general. Co-workers have demonstrated that several "normal"
> workloads don't benefit and are faster with hyperthreading disabled.
>
> > The unused cycles are "wait" cycles during which something else can run
> > if it can get in quickly. You can't fit a whole TCP stack into the wait,
> > but small parts of the stack or driver could fit in the other thread.
> > Yes, I do benchmarking and was skeptical at first.
>
> ok.
>
> thanks,
> grant
>
>