[openib-general] Re: [PATCH] IPoIB splitting CQ, increase both send/recv poll NUM_WC & interval
Leonid Arsh
leonida at voltaire.com
Tue Apr 25 02:14:09 PDT 2006
You are right - different HCAs and adapters may need specific tuning.
The Mellanox VAPI adapter, which handles completions in a tasklet, will
definitely suffer from CQ splitting, since only one instance of the
tasklet may be running across all the CPUs at any time.
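(To be concrete about what "splitting" means here: instead of one CQ
shared by the send and receive queues, the driver creates two, each with
its own completion handler. A rough kernel-verbs sketch from memory --
the exact ib_create_cq signature and the size/handler names here are
only illustrative:)

    /* Shared scheme: one CQ, one completion handler for everything. */
    priv->cq = ib_create_cq(ca, ipoib_ib_completion, NULL, dev,
                            ipoib_sendq_size + ipoib_recvq_size + 1);

    /* Split scheme: separate send and recv CQs, each generating its
     * own completion events, which is where the extra interrupts and
     * CQ polls come from. */
    priv->recv_cq = ib_create_cq(ca, ipoib_ib_completion, NULL, dev,
                                 ipoib_recvq_size + 1);
    priv->send_cq = ib_create_cq(ca, ipoib_send_comp_handler, NULL, dev,
                                 ipoib_sendq_size);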
The mthca driver is a completely different case: its completions are
handled in HW interrupt context.
I'm not familiar with the other adapters, ehca and ipath. Do they notify
completions via HW interrupt, tasklet, or soft IRQ?
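(For reference, the consumer side looks the same in all these cases;
what differs is the context the provider calls the handler from. A
minimal sketch, assuming kernel verbs -- NUM_WC and the handle_wc()
helper are illustrative names, not actual tree code:)

    static void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr)
    {
        struct ib_wc wc[NUM_WC];
        int n, i;

        /* Re-arm first, so a completion that arrives between the
         * last poll and the re-arm still generates an event. */
        ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);

        do {
            n = ib_poll_cq(cq, NUM_WC, wc);
            for (i = 0; i < n; ++i)
                handle_wc(dev_ptr, &wc[i]);  /* complete a send or
                                                hand a packet up */
        } while (n == NUM_WC);
    }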
Bernard King-Smith wrote:
> Leonid Arsh wrote:
> Leonid> Shirley,
>
> Leonid> some additional information you may be interested in:
>
> Leonid> According to our experience with the Voltaire IPoIB driver,
> Leonid> splitting the CQ harmed throughput (we checked with the iperf
> Leonid> application in UDP mode). Splitting the CQ caused more
> Leonid> interrupts, context switches and CQ polls.
>
> Interesting results. I think some of Shirley's work reduced the number of
> interrupts on ehca, so this is starting to sound like one size does not
> fit all across drivers. I wonder what Pathscale would see if they split
> their completion queues?
>
> Leonid> Note, the case is rather different from OpenIB mthca, since
> Leonid> Voltaire IPoIB is based on the VAPI driver, where CQ
> Leonid> completions are handled in a tasklet context, unlike mthca,
> Leonid> where CQ completions are handled in HW interrupt context.
>
> Another question is what we do about adapter-specific code, if each
> adapter type (ehca, mthca, Voltaire and Pathscale) can only reach its
> best performance with its own adapter-specific code and tuning.
>
> Leonid> NAPI gave us some improvement. I think NAPI should improve
> Leonid> things much more in mthca, with its HW interrupt CQ completions.
>
> However, I don't believe that NAPI can provide the same benefit for all
> the driver models listed above. It may help with overall interrupt
> handling, but there is probably a need for additional adapter/driver
> specific tuning. Some of that may end up requiring support in the
> OpenIB stack.
>
> There are many cases not covered by the Netperf and NetPIPE runs that
> show improved performance: running multiple sockets per link/adapter,
> and larger machines with multiple adapters. I haven't seen any recent
> data on duplex traffic either, only STREAM (i.e. unidirectional).
>
> Bernie King-Smith
> IBM Corporation
> Server Group
> Cluster System Performance
> wombat2 at us.ibm.com (845)433-8483
> Tie. 293-8483 or wombat2 on NOTES
>
> "We are not responsible for the world we are born into, only for the world
> we leave when we die.
> So we have to accept what has gone before us and work to change the only
> thing we can,
> -- The Future." William Shatner
>
>