[openib-general] Re: openib-general Digest, Vol 22, Issue 114
Roland Dreier
rdreier at cisco.com
Wed Apr 19 09:03:09 PDT 2006
Bernard> The assumption you have here is that one CPU is capable
Bernard> of handling the completions without impacting
Bernard> bandwidth. We have seen the opposite in that we end up
Bernard> with one CPU pegged at high throughput. The benefit you
Bernard> are working toward is lower latency when we handle
Bernard> both send and receive processing off the same
Bernard> thread/interrupt, but you have to balance that against
Bernard> bandwidth limitations. If you think 4X has a bandwidth
Bernard> problem using IPoIB, wait till 12X comes out.
I still don't understand why splitting the CQ allows you to use more
than one CPU to handle completions. Both CQ events get handled on the
same CPU -- you just have more overhead in getting to the CQ event
handlers if there are two of them.
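To make the overhead argument concrete, here is a toy cost model, not a
measurement. It assumes every CQ event carries a fixed dispatch cost and
every completion a fixed processing cost, all charged to one CPU. The
function name and all numbers are hypothetical, chosen only for
illustration.

```python
def cpu_time_us(completions, events, per_completion_us, per_event_us):
    """Time one CPU spends: a fixed cost per CQ event plus a cost per completion."""
    return events * per_event_us + completions * per_completion_us

# Hypothetical costs and counts, for illustration only.
PER_COMPLETION_US = 1.0   # assumed cost to process one completion
PER_EVENT_US = 5.0        # assumed cost to reach a CQ event handler
SENDS, RECVS = 100, 100

# Shared CQ: one event, one handler drains send and receive completions.
shared = cpu_time_us(SENDS + RECVS, 1, PER_COMPLETION_US, PER_EVENT_US)

# Split CQs: two events, but both handlers still run on the same CPU,
# so their costs add up on that CPU.
split = (cpu_time_us(SENDS, 1, PER_COMPLETION_US, PER_EVENT_US)
         + cpu_time_us(RECVS, 1, PER_COMPLETION_US, PER_EVENT_US))

print(f"shared CQ: {shared} us, split CQs: {split} us")
```

Under these assumptions the split case does the same completion work on
the same CPU and pays the per-event cost twice.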
Also, why is 12X any worse? With current hardware at least the 4X
link is not the bottleneck anyway.
Bernard> What per CPU utilization do you see on mthca on a
Bernard> multiple CPU machine running peak bandwidth?
I've never really measured it. It's especially tough to account for
interrupt handler time.
- R.