[openib-general] Re: openib-general Digest, Vol 22, Issue 114

Roland Dreier rdreier at cisco.com
Wed Apr 19 09:03:09 PDT 2006


    Bernard> The assumption you have here is that one CPU is capable
    Bernard> of handling the completions without impacting
    Bernard> bandwidth. We have seen the opposite in that we end up
    Bernard> with one CPU pegged at high throughput. The benefit you
    Bernard> are working on is latency will be faster if we handle
    Bernard> both send and receive processing off the same
    Bernard> thread/interrupt, but you have to balance that with
    Bernard> bandwidth limitations. You think 4X has a bandwidth
    Bernard> problem using IPoIB, wait till 12X comes out.

I still don't understand why splitting the CQ allows you to use more
than one CPU to handle completions.  Both CQ events get handled on the
same CPU -- you just have more overhead in getting to the CQ event
handlers if there are two of them.
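
To make the argument above concrete, here is a toy Python model. It is
only a sketch, not mthca or verbs code: ToyCQ, cq_event, interrupt and
run are hypothetical names, and the model assumes what the paragraph
claims, namely that one interrupt on one CPU dispatches every CQ event.

```python
from dataclasses import dataclass


@dataclass
class ToyCQ:
    """Hypothetical stand-in for a completion queue (not the verbs API)."""
    pending: int = 0        # completions waiting to be polled
    handler_calls: int = 0  # times the event handler ran
    last_cpu: int = -1      # CPU the handler last ran on


def cq_event(cq, cpu):
    """Event handler: drain the CQ on whichever CPU took the interrupt."""
    polled = cq.pending
    cq.pending = 0
    cq.handler_calls += 1
    cq.last_cpu = cpu
    return polled


def interrupt(cqs, cpu):
    """Assumption: one interrupt on one CPU dispatches to every CQ with work."""
    calls = 0
    for cq in cqs:
        if cq.pending:
            cq_event(cq, cpu)
            calls += 1
    return calls


def run(split, sends, recvs, cpu=0):
    """Return (handler calls, set of CPUs used) for one interrupt."""
    if split:
        cqs = [ToyCQ(pending=sends), ToyCQ(pending=recvs)]
    else:
        cqs = [ToyCQ(pending=sends + recvs)]
    calls = interrupt(cqs, cpu)
    cpus = {cq.last_cpu for cq in cqs if cq.handler_calls}
    return calls, cpus
```

In this model, run(False, 8, 8) and run(True, 8, 8) drain the same 16
completions on the same CPU; the split case just makes two handler
calls instead of one.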

Also, why is 12X any worse?  With current hardware, at least, the 4X
link is not the bottleneck anyway.

    Bernard> What per CPU utilization do you see on mthca on a
    Bernard> multiple CPU machine running peak bandwidth?

I've never really measured it.  It's especially tough to account for
interrupt handler time.
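
One rough way to get a per-CPU number is to sample the "cpuN" lines of
/proc/stat before and after a run and compare the irq and softirq
fields. The sketch below assumes the usual field order (user, nice,
system, idle, iowait, irq, softirq, steal); irq_share is a hypothetical
helper name, and how accurately the kernel charges time to the irq
fields depends on kernel version and configuration, so treat the
result as an estimate.

```python
def irq_share(before, after):
    """Fraction of elapsed ticks spent in hardirq + softirq for one CPU.

    `before` and `after` are two samples of the same "cpuN ..." line
    from /proc/stat, taken at the start and end of the measurement.
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    # Use only the first eight fields; later guest fields are already
    # counted inside user/nice on kernels that report them.
    delta = [y - x for x, y in zip(b, a)][:8]
    total = sum(delta)
    if len(delta) < 7 or total <= 0:
        return 0.0
    # Field order assumed: user nice system idle iowait irq softirq steal
    return (delta[5] + delta[6]) / total
```

For example, with samples "cpu0 100 0 50 800 10 5 35 0" and
"cpu0 200 0 100 1500 20 25 155 0", 140 of the 1000 elapsed ticks fall
in irq plus softirq.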

 - R.
