[openib-general] [PATCH] splitting IPoIB CQ

Roland Dreier rdreier at cisco.com
Mon Apr 17 13:12:38 PDT 2006


    Shirley> Some tests have been done over mthca and
    Shirley> ehca. A unidirectional stream test gains up to 15%
    Shirley> throughput with this patch on systems over 4 cpus.
    Shirley> Bidirectional could gain more. People might get different
    Shirley> performance improvement numbers under different drivers
    Shirley> and cpus. I have attached the patch for those who are
    Shirley> willing to run the performance test with different
    Shirley> drivers. Please give your input.

Have you ever seen this hurt performance?  It seems that splitting
the receive and send CQs will increase the number of events generated
and possibly use more CPU.

Actually, do you have some explanation for why this helps performance?
My intuition would be that it just generates more interrupts for the
same workload.

One specific question:

 > -       struct ib_wc ibwc[IPOIB_NUM_WC];
 > +       struct ib_wc *send_ibwc;
 > +       struct ib_wc *recv_ibwc;

Why are you changing these to be dynamically allocated outside of the
main structure?  Is it to avoid false sharing of cachelines?

It might be better to sort the whole structure so that we have all the
common, read-mostly stuff first, then TX stuff (marked with
____cacheline_aligned_in_smp) and then RX stuff, also marked to be
cacheline aligned.
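
A rough userspace sketch of that layout follows.  It is not the real
ipoib private structure: every name in it is hypothetical, struct
wc_stub stands in for struct ib_wc, and C11 alignas with an assumed
64-byte line stands in for ____cacheline_aligned_in_smp.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64   /* assumed cache line size, for illustration only */
#define NUM_WC    4    /* hypothetical stand-in for IPOIB_NUM_WC */

/* Hypothetical stand-in for struct ib_wc. */
struct wc_stub {
	unsigned long wr_id;
	int status;
};

struct priv_sketch {
	/* Common, read-mostly fields come first. */
	void *dev;
	unsigned int mtu;

	/* TX state starts on its own cache line. */
	alignas(CACHELINE) unsigned int tx_head;
	unsigned int tx_tail;
	struct wc_stub send_wc[NUM_WC];

	/* RX state starts on a separate cache line. */
	alignas(CACHELINE) unsigned int rx_head;
	struct wc_stub recv_wc[NUM_WC];
};

/* Returns nonzero if two byte offsets fall on the same cache line. */
int same_cacheline(size_t a, size_t b)
{
	return a / CACHELINE == b / CACHELINE;
}

/* Compile-time checks that each group begins on a line boundary. */
static_assert(offsetof(struct priv_sketch, tx_head) % CACHELINE == 0,
	      "TX group must start on a cache line boundary");
static_assert(offsetof(struct priv_sketch, rx_head) % CACHELINE == 0,
	      "RX group must start on a cache line boundary");
```

With this arrangement the completion arrays stay embedded in the
structure, and the send and receive paths still touch different
lines, so a separate allocation would not be needed just to avoid
false sharing.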

 - R.


