[openib-general] [PATCH] splitting IPoIB CQ
Roland Dreier
rdreier at cisco.com
Mon Apr 17 13:12:38 PDT 2006
Shirley> Some tests have been done over mthca and
Shirley> ehca. The unidirectional stream test gains up to 15%
Shirley> throughput with this patch on systems with over 4 CPUs.
Shirley> Bidirectional could gain more. People might get different
Shirley> performance improvement numbers under different drivers
Shirley> and CPUs. I have attached the patch for those who are willing
Shirley> to run the performance test with different drivers. And
Shirley> please give your input.
Have you ever seen this hurt performance? It seems that splitting
the receive and send CQs will increase the number of events generated
and possibly use more CPU.
Actually, do you have some explanation for why this helps performance?
My intuition would be that it just generates more interrupts for the
same workload.
One specific question:
> - struct ib_wc ibwc[IPOIB_NUM_WC];
> + struct ib_wc *send_ibwc;
> + struct ib_wc *recv_ibwc;
Why are you changing these to be dynamically allocated outside of the
main structure? Is it to avoid false sharing of cachelines?
It might be better to sort the whole structure so that we have all the
common, read-mostly stuff first, then TX stuff (marked with
____cacheline_aligned_in_smp) and then RX stuff, also marked to be
cacheline aligned.
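To make the layout idea concrete, here is a rough userspace sketch, not the
actual ipoib_dev_priv. Everything named demo_* or DEMO_* is hypothetical, the
64-byte line size is an assumption, and C11 _Alignas stands in for
____cacheline_aligned_in_smp (which in the kernel only aligns on SMP builds).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHELINE 64 /* assumed line size; the kernel uses SMP_CACHE_BYTES */
#define DEMO_NUM_WC    4  /* hypothetical stand-in for IPOIB_NUM_WC */

/* Hypothetical stand-in for struct ib_wc. */
struct demo_wc {
	unsigned long wr_id;
	int           status;
};

struct demo_priv {
	/* Common, read-mostly fields first. */
	void         *dev;
	unsigned int  mtu;

	/* TX state starts on its own cache line. */
	_Alignas(DEMO_CACHELINE) unsigned int tx_head;
	unsigned int   tx_tail;
	struct demo_wc send_wc[DEMO_NUM_WC];

	/* RX state starts on a separate cache line. */
	_Alignas(DEMO_CACHELINE) unsigned int rx_head;
	struct demo_wc recv_wc[DEMO_NUM_WC];
};

/* Compile-time checks that the TX and RX groups begin on line boundaries. */
static_assert(offsetof(struct demo_priv, tx_head) % DEMO_CACHELINE == 0,
	      "TX group must start on a cache line boundary");
static_assert(offsetof(struct demo_priv, rx_head) % DEMO_CACHELINE == 0,
	      "RX group must start on a cache line boundary");

/* Small helpers so the layout can also be inspected at run time. */
size_t demo_tx_offset(void) { return offsetof(struct demo_priv, tx_head); }
size_t demo_rx_offset(void) { return offsetof(struct demo_priv, rx_head); }
size_t demo_priv_size(void) { return sizeof(struct demo_priv); }
```

With this arrangement the work completion arrays stay embedded in the main
structure, and the send and receive paths still never write to the same
cache line, so there is no need for a separate allocation just to avoid
false sharing.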
- R.