[ofw] [IPOIB_NDIS6_CM] enhance wc linking loop performance by removing array index calculations

Mon Aug 16 17:49:26 PDT 2010

Hefty, Sean wrote:
>> for( i = 0; i < MAX_SEND_WC; i++ )
>>      wc[i].p_next = &wc[i + 1];
>> wc[MAX_SEND_WC - 1].p_next = NULL;
>>
>> for( p_free=wc; p_free < &wc[MAX_SEND_WC - 1]; p_free++ )
>>      p_free->p_next = p_free + 1;
>> p_free->p_next = NULL;
>>
>> If the MS WDK compiler optimizations are really 'good', it might
>> optimize the loops to basically the same instruction sequence.  I do
>> not believe this to be the case.
>>
>> The slightly faster execution time is based on the observation that
>> the total number of instructions executed is reduced by skipping the
>> array index arithmetic by use of pointers.
>
> FWIW, out of curiosity, I ran a test on my desktop using the VS
> compiler to compare code that used the above loops.  I set
> MAX_SEND_WC to 100, then called each loop 10,000,000 times.  The
> results showed that the bottom loop was barely faster most of the
> time.  Sometimes the first loop ended up faster, but overall loop 2
> won.
>
> It wasn't a great test, since I had a bunch of junk running on my
> system, but it seemed good enough to me that this is worthwhile.
> Though I agree with Tzachi that we should eventually just move to
> using arrays.  This is already there in user space.
>
> - Sean

Interesting outcome.  For comparison, would you send me your code?  I'd like to verify similar results using the WDK compiler, with options set as they are in IPOIB.
I'm not yet convinced the VS compiler == WDK compiler; I tend to lean towards VS being the 'better' compiler.
Yes, I realize they have common roots; I would still like to see a comparison, and this looks to be an opportunity.
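
A rough sketch of such a comparison might look like the following.  It is not Sean's test code: wc_t is a stand-in for the real work-completion struct, check_chain() and time_loop() are hypothetical helpers, and clock() is only a coarse timer.  The two linking loops are taken from the quoted text above.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_SEND_WC 100             /* value used in the quoted test */

/* Stand-in for the real work-completion struct (assumption);
 * only the link field matters for this comparison. */
typedef struct wc {
    struct wc *p_next;
    int        payload;             /* hypothetical filler field */
} wc_t;

/* Loop 1: array indexing, as quoted.  The final pass stores
 * &wc[MAX_SEND_WC] (one past the end, never dereferenced), which the
 * following statement overwrites with NULL.  Note that this makes one
 * more pass than loop 2. */
static void link_by_index(wc_t *wc)
{
    int i;

    for (i = 0; i < MAX_SEND_WC; i++)
        wc[i].p_next = &wc[i + 1];
    wc[MAX_SEND_WC - 1].p_next = NULL;
}

/* Loop 2: pointer walk, as quoted. */
static void link_by_pointer(wc_t *wc)
{
    wc_t *p_free;

    for (p_free = wc; p_free < &wc[MAX_SEND_WC - 1]; p_free++)
        p_free->p_next = p_free + 1;
    p_free->p_next = NULL;
}

/* Confirm that a loop produced a NULL-terminated chain. */
static void check_chain(const wc_t *wc)
{
    int i;

    for (i = 0; i < MAX_SEND_WC - 1; i++)
        assert(wc[i].p_next == &wc[i + 1]);
    assert(wc[MAX_SEND_WC - 1].p_next == NULL);
}

/* Run one linking routine 'iterations' times; return elapsed CPU
 * seconds.  Calling through a function pointer makes it less likely,
 * though not impossible, that the optimizer discards the work. */
static double time_loop(void (*link)(wc_t *), long iterations)
{
    static wc_t wc[MAX_SEND_WC];
    clock_t start = clock();
    long n;

    for (n = 0; n < iterations; n++)
        link(wc);
    check_chain(wc);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_loop(link_by_index, 10000000) and time_loop(link_by_pointer, 10000000) several times each, on an otherwise idle machine, would approximate the quoted test.  Comparing the generated assembly for the two loops would be a more direct check of whether the compiler emits the same instruction sequence.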

Indeed, WC arrays are the 'real' answer.
Would the interface change be handled via a new ib_poll_cq() interface and/or a change to the WC struct?
Any proposals on the table?