[ofw] [IPOIB_NDIS6_CM] enhance wc linking loop performance by removing array index calculations
Smith, Stan
stan.smith at intel.com
Mon Aug 9 13:22:52 PDT 2010
Tzachi Dar wrote:
> Have you actually been able to measure a difference?
>
> I believe that your code is better, but I wonder if it really has an
> effect that we can measure.
In the following code sequence, the original array-indexed loop is shown first, followed by the version modified to use pointers (incorporating Sean's observations):

Original:

    for( i = 0; i < MAX_SEND_WC; i++ )
        wc[i].p_next = &wc[i + 1];
    wc[MAX_SEND_WC - 1].p_next = NULL;

Modified to use pointers:

    for( p_free=wc; p_free < &wc[MAX_SEND_WC - 1]; p_free++ )
        p_free->p_next = p_free + 1;
    p_free->p_next = NULL;
If the MS WDK compiler's optimizations were really that good, it might compile both loops down to essentially the same instruction sequence; I do not believe this to be the case.
The slightly faster execution time comes from reducing the total number of instructions executed: iterating with a pointer skips the per-iteration array-index arithmetic needed to form each element's address.
Since these loops live in the Tx & Rx speed paths, every little bit helps.
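For illustration, here is a minimal, self-contained sketch of the two linking styles. The wc_t struct and MAX_WC constant below are simplified stand-ins for ib_wc_t and MAX_SEND_WC/MAX_RECV_WC, not the actual driver definitions:

    #include <stddef.h>
    #include <stdio.h>

    #define MAX_WC 8                /* stand-in for MAX_SEND_WC / MAX_RECV_WC */

    typedef struct wc_s {
        struct wc_s *p_next;        /* simplified stand-in for ib_wc_t */
    } wc_t;

    int main(void)
    {
        wc_t    wc[MAX_WC];
        wc_t    *p_free;
        size_t  i;

        /* Array-indexed style: each iteration computes &wc[i] and &wc[i + 1]. */
        for( i = 0; i < MAX_WC; i++ )
            wc[i].p_next = &wc[i + 1];
        wc[MAX_WC - 1].p_next = NULL;

        /* Pointer style: the loop variable is already the element's address. */
        for( p_free = wc; p_free < &wc[MAX_WC - 1]; p_free++ )
            p_free->p_next = p_free + 1;
        p_free->p_next = NULL;      /* p_free ends on the last element */

        /* Walk the chain to confirm both passes build the same list. */
        for( p_free = wc; p_free != NULL; p_free = p_free->p_next )
            printf( "wc[%ld]\n", (long)(p_free - wc) );

        return 0;
    }

Either form yields the same list; the difference is only in how many address calculations the generated code performs per iteration.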
>
> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: Smith, Stan [mailto:stan.smith at intel.com]
>> Sent: Wednesday, August 04, 2010 8:43 PM
>> To: Tzachi Dar
>> Cc: ofw at lists.openfabrics.org
>> Subject: [IPOIB_NDIS6_CM] enhance wc linking loop performance by
>> removing array index calculations
>>
>>
>> Hello,
>>
>> While reading the IPOIB code I noticed a minor speed enhancement
>> possible in the CQ callback routines. When linking WC (work
>> completion) items into a list, replacing the array index
>> calculations with pointer arithmetic lets the loop execute slightly faster.
>>
>> Worth a commit?
>>
>> stan.
>>
>> --- A/ulp/ipoib_NDIS6_CM/kernel/ipoib_endpoint.cpp  Wed Aug 04 10:30:43 2010
>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_endpoint.cpp  Wed Aug 04 10:28:59 2010
>> @@ -888,9 +888,10 @@
>>      p_port->p_adapter->p_ifc->modify_qp( p_endpt->conn.h_send_qp, &mod_attr );
>>      p_port->p_adapter->p_ifc->modify_qp( p_endpt->conn.h_recv_qp, &mod_attr );
>>
>> -    for( i = 0; i < MAX_RECV_WC; i++ )
>> -        wc[i].p_next = &wc[i + 1];
>> -    wc[MAX_RECV_WC - 1].p_next = NULL;
>> +    for( p_free_wc=wc; p_free_wc < &wc[MAX_RECV_WC]; p_free_wc++ )
>> +        p_free_wc->p_next = p_free_wc + 1;
>> +
>> +    (--p_free_wc)->p_next = NULL;
>>
>>      do
>>      {
>>
>> --- A/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp  Wed Aug 04 10:29:33 2010
>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp  Wed Aug 04 10:28:31 2010
>> @@ -1987,7 +1987,6 @@
>>      ib_wc_t         wc[MAX_RECV_WC], *p_free, *p_wc;
>>      int32_t         NBL_cnt, recv_cnt = 0, shortage, discarded;
>>      cl_qlist_t      done_list, bad_list;
>> -    size_t          i;
>>      ULONG           recv_complete_flags = 0;
>>      BOOLEAN         res;
>>
>> @@ -2017,9 +2016,11 @@
>>      cl_qlist_init( &bad_list );
>>
>>      ipoib_port_ref( p_port, ref_recv_cb );
>> -    for( i = 0; i < MAX_RECV_WC; i++ )
>> -        wc[i].p_next = &wc[i + 1];
>> -    wc[MAX_RECV_WC - 1].p_next = NULL;
>> +
>> +    for( p_free=wc; p_free < &wc[MAX_RECV_WC]; p_free++ )
>> +        p_free->p_next = p_free + 1;
>> +
>> +    (--p_free)->p_next = NULL;
>>
>>      /*
>>       * We'll be accessing the endpoint map so take a reference
>> @@ -5769,7 +5770,6 @@
>>      cl_qlist_t      done_list;
>>      ipoib_endpt_t   *p_endpt;
>>      ip_stat_sel_t   type;
>> -    size_t          i;
>>      NET_BUFFER      *p_netbuffer = NULL;
>>      ipoib_send_NB_SG *s_buf;
>>
>> @@ -5798,9 +5798,10 @@
>>      //cl_qlist_check_validity(&p_port->send_mgr.pending_list);
>>      ipoib_port_ref( p_port, ref_send_cb );
>>
>> -    for( i = 0; i < MAX_SEND_WC; i++ )
>> -        wc[i].p_next = &wc[i + 1];
>> -    wc[MAX_SEND_WC - 1].p_next = NULL;
>> +    for( p_free=wc; p_free < &wc[MAX_SEND_WC]; p_free++ )
>> +        p_free->p_next = p_free + 1;
>> +
>> +    (--p_free)->p_next = NULL;
>>
>>      do
>>      {