[ofw] [PATCH] checked ipoib_ndis6_cm shutdown race

Smith, Stan stan.smith at intel.com
Mon Sep 27 12:25:18 PDT 2010


Alex Naslednikov wrote:
> Hello Stan,
> I liked the idea, but I suppose that one should use take here
> spinlock on p_port->obj to ensure that there's no race right after
>                  "if". The code should be like:
> } else {
>             cl_obj_lock( &p_port->obj );
>             if ( h_cq && p_port->state == IB_QPS_RTS ) { // increment reference to ensure
>                       no one release the object while work item is
>                       queued ipoib_port_ref( p_port, ref_recv_cb );
>                       IoQueueWorkItem( p_port->pPoWorkItem,
>                              __iopoib_WorkItem, DelayedWorkQueue, p_port);
>                   WorkToDo = FALSE;
>             } else {
>                 WorkToDo = TRUE;
>             }
>                 cl_obj_unlock( &p_port->obj );
>        }

True, although I hate to see a lock introduced into the speed path.
Can you think of a way to keep the lock out of the speed path?
Holding the lock over the call to IoQueueWorkItem() is not required; perhaps the following is good enough?

        cl_obj_lock( &p_port->obj );
        pstate = p_port->state;
        cl_obj_unlock( &p_port->obj );
        if ( h_cq && pstate == IB_QPS_RTS ) {
...


> In addition, I found another potentially unsafe place:
> Index:
> B:/users/xalex/MLNX_WinOF-2_1_2/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
> ===================================================================
> ---
> B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp    (revision
>                       6513) +++
>                       B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp      (working
>               copy) @@ -7002,8 +7002,10 @@ ipoib_set_inactive( p_port->p_adapter
> ); __endpt_mgr_reset_all( p_port ); }

I'm guessing you intended cl_obj_lock() here instead of unlock?

> +             cl_obj_unlock( &p_port->obj
>               ); ASSERT(p_port->state == IB_QPS_INIT || p_port->state ==
>               IB_QPS_ERROR); p_port->state = IB_QPS_ERROR;
> +             cl_obj_unlock( &p_port->obj );
>               KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE );
>       }
>
> -----Original Message-----
> From: Smith, Stan [mailto:stan.smith at intel.com]
> Sent: Wednesday, September 22, 2010 6:22 PM
> To: Alex Naslednikov
> Cc: ofw at lists.openfabrics.org; Uri Habusha
> Subject: RE: [PATCH] checked ipoib_ndis6_cm shutdown race
>
> Hello,
>   Any thoughts on this patch?
>
> Smith, Stan wrote:
>> Hello,
>>   During system shutdown I have witnessed a checked ipoib_ndis6_cm IO
>> work thread fail:
>>
>> 1) IO work thread is blocked from running due to scheduling
>> priorities beyond the point in time at which port_destroy() wants to
>> delete the port object [cl_obj_destroy( &p_port->obj 0)]. The port
>> object delete fails (ASSERT obj_ref > 0 fires) due to the outstanding
>> port references incurred by remaining posted recv buffers. The 1st
>> 128 WorkRequests have been pulled from the CQ by
>> __recv_cb_internal(), which then posts an IO work request to process
>> the remaining 384 recv work requests. The IO work request does not
>> run prior to port_detroy() being called.
>>
>> 2) The IO thread attempts to run but blows up (BSOD invalid memory
>> reference) as port structures required by the IO work thread have
>> been free()'ed.
>>
>> The fix is to recognize the port is not in the IB_QPS_RTS state, do
>> not schedule an IO work thread request and continue to pull recv work
>> requests from the CQ until empty.
>>
>> Code snippets:
>>       } else {
>>               if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>                       // increment reference to ensure no one
>>                       release the object while work iteam is queued
>>                       ipoib_port_ref( p_port, ref_recv_cb );
>>                       IoQueueWorkItem( p_port->pPoWorkItem,
>>                       __iopoib_WorkItem, DelayedWorkQueue, p_port);
>>               WorkToDo = FALSE;               } else { WorkToDo =
>>       TRUE; } }
>>
>> __recv_cb(
>>       IN              const   ib_cq_handle_t
>>       h_cq, IN                              void
>>       *cq_context ) { uint32_t        recv_cnt;
>>       boolean_t       WorkToDo;
>>
>>       do
>>       {
>>               WorkToDo = __recv_cb_internal(h_cq, cq_context,
>> &recv_cnt);       } while( WorkToDo ); }
>>
>>
>> --- A/ulp/ipoib_ndis6_cm/kernel/ipoib_port.cpp        Mon Sep 13
>> 15:58:08 2010 +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp   Mon
>> Sep 20 08:47:08 2010 @@ -2222,7 +2222,7 @@
>>               CL_ASSERT( status == IB_SUCCESS );
>>
>>       } else {
>> -             if (h_cq) {
>> +             if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>                       // increment reference to ensure no one
>>                       release the object while work iteam is queued
>>                       ipoib_port_ref( p_port, ref_recv_cb );
>> IoQueueWorkItem( p_port->pPoWorkItem, __iopoib_WorkItem,
>> DelayedWorkQueue, p_port); @@ -2244,9 +2244,13 @@
>>       IN              const   ib_cq_handle_t
>>       h_cq, IN                              void
>> *cq_context )  { -     uint32_t recv_cnt;
>> +     uint32_t        recv_cnt;
>> +     boolean_t       WorkToDo;
>>
>> -     __recv_cb_internal(h_cq, cq_context, &recv_cnt); +     do
>> +     {
>> +             WorkToDo = __recv_cb_internal(h_cq, cq_context,
>>  &recv_cnt); +     } while( WorkToDo ); }




More information about the ofw mailing list