[ofw] [PATCH] checked ipoib_ndis6_cm shutdown race
Smith, Stan
stan.smith at intel.com
Thu Sep 30 09:45:08 PDT 2010
Hi,
Please ignore previous comments about blocking dispatch threads.
However, the comments about the bad_list count detecting the error case are still valid.
stan.
Tzachi Dar wrote:
> It seems to me that although the problem that started this mail
> thread is real, the solution is not in the direction that the
> original writers of the code had in mind.
>
> The real problem here is that one can call ipoib_port_destroy(), which
> calls __port_destroying(), which in turn calls ipoib_port_down().
>
> In ipoib_port_down() we move the QP to the error state, but we do not
> wait for all buffers to return.
>
> In __ipoib_cleanup() we have the following loop, which waits for the
> sends and receives to finish:
>
> 	/* Wait for all sends and receives to get flushed. */
> 	while( p_port->send_mgr.depth || p_port->recv_mgr.depth )
> 		cl_thread_suspend( 0 );
>
> I believe this loop should be moved to right after the modify_qp call;
> that way the port is still fully alive while we are waiting for the
> buffers to be flushed with error.
>
> This will hopefully solve the problem without adding new locks to the
> code.
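>
> Roughly, the idea is the following (untested sketch; the modify-QP call
> and field names here are written from memory and may not match the
> actual ipoib_port_down() code exactly):
>
> 	ib_api_status_t		status;
> 	ib_qp_mod_t		qp_mod;
>
> 	cl_memclr( &qp_mod, sizeof(ib_qp_mod_t) );
> 	qp_mod.req_state = IB_QPS_ERROR;
>
> 	status = ib_modify_qp( p_port->ib_mgr.h_qp, &qp_mod );
> 	CL_ASSERT( status == IB_SUCCESS );
>
> 	/* Wait here, while the port object is still fully alive, for all
> 	 * sends and receives to be flushed back with error status. */
> 	while( p_port->send_mgr.depth || p_port->recv_mgr.depth )
> 		cl_thread_suspend( 0 );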
>
> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: ofw-bounces at lists.openfabrics.org
>> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
>> Sent: Monday, September 27, 2010 9:25 PM
>> To: Alex Naslednikov
>> Cc: ofw at lists.openfabrics.org
>> Subject: Re: [ofw] [PATCH] checked ipoib_ndis6_cm shutdown race
>>
>> Alex Naslednikov wrote:
>>> Hello Stan,
>>> I liked the idea, but I suppose one should take a spinlock on
>>> p_port->obj here to ensure that there is no race right after the
>>> "if". The code should look like:
>>> 	} else {
>>> 		cl_obj_lock( &p_port->obj );
>>> 		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>> 			// increment reference to ensure no one releases
>>> 			// the object while the work item is queued
>>> 			ipoib_port_ref( p_port, ref_recv_cb );
>>> 			IoQueueWorkItem( p_port->pPoWorkItem,
>>> 				__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>> 			WorkToDo = FALSE;
>>> 		} else {
>>> 			WorkToDo = TRUE;
>>> 		}
>>> 		cl_obj_unlock( &p_port->obj );
>>> 	}
>>
>> True, although I hate to see a lock introduced into the speed path.
>> Can you think of a way to keep the lock out of the speed path?
>> Holding the lock over the call to IoQueueWorkItem() is not required;
>> perhaps the following is good enough?
>>
>> cl_obj_lock( &p_port->obj );
>> pstate = p_port->state;
>> cl_obj_unlock( &p_port->obj );
>> if ( h_cq && pstate == IB_QPS_RTS ) {
>> ...
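>>
>> Spelled out, the whole branch would then be (sketch only, not
>> compiled; pstate mirrors whatever type p_port->state is declared as):
>>
>> 	ib_qp_state_t	pstate;
>>
>> 	cl_obj_lock( &p_port->obj );
>> 	pstate = p_port->state;
>> 	cl_obj_unlock( &p_port->obj );
>>
>> 	if ( h_cq && pstate == IB_QPS_RTS ) {
>> 		// increment reference to ensure no one releases
>> 		// the object while the work item is queued
>> 		ipoib_port_ref( p_port, ref_recv_cb );
>> 		IoQueueWorkItem( p_port->pPoWorkItem,
>> 			__iopoib_WorkItem, DelayedWorkQueue, p_port );
>> 		WorkToDo = FALSE;
>> 	} else {
>> 		WorkToDo = TRUE;
>> 	}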
>>
>>
>>> In addition, I found another potentially unsafe place:
>>> Index: B:/users/xalex/MLNX_WinOF-2_1_2/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
>>> ===================================================================
>>> --- B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(revision 6513)
>>> +++ B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(working copy)
>>> @@ -7002,8 +7002,10 @@
>>>  		ipoib_set_inactive( p_port->p_adapter );
>>>  		__endpt_mgr_reset_all( p_port );
>>>  	}
>>
>> I'm guessing you intended cl_obj_lock() here instead of unlock?
>>
>>> +	cl_obj_unlock( &p_port->obj );
>>>  	ASSERT( p_port->state == IB_QPS_INIT || p_port->state == IB_QPS_ERROR );
>>>  	p_port->state = IB_QPS_ERROR;
>>> +	cl_obj_unlock( &p_port->obj );
>>>  	KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE );
>>>  }
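>>
>> If that is the intent, I would expect the added lines to bracket the
>> state change, i.e. roughly (sketch only):
>>
>> 	cl_obj_lock( &p_port->obj );
>> 	ASSERT( p_port->state == IB_QPS_INIT || p_port->state == IB_QPS_ERROR );
>> 	p_port->state = IB_QPS_ERROR;
>> 	cl_obj_unlock( &p_port->obj );
>> 	KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE );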
>>>
>>> -----Original Message-----
>>> From: Smith, Stan [mailto:stan.smith at intel.com]
>>> Sent: Wednesday, September 22, 2010 6:22 PM
>>> To: Alex Naslednikov
>>> Cc: ofw at lists.openfabrics.org; Uri Habusha
>>> Subject: RE: [PATCH] checked ipoib_ndis6_cm shutdown race
>>>
>>> Hello,
>>> Any thoughts on this patch?
>>>
>>> Smith, Stan wrote:
>>>> Hello,
>>>> During system shutdown I have witnessed a checked ipoib_ndis6_cm
>>>> IO work thread fail:
>>>>
>>>> 1) The IO work thread is blocked from running, due to scheduling
>>>> priorities, beyond the point in time at which port_destroy() wants
>>>> to delete the port object [cl_obj_destroy( &p_port->obj )]. The
>>>> port object delete fails (the ASSERT obj_ref > 0 fires) because of
>>>> the outstanding port references held by the remaining posted recv
>>>> buffers. The first 128 work requests have been pulled from the CQ
>>>> by __recv_cb_internal(), which then posts an IO work request to
>>>> process the remaining 384 recv work requests. That IO work request
>>>> does not run before port_destroy() is called.
>>>>
>>>> 2) The IO thread eventually attempts to run but blows up (BSOD,
>>>> invalid memory reference) because the port structures it requires
>>>> have already been free()'ed.
>>>>
>>>> The fix is to recognize that the port is no longer in the
>>>> IB_QPS_RTS state, not schedule an IO work request, and instead
>>>> continue to pull recv work requests from the CQ until it is empty.
>>>>
>>>> Code snippets:
>>>> 	} else {
>>>> 		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>> 			// increment reference to ensure no one releases
>>>> 			// the object while the work item is queued
>>>> 			ipoib_port_ref( p_port, ref_recv_cb );
>>>> 			IoQueueWorkItem( p_port->pPoWorkItem,
>>>> 				__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> 			WorkToDo = FALSE;
>>>> 		} else {
>>>> 			WorkToDo = TRUE;
>>>> 		}
>>>> 	}
>>>>
>>>> __recv_cb(
>>>> 	IN	const	ib_cq_handle_t		h_cq,
>>>> 	IN			void			*cq_context )
>>>> {
>>>> 	uint32_t	recv_cnt;
>>>> 	boolean_t	WorkToDo;
>>>>
>>>> 	do
>>>> 	{
>>>> 		WorkToDo = __recv_cb_internal( h_cq, cq_context, &recv_cnt );
>>>> 	} while( WorkToDo );
>>>> }
>>>>
>>>>
>>>> --- A/ulp/ipoib_ndis6_cm/kernel/ipoib_port.cpp	Mon Sep 13 15:58:08 2010
>>>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	Mon Sep 20 08:47:08 2010
>>>> @@ -2222,7 +2222,7 @@
>>>>  		CL_ASSERT( status == IB_SUCCESS );
>>>>
>>>>  	} else {
>>>> -		if (h_cq) {
>>>> +		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>>  			// increment reference to ensure no one releases the object while the work item is queued
>>>>  			ipoib_port_ref( p_port, ref_recv_cb );
>>>>  			IoQueueWorkItem( p_port->pPoWorkItem, __iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> @@ -2244,9 +2244,13 @@
>>>>  	IN	const	ib_cq_handle_t		h_cq,
>>>>  	IN			void			*cq_context )
>>>>  {
>>>> -	uint32_t	recv_cnt;
>>>> +	uint32_t	recv_cnt;
>>>> +	boolean_t	WorkToDo;
>>>>
>>>> -	__recv_cb_internal(h_cq, cq_context, &recv_cnt);
>>>> +	do
>>>> +	{
>>>> +		WorkToDo = __recv_cb_internal( h_cq, cq_context, &recv_cnt );
>>>> +	} while( WorkToDo );
>>>>  }
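>>>>
>>>> For reference, the return-value contract the loop above relies on is
>>>> roughly the following (condensed, hypothetical sketch; the real
>>>> __recv_cb_internal() also recycles receive buffers, indicates packets
>>>> up to NDIS and rearms the CQ, and the helper name and batch constant
>>>> below are made up for illustration):
>>>>
>>>> 	static boolean_t
>>>> 	__recv_cb_internal(
>>>> 		IN	const	ib_cq_handle_t		h_cq,
>>>> 		IN			void			*cq_context,
>>>> 		OUT			uint32_t		*p_recv_cnt )
>>>> 	{
>>>> 		ipoib_port_t	*p_port = (ipoib_port_t*)cq_context;
>>>> 		boolean_t	WorkToDo = FALSE;
>>>>
>>>> 		/* poll at most one batch (128) of recv completions */
>>>> 		*p_recv_cnt = __poll_recv_batch( p_port, 128 );
>>>>
>>>> 		if ( *p_recv_cnt == 128 ) {
>>>> 			/* more completions may still be on the CQ */
>>>> 			if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>> 				/* normal path: hand the rest to a work item */
>>>> 				ipoib_port_ref( p_port, ref_recv_cb );
>>>> 				IoQueueWorkItem( p_port->pPoWorkItem,
>>>> 					__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> 			} else {
>>>> 				/* shutting down: tell the caller to keep polling */
>>>> 				WorkToDo = TRUE;
>>>> 			}
>>>> 		}
>>>> 		return WorkToDo;
>>>> 	}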
>>
>> _______________________________________________
>> ofw mailing list
>> ofw at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw