[ofw] [PATCH] checked ipoib_ndis6_cm shutdown race
Smith, Stan
stan.smith at intel.com
Thu Sep 30 09:45:08 PDT 2010
Hi,
Please ignore previous comments about blocking dispatch threads.
However, the comments about the bad_list count detecting the error case are still valid.
stan.
Tzachi Dar wrote:
> It seems to me that although the problem that started this mail
> thread is real, the solution is not in the direction that the
> original writers of the code had in mind.
>
> The real problem here is that one can call ipoib_port_destroy(), which
> calls __port_destroying(), which in turn calls ipoib_port_down().
>
> In ipoib_port_down() we move the QP to the error state, but we do not
> wait for all buffers to return.
>
> In __ipoib_cleanup() we have the following loop, which waits for the
> sends and receives to finish:
>
> 	/* Wait for all sends and receives to get flushed. */
> 	while( p_port->send_mgr.depth || p_port->recv_mgr.depth )
> 		cl_thread_suspend( 0 );
>
> I believe this loop should be moved to right after the modify_qp call;
> that way the port is still fully alive while we are waiting for the
> buffers to be flushed with error.
>
> This will hopefully solve the problem without adding new locks to the
> code.
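>
> Roughly, the idea is the following (untested sketch; the modify-QP call
> and field names here are written from memory and may not match the
> actual ipoib_port_down() code exactly):
>
> 	ib_api_status_t		status;
> 	ib_qp_mod_t		qp_mod;
>
> 	cl_memclr( &qp_mod, sizeof(ib_qp_mod_t) );
> 	qp_mod.req_state = IB_QPS_ERROR;
>
> 	status = ib_modify_qp( p_port->ib_mgr.h_qp, &qp_mod );
> 	CL_ASSERT( status == IB_SUCCESS );
>
> 	/* Wait here, while the port object is still fully alive, for all
> 	 * sends and receives to be flushed back with error status. */
> 	while( p_port->send_mgr.depth || p_port->recv_mgr.depth )
> 		cl_thread_suspend( 0 );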
>
> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: ofw-bounces at lists.openfabrics.org
>> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
>> Sent: Monday, September 27, 2010 9:25 PM
>> To: Alex Naslednikov
>> Cc: ofw at lists.openfabrics.org
>> Subject: Re: [ofw] [PATCH] checked ipoib_ndis6_cm shutdown race
>>
>> Alex Naslednikov wrote:
>>> Hello Stan,
>>> I liked the idea, but I suppose one should take a spinlock on
>>> p_port->obj here to ensure that there is no race right after the
>>> "if". The code should look like:
>>> 	} else {
>>> 		cl_obj_lock( &p_port->obj );
>>> 		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>> 			// increment reference to ensure no one releases
>>> 			// the object while the work item is queued
>>> 			ipoib_port_ref( p_port, ref_recv_cb );
>>> 			IoQueueWorkItem( p_port->pPoWorkItem,
>>> 				__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>> 			WorkToDo = FALSE;
>>> 		} else {
>>> 			WorkToDo = TRUE;
>>> 		}
>>> 		cl_obj_unlock( &p_port->obj );
>>> 	}
>>
>> True, although I hate to see a lock introduced into the speed path.
>> Can you think of a way to keep the lock out of the speed path?
>> Holding the lock over the call to IoQueueWorkItem() is not required;
>> perhaps the following is good enough?
>>
>> cl_obj_lock( &p_port->obj );
>> pstate = p_port->state;
>> cl_obj_unlock( &p_port->obj );
>> if ( h_cq && pstate == IB_QPS_RTS ) {
>> ...
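>>
>> Spelled out, the whole branch would then be (sketch only, not
>> compiled; pstate mirrors whatever type p_port->state is declared as):
>>
>> 	ib_qp_state_t	pstate;
>>
>> 	cl_obj_lock( &p_port->obj );
>> 	pstate = p_port->state;
>> 	cl_obj_unlock( &p_port->obj );
>>
>> 	if ( h_cq && pstate == IB_QPS_RTS ) {
>> 		// increment reference to ensure no one releases
>> 		// the object while the work item is queued
>> 		ipoib_port_ref( p_port, ref_recv_cb );
>> 		IoQueueWorkItem( p_port->pPoWorkItem,
>> 			__iopoib_WorkItem, DelayedWorkQueue, p_port );
>> 		WorkToDo = FALSE;
>> 	} else {
>> 		WorkToDo = TRUE;
>> 	}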
>>
>>
>>> In addition, I found another potentially unsafe place:
>>> Index: B:/users/xalex/MLNX_WinOF-2_1_2/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
>>> ===================================================================
>>> --- B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(revision 6513)
>>> +++ B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(working copy)
>>> @@ -7002,8 +7002,10 @@
>>>  		ipoib_set_inactive( p_port->p_adapter );
>>>  		__endpt_mgr_reset_all( p_port );
>>>  	}
>>
>> I'm guessing you intended cl_obj_lock() here instead of unlock?
>>
>>> +	cl_obj_unlock( &p_port->obj );
>>>  	ASSERT( p_port->state == IB_QPS_INIT || p_port->state == IB_QPS_ERROR );
>>>  	p_port->state = IB_QPS_ERROR;
>>> +	cl_obj_unlock( &p_port->obj );
>>>  	KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE );
>>>  }
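>>
>> If that is the intent, I would expect the added lines to bracket the
>> state change, i.e. roughly (sketch only):
>>
>> 	cl_obj_lock( &p_port->obj );
>> 	ASSERT( p_port->state == IB_QPS_INIT || p_port->state == IB_QPS_ERROR );
>> 	p_port->state = IB_QPS_ERROR;
>> 	cl_obj_unlock( &p_port->obj );
>> 	KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE );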
>>>
>>> -----Original Message-----
>>> From: Smith, Stan [mailto:stan.smith at intel.com]
>>> Sent: Wednesday, September 22, 2010 6:22 PM
>>> To: Alex Naslednikov
>>> Cc: ofw at lists.openfabrics.org; Uri Habusha
>>> Subject: RE: [PATCH] checked ipoib_ndis6_cm shutdown race
>>>
>>> Hello,
>>> Any thoughts on this patch?
>>>
>>> Smith, Stan wrote:
>>>> Hello,
>>>> During system shutdown I have witnessed a checked ipoib_ndis6_cm
>>>> IO work thread fail:
>>>>
>>>> 1) The IO work thread is blocked from running, due to scheduling
>>>> priorities, beyond the point in time at which port_destroy() wants
>>>> to delete the port object [cl_obj_destroy( &p_port->obj )]. The
>>>> port object delete fails (the ASSERT obj_ref > 0 fires) because of
>>>> the outstanding port references held by the remaining posted recv
>>>> buffers. The first 128 work requests have been pulled from the CQ
>>>> by __recv_cb_internal(), which then posts an IO work request to
>>>> process the remaining 384 recv work requests. That IO work request
>>>> does not run before port_destroy() is called.
>>>>
>>>> 2) The IO thread eventually attempts to run but blows up (BSOD,
>>>> invalid memory reference) because the port structures it requires
>>>> have already been free()'ed.
>>>>
>>>> The fix is to recognize that the port is no longer in the
>>>> IB_QPS_RTS state, not schedule an IO work request, and instead
>>>> continue to pull recv work requests from the CQ until it is empty.
>>>>
>>>> Code snippets:
>>>> 	} else {
>>>> 		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>> 			// increment reference to ensure no one releases
>>>> 			// the object while the work item is queued
>>>> 			ipoib_port_ref( p_port, ref_recv_cb );
>>>> 			IoQueueWorkItem( p_port->pPoWorkItem,
>>>> 				__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> 			WorkToDo = FALSE;
>>>> 		} else {
>>>> 			WorkToDo = TRUE;
>>>> 		}
>>>> 	}
>>>>
>>>> __recv_cb(
>>>> 	IN	const	ib_cq_handle_t		h_cq,
>>>> 	IN			void			*cq_context )
>>>> {
>>>> 	uint32_t	recv_cnt;
>>>> 	boolean_t	WorkToDo;
>>>>
>>>> 	do
>>>> 	{
>>>> 		WorkToDo = __recv_cb_internal( h_cq, cq_context, &recv_cnt );
>>>> 	} while( WorkToDo );
>>>> }
>>>>
>>>>
>>>> --- A/ulp/ipoib_ndis6_cm/kernel/ipoib_port.cpp	Mon Sep 13 15:58:08 2010
>>>> +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	Mon Sep 20 08:47:08 2010
>>>> @@ -2222,7 +2222,7 @@
>>>>  		CL_ASSERT( status == IB_SUCCESS );
>>>>
>>>>  	} else {
>>>> -		if (h_cq) {
>>>> +		if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>>  			// increment reference to ensure no one releases the object while the work item is queued
>>>>  			ipoib_port_ref( p_port, ref_recv_cb );
>>>>  			IoQueueWorkItem( p_port->pPoWorkItem, __iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> @@ -2244,9 +2244,13 @@
>>>>  	IN	const	ib_cq_handle_t		h_cq,
>>>>  	IN			void			*cq_context )
>>>>  {
>>>> -	uint32_t	recv_cnt;
>>>> +	uint32_t	recv_cnt;
>>>> +	boolean_t	WorkToDo;
>>>>
>>>> -	__recv_cb_internal(h_cq, cq_context, &recv_cnt);
>>>> +	do
>>>> +	{
>>>> +		WorkToDo = __recv_cb_internal( h_cq, cq_context, &recv_cnt );
>>>> +	} while( WorkToDo );
>>>>  }
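>>>>
>>>> For reference, the return-value contract the loop above relies on is
>>>> roughly the following (condensed, hypothetical sketch; the real
>>>> __recv_cb_internal() also recycles receive buffers, indicates packets
>>>> up to NDIS and rearms the CQ, and the helper name and batch constant
>>>> below are made up for illustration):
>>>>
>>>> 	static boolean_t
>>>> 	__recv_cb_internal(
>>>> 		IN	const	ib_cq_handle_t		h_cq,
>>>> 		IN			void			*cq_context,
>>>> 		OUT			uint32_t		*p_recv_cnt )
>>>> 	{
>>>> 		ipoib_port_t	*p_port = (ipoib_port_t*)cq_context;
>>>> 		boolean_t	WorkToDo = FALSE;
>>>>
>>>> 		/* poll at most one batch (128) of recv completions */
>>>> 		*p_recv_cnt = __poll_recv_batch( p_port, 128 );
>>>>
>>>> 		if ( *p_recv_cnt == 128 ) {
>>>> 			/* more completions may still be on the CQ */
>>>> 			if ( h_cq && p_port->state == IB_QPS_RTS ) {
>>>> 				/* normal path: hand the rest to a work item */
>>>> 				ipoib_port_ref( p_port, ref_recv_cb );
>>>> 				IoQueueWorkItem( p_port->pPoWorkItem,
>>>> 					__iopoib_WorkItem, DelayedWorkQueue, p_port );
>>>> 			} else {
>>>> 				/* shutting down: tell the caller to keep polling */
>>>> 				WorkToDo = TRUE;
>>>> 			}
>>>> 		}
>>>> 		return WorkToDo;
>>>> 	}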
>>
>> _______________________________________________
>> ofw mailing list
>> ofw at lists.openfabrics.org
>> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw