[ofw] [PATCH] checked ipoib_ndis6_cm shutdown race

Tzachi Dar tzachid at mellanox.co.il
Tue Sep 28 10:05:57 PDT 2010


It seems to me that although the problem that started this mail thread is real, the solution is not in the direction that the original writers of the code had in mind.

The real problem here is that one can call ipoib_port_destroy, which calls __port_destroying which calls ipoib_port_down.

On the code of ipoib_port_down we move the qp to error but we don't wait for all buffers to return.

On the code of __ipoib_cleanup() we have the following loop that waits for the receive/send to end:

	/* Wait for all sends and receives to get flushed. */
	while( p_port->send_mgr.depth || p_port->recv_mgr.depth )
		cl_thread_suspend( 0 );

I believe that this loop should be moved to right after the modify_qp and then we will know that the port is still alive while we are waiting for the buffers to be flushed with error.

This will hopefully solve the problem without adding new locks to the code.

Thanks
Tzachi

> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org [mailto:ofw-
> bounces at lists.openfabrics.org] On Behalf Of Smith, Stan
> Sent: Monday, September 27, 2010 9:25 PM
> To: Alex Naslednikov
> Cc: ofw at lists.openfabrics.org
> Subject: Re: [ofw] [PATCH] checked ipoib_ndis6_cm shutdown race
> 
> Alex Naslednikov wrote:
> > Hello Stan,
> > I liked the idea, but I suppose that one should use take here
> > spinlock on p_port->obj to ensure that there's no race right after
> >                  "if". The code should be like:
> > } else {
> >             cl_obj_lock( &p_port->obj );
> >             if ( h_cq && p_port->state == IB_QPS_RTS ) { // increment
> reference to ensure
> >                       no one release the object while work item is
> >                       queued ipoib_port_ref( p_port, ref_recv_cb );
> >                       IoQueueWorkItem( p_port->pPoWorkItem,
> >                              __iopoib_WorkItem, DelayedWorkQueue,
> p_port);
> >                   WorkToDo = FALSE;
> >             } else {
> >                 WorkToDo = TRUE;
> >             }
> >                 cl_obj_unlock( &p_port->obj );
> >        }
> 
> True, although I hate to see a lock introduced into the speed path.
> Can you think of a way to keep the lock out of the speed path?
> Holding the lock over the call to IoQueueWorkItem() is not required;
> perhaps the following is good enough?
> 
>         cl_obj_lock( &p_port->obj );
>         pstate = p_port->state;
>         cl_obj_unlock( &p_port->obj );
>         if ( h_cq && pstate == IB_QPS_RTS ) {
> ...
> 
> 
> > In addition, I found another potentially unsafe place:
> > Index:
> > B:/users/xalex/MLNX_WinOF-
> 2_1_2/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
> > ===================================================================
> > ---
> > B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
> (revision
> >                       6513) +++
> >
> B:/users/xalex/MLNX_WinOF/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
> (working
> >               copy) @@ -7002,8 +7002,10 @@ ipoib_set_inactive(
> p_port->p_adapter
> > ); __endpt_mgr_reset_all( p_port ); }
> 
> I'm guessing you intended cl_obj_lock() here instead of unlock?
> 
> > +             cl_obj_unlock( &p_port->obj
> >               ); ASSERT(p_port->state == IB_QPS_INIT || p_port->state
> ==
> >               IB_QPS_ERROR); p_port->state = IB_QPS_ERROR;
> > +             cl_obj_unlock( &p_port->obj );
> >               KeSetEvent( &p_port->sa_event, EVENT_INCREMENT, FALSE
> );
> >       }
> >
> > -----Original Message-----
> > From: Smith, Stan [mailto:stan.smith at intel.com]
> > Sent: Wednesday, September 22, 2010 6:22 PM
> > To: Alex Naslednikov
> > Cc: ofw at lists.openfabrics.org; Uri Habusha
> > Subject: RE: [PATCH] checked ipoib_ndis6_cm shutdown race
> >
> > Hello,
> >   Any thoughts on this patch?
> >
> > Smith, Stan wrote:
> >> Hello,
> >>   During system shutdown I have witnessed a checked ipoib_ndis6_cm
> IO
> >> work thread fail:
> >>
> >> 1) IO work thread is blocked from running due to scheduling
> >> priorities beyond the point in time at which port_destroy() wants to
> >> delete the port object [cl_obj_destroy( &p_port->obj 0)]. The port
> >> object delete fails (ASSERT obj_ref > 0 fires) due to the
> outstanding
> >> port references incurred by remaining posted recv buffers. The 1st
> >> 128 WorkRequests have been pulled from the CQ by
> >> __recv_cb_internal(), which then posts an IO work request to process
> >> the remaining 384 recv work requests. The IO work request does not
> >> run prior to port_detroy() being called.
> >>
> >> 2) The IO thread attempts to run but blows up (BSOD invalid memory
> >> reference) as port structures required by the IO work thread have
> >> been free()'ed.
> >>
> >> The fix is to recognize the port is not in the IB_QPS_RTS state, do
> >> not schedule an IO work thread request and continue to pull recv
> work
> >> requests from the CQ until empty.
> >>
> >> Code snippets:
> >>       } else {
> >>               if ( h_cq && p_port->state == IB_QPS_RTS ) {
> >>                       // increment reference to ensure no one
> >>                       release the object while work iteam is queued
> >>                       ipoib_port_ref( p_port, ref_recv_cb );
> >>                       IoQueueWorkItem( p_port->pPoWorkItem,
> >>                       __iopoib_WorkItem, DelayedWorkQueue, p_port);
> >>               WorkToDo = FALSE;               } else { WorkToDo =
> >>       TRUE; } }
> >>
> >> __recv_cb(
> >>       IN              const   ib_cq_handle_t
> >>       h_cq, IN                              void
> >>       *cq_context ) { uint32_t        recv_cnt;
> >>       boolean_t       WorkToDo;
> >>
> >>       do
> >>       {
> >>               WorkToDo = __recv_cb_internal(h_cq, cq_context,
> >> &recv_cnt);       } while( WorkToDo ); }
> >>
> >>
> >> --- A/ulp/ipoib_ndis6_cm/kernel/ipoib_port.cpp        Mon Sep 13
> >> 15:58:08 2010 +++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp   Mon
> >> Sep 20 08:47:08 2010 @@ -2222,7 +2222,7 @@
> >>               CL_ASSERT( status == IB_SUCCESS );
> >>
> >>       } else {
> >> -             if (h_cq) {
> >> +             if ( h_cq && p_port->state == IB_QPS_RTS ) {
> >>                       // increment reference to ensure no one
> >>                       release the object while work iteam is queued
> >>                       ipoib_port_ref( p_port, ref_recv_cb );
> >> IoQueueWorkItem( p_port->pPoWorkItem, __iopoib_WorkItem,
> >> DelayedWorkQueue, p_port); @@ -2244,9 +2244,13 @@
> >>       IN              const   ib_cq_handle_t
> >>       h_cq, IN                              void
> >> *cq_context )  { -     uint32_t recv_cnt;
> >> +     uint32_t        recv_cnt;
> >> +     boolean_t       WorkToDo;
> >>
> >> -     __recv_cb_internal(h_cq, cq_context, &recv_cnt); +     do
> >> +     {
> >> +             WorkToDo = __recv_cb_internal(h_cq, cq_context,
> >>  &recv_cnt); +     } while( WorkToDo ); }
> 
> _______________________________________________
> ofw mailing list
> ofw at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw



More information about the ofw mailing list