[ofw] RE: correctly handle IOU manager destruction without ASSERT firing.

Leonid Keller leonid at mellanox.co.il
Thu Apr 16 07:28:41 PDT 2009


OK with me. 

> -----Original Message-----
> From: Smith, Stan [mailto:stan.smith at intel.com] 
> Sent: Wednesday, April 15, 2009 2:47 AM
> To: Leonid Keller
> Cc: ofw at lists.openfabrics.org
> Subject: correctly handle IOU manager destruction without 
> ASSERT firing.
> 
> 
> Hello Leonid,
>   While testing your ibbus.sys ControlDeviceObject code, on 
> HCA disable attempts I kept hitting an al_common.c ASSERT() 
> which was triggered by calling deref_al_obj( 
> &p_ext->h_ca->obj ); @ line #423 in bus_iou_mgr.c:free_iou_mgr().
> The ASSERT() fired because the AL object p_ext->h_ca->obj was 
> in an unexpected state: it was already CL_DESTROYED with 
> ref_cnt == 0, i.e. it appeared to have been destroyed already.
> 
> The issue was traced back to iou_mgr_iou_remove(), where the 
> p_ext->h_ca field was never set to NULL after the last IOU 
> PDO removal.  A non-NULL h_ca field allowed free_iou_mgr() to 
> attempt an extra deref_al_obj() on an object which had 
> already been destroyed by virtue of its ref_cnt going to 
> zero from the deref_al_obj() call in iou_mgr_iou_remove().
> 
> Fix - NULL out the h_ca field when removing the last IOU PDO.
> Additionally, extraneous debug code is removed.
> 
> If you approve, I will svn commit.
> 
> Thank you,
> 
> Stan.
> 
> 
> Signed off by stan.smith at intel.com
> 
> diff U3 C:/Documents and Settings/scsmith/Local 
> Settings/Temp/bus_iou_mgr.c-revBASE.svn000.tmp.c C:/Documents 
> and Settings/scsmith/My 
> Documents/openIB-windows/SVN/gen1/trunk/core/bus/kernel/bus_iou_mgr.c
> --- bus_iou_mgr.c-revBASE.svn000.tmp.c  Tue Apr 14 16:22:16 2009
> +++ core/bus/kernel/bus_iou_mgr.c       Tue Apr 14 16:02:11 2009
> @@ -426,11 +426,6 @@
>                                         p_bfi->whoami, p_ext->cl_ext.vfptr_pnp_po->identity,
>                                         p_ext->cl_ext.p_self_do, p_ext ) );
> 
> -               BUS_TRACE( BUS_DBG_PNP,("%s p_ext->h_ca->obj.state %d ref_cnt %d\n",
> -                                       p_bfi->whoami,
> -                                       p_ext->h_ca->obj.state,
> -                                       p_ext->h_ca->obj.ref_cnt));
> -
>                 IoDeleteDevice( p_ext->cl_ext.p_self_do );
>         }
> 
> @@ -868,7 +863,8 @@
>                         p_ext->cl_ext.vfptr_pnp_po->identity, p_ext->cl_ext.p_self_do,
>                         p_ext, p_ext->b_present,
>                         p_ext->b_reported_missing, p_ext->b_hibernating ) );
> -               goto hca_deref;
> +               deref_al_obj( &p_ext->h_ca->obj );
> +               goto xit;
>         }
> 
>         p_ext->b_present = FALSE;
> @@ -890,9 +886,10 @@
>         /* free PNP context */
>         cl_free( p_ctx );
>         p_pnp_rec->pnp_rec.context = NULL;
> -
> -hca_deref:
>         deref_al_obj( &p_ext->h_ca->obj );
> +       p_ext->h_ca = NULL;     // for free_iou_mgr()
> +
> +xit:
>         cl_mutex_release( &gp_iou_mgr->pdo_mutex );
> 
>         BUS_EXIT( BUS_DBG_PNP );
> 
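
The double-deref described in the quoted message can be illustrated with a small stand-alone sketch. This is not the ibbus.sys code: every sketch_* name below is hypothetical, the reference-counting model is simplified, and the failure is reported through a return value rather than an ASSERT(). It also assumes, as the description implies, that the teardown path skips the deref when the cached handle is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AL object and the bus extension. */
typedef enum { SKETCH_OPERATIONAL, SKETCH_DESTROYED } sketch_state_t;

typedef struct {
    sketch_state_t state;
    int ref_cnt;
} sketch_obj_t;

typedef struct {
    sketch_obj_t *h_ca; /* cached handle; NULL once our reference is gone */
} sketch_ext_t;

/* Drop one reference. Returns 0 on success, or -1 if the object was
 * already destroyed (the condition the real ASSERT() is said to catch). */
static int sketch_deref(sketch_obj_t *obj)
{
    assert(obj != NULL);
    if (obj->state == SKETCH_DESTROYED || obj->ref_cnt <= 0)
        return -1;
    if (--obj->ref_cnt == 0)
        obj->state = SKETCH_DESTROYED;
    return 0;
}

/* Last-PDO removal: release the reference and, if requested,
 * clear the cached handle so teardown will not touch it again. */
static int sketch_remove_last(sketch_ext_t *ext, int clear_handle)
{
    int rc = sketch_deref(ext->h_ca);
    if (clear_handle)
        ext->h_ca = NULL;
    return rc;
}

/* Teardown: deref only if a handle is still cached (assumed NULL check). */
static int sketch_free_mgr(sketch_ext_t *ext)
{
    if (ext->h_ca == NULL)
        return 0;
    return sketch_deref(ext->h_ca);
}

/* Run removal followed by teardown; returns the teardown result,
 * or -2 if the removal step itself failed. */
static int sketch_run(int clear_handle)
{
    sketch_obj_t obj = { SKETCH_OPERATIONAL, 1 };
    sketch_ext_t ext = { &obj };

    if (sketch_remove_last(&ext, clear_handle) != 0)
        return -2;
    return sketch_free_mgr(&ext);
}
```

Under these assumptions, sketch_run(0) models the behaviour before the patch (teardown derefs an already-destroyed object and gets -1), while sketch_run(1) models the patched behaviour (the handle is NULL, so teardown does nothing and returns 0).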


