[ofw][patch][IBBUS] crash on re-initialization of remove lock

Leonid Keller leonid at mellanox.co.il
Wed Sep 2 08:56:59 PDT 2009


Thank you. Go ahead.


________________________________

	From: Stan C. Smith [mailto:stan.smith at intel.com] 
	Sent: Wednesday, September 02, 2009 6:50 PM
	To: Leonid Keller; Fab Tillier; Hefty, Sean
	Cc: ofw_list
	Subject: RE: [ofw][patch][IBBUS] crash on re-initialization of
remove lock
	
	
	Leo,
	  I've done some testing of your suggested patch with WHQL
enable/disable tests + standard WinOF load/unload and seen no problems.
	Enclosed are the patches which I suggested to you in previous
email - specifically remove the empty if block.
	On your approval I will commit to branch and trunk.
	 
	stan.
	 
	--- C:/Documents and Settings/scsmith/Local
Settings/Temp/bus_pnp.c-revBASE.svn000.tmp.c Wed Sep 02 08:44:56 2009
	+++ C:/Documents and Settings/scsmith/My
Documents/openIB-windows/SVN/gen1/branches/WOF2-1/core/bus/kernel/bus_pn
p.c Fri Aug 28 16:35:21 2009
	@@ -216,7 +216,6 @@
	    RtlZeroMemory(p_ext, sizeof *p_ext);
	    cl_init_pnp_po_ext( g_ControlDeviceObject, NULL, 
	         NULL, bus_globals.dbg_lvl, NULL, NULL );
	-   IoInitializeRemoveLock( &p_ext->cl_ext.stop_lock, 'dtci', 0,
1000 );
	 
	    /* enable user-mode access to IB stack */
	    BUS_PRINT( BUS_DBG_PNP, ("Remove-n-reCreate dos_name
symlink\n") );
	
	 
	--- C:/Documents and Settings/scsmith/Local
Settings/Temp/cl_pnp_po.c-revBASE.svn000.tmp.c Wed Sep 02 08:43:28 2009
	+++ C:/Documents and Settings/scsmith/My
Documents/openIB-windows/SVN/gen1/branches/WOF2-1/core/complib/kernel/cl
_pnp_po.c Fri Aug 28 16:36:18 2009
	@@ -136,7 +136,9 @@
	 
	  /* Store the pointer to our own device. */
	  p_ext->p_self_do = p_dev_obj;
	+
	  IoInitializeRemoveLock( &p_ext->remove_lock, 'bilc', 15, 1000
);
	+ IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	 
	  /* Initialize the PnP states. */
	  p_ext->pnp_state = NotStarted;
	@@ -428,15 +430,6 @@
	  if( NT_SUCCESS( status ) )
	   cl_set_pnp_state( p_ext, Started );
	 
	- /*
	-  * If we get the start request when we're already started,
don't 
	-  * re-initialize the stop lock.
	-  */
	- if( p_ext->last_pnp_state != Started ) {
	-  CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl,
("IoInitializeRemoveLock: stop_lock %p[\n", &p_ext->stop_lock));
	-  IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	- }
	-
	  CL_EXIT( CL_DBG_PNP, p_ext->dbg_lvl );
	  return status;
	 }
	@@ -553,33 +546,6 @@
	   CL_TRACE_EXIT( CL_DBG_PNP, p_ext->dbg_lvl,
	    ("IRP_MN_CANCEL_STOP_DEVICE received in invalid state.\n")
);
	   return status;
	- }
	-
	- if( p_ext->last_pnp_state == Started )
	- {
	-  /*
	-   * Re-initialize the stop lock before rolling back the PnP
	-   * state so that there's no contention while it's
uninitialized.
	-   */
	-  CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl,
("IoInitializeRemoveLock: stop_lock %p[\n", &p_ext->stop_lock));
	-  IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	-#if 0  
	-  // leo: it seems like a bug, because it can never get
released
	-  {
	-  /* 
	-   * Acquire the stop lock to allow releasing and waiting when
stopping.
	-   */
	-   NTSTATUS   status1;
	-   CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl, ("IoAcquireRemoveLock:
stop_lock %p[\n", &p_ext->stop_lock));
	-   status1 = IoAcquireRemoveLock( &p_ext->stop_lock, NULL );
	-   if( !NT_SUCCESS( status1 ) )
	-   {
	-    CL_TRACE( CL_DBG_ERROR, p_ext->dbg_lvl, 
	-     ("IoAcquireRemoveLock returned %08x. Continue anyway
...\n", status) );
	-   }
	-   CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl, ("IoAcquireRemoveLock:
stop_lock ]\n"));
	-  }
	-#endif  
	  }
	 
	  /* Return to the previous PnP state. */
	
	 

________________________________

	From: Leonid Keller [mailto:leonid at mellanox.co.il] 
	Sent: Monday, August 17, 2009 5:29 AM
	To: Fab Tillier; Smith, Stan; Hefty, Sean
	Cc: ofw_list
	Subject: FW: [ofw][patch][IBBUS] crash on re-initialization of
remove lock
	
	
	I think this patch should go also into the branch.
	I'll be on vacation 08/19-26, so i'll appreciate a lot if you
could look it through and opine.
	Also if someone could check it, running WHQL tests ...
	 
	There are two Remove Locks in IBBUS code. They are called
'remove_lock' and 'stop_lock'.
	The 'stop_lock' get initialized twice, on device' start and
stop.
	The 'remove_lock' is initialized only once on the start up.
	In my patch I did the same with 'stop_lock'.
	But honestly, I don't quite understand why the author of the
code didn't do this in the first place ...

________________________________

	From: ofw-bounces at lists.openfabrics.org
[mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Leonid Keller
	Sent: Sunday, August 16, 2009 7:46 PM
	To: ofw at lists.openfabrics.org
	Subject: [ofw][patch][IBBUS] crash on re-initialization of
remove lock
	
	
	I was reported a crash upon running "System Common Scenario"
WHQL test with our stack.
	The crash: C4 (0xd7), which means Driver Verifier revealed a
re-initializing of Remove Lock.
	 
	fffff880`01fa73a8 fffff800`0191b3dc : nt!KeBugCheckEx
	fffff880`01fa73b0 fffff800`0192d2e6 : nt!NtShutdownSystem+0x7f0c
	fffff880`01fa73f0 fffff880`02a06974 :
nt!NtShutdownSystem+0x19e16
	fffff880`01fa7450 fffff800`01937c16 : ibbus!cl_pnp+0x1d8
[f:\ribel\mlnx_winof_2.0.5\4335\branches\mlnx_winof_2-0\core\complib\ker
nel\cl_pnp_po.c @ 211]
	fffff880`01fa74b0 fffff800`0193a52a :
nt!NtShutdownSystem+0x24746
	fffff880`01fa7510 fffff800`01937c16 :
nt!NtShutdownSystem+0x2705a
	
	 
	It happens on win2k8 R2.
	One can see that IBAL PnP code, written in 2002 (?) contains in
fact re-initialization of stop Remove  Lock.
	Maybe sometimes it was possible and was not checked by Driver
Verifier or we just haven't come accross it...
	 
	Here is a patch that fixes the problem by eliminating
re-initialization of the stop lock.
	 
	 
	Index: core/bus/kernel/bus_pnp.c
	
===================================================================
	--- core/bus/kernel/bus_pnp.c (revision 4659)
	+++ core/bus/kernel/bus_pnp.c (working copy)
	@@ -216,7 +216,6 @@
	    RtlZeroMemory(p_ext, sizeof *p_ext);
	    cl_init_pnp_po_ext( g_ControlDeviceObject, NULL, 
	         NULL, bus_globals.dbg_lvl, NULL, NULL );
	-   IoInitializeRemoveLock( &p_ext->cl_ext.stop_lock, 'dtci', 0,
1000 );
	 
	    /* enable user-mode access to IB stack */
	    BUS_PRINT( BUS_DBG_PNP, ("Remove-n-reCreate dos_name
symlink\n") );
	Index: core/complib/kernel/cl_pnp_po.c
	
===================================================================
	--- core/complib/kernel/cl_pnp_po.c (revision 4659)
	+++ core/complib/kernel/cl_pnp_po.c (working copy)
	@@ -137,7 +137,8 @@
	  /* Store the pointer to our own device. */
	  p_ext->p_self_do = p_dev_obj;
	  IoInitializeRemoveLock( &p_ext->remove_lock, 'bilc', 15, 1000
);
	-
	+ IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	+ 
	  /* Initialize the PnP states. */
	  p_ext->pnp_state = NotStarted;
	  p_ext->last_pnp_state = NotStarted;
	@@ -428,15 +429,6 @@
	  if( NT_SUCCESS( status ) )
	   cl_set_pnp_state( p_ext, Started );
	 
	- /*
	-  * If we get the start request when we're already started,
don't 
	-  * re-initialize the stop lock.
	-  */
	- if( p_ext->last_pnp_state != Started ) {
	-  CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl,
("IoInitializeRemoveLock: stop_lock %p[\n", &p_ext->stop_lock));
	-  IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	- }
	-
	  CL_EXIT( CL_DBG_PNP, p_ext->dbg_lvl );
	  return status;
	 }
	@@ -557,12 +549,6 @@
	 
	  if( p_ext->last_pnp_state == Started )
	  {
	-  /*
	-   * Re-initialize the stop lock before rolling back the PnP
	-   * state so that there's no contention while it's
uninitialized.
	-   */
	-  CL_TRACE( CL_DBG_PNP, p_ext->dbg_lvl,
("IoInitializeRemoveLock: stop_lock %p[\n", &p_ext->stop_lock));
	-  IoInitializeRemoveLock( &p_ext->stop_lock, 'dtci', 0, 1000 );
	 #if 0  
	   // leo: it seems like a bug, because it can never get
released
	   {
	
	 
	 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090902/2d6e54b6/attachment.html>


More information about the ofw mailing list