[ofw] RE: OFED 1.3/WinOF 1.1/Win2k3R2X64 BSOD

Leonid Keller leonid at mellanox.co.il
Mon Jun 30 07:18:14 PDT 2008


Thanks, but I meant to ask whether this crash looks like the one
you fixed in revision 1223?


________________________________

	From: Eleanor Witiak [mailto:eleanor.witiak at qlogic.com] 
	Sent: Monday, June 30, 2008 4:40 PM
	To: Leonid Keller; AndInc at aol.com; sean.hefty at intel.com;
ofw at lists.openfabrics.org
	Subject: RE: OFED 1.3/WinOF 1.1/Win2k3R2X64 BSOD
	
	

	Yes, the patch did come after the 1.1 release.  The patch
revision # is 1223; the affected files are srp_connection.c and
srp_session.c.

	 

	Eleanor

	 

	
________________________________


	From: Leonid Keller [mailto:leonid at mellanox.co.il] 
	Sent: Monday, June 30, 2008 4:34 AM
	To: AndInc at aol.com; sean.hefty at intel.com;
ofw at lists.openfabrics.org; Eleanor Witiak
	Subject: RE: OFED 1.3/WinOF 1.1/Win2k3R2X64 BSOD

	 

	a) don't know;

	b) may be caused by a);

	c) may be caused by b).

	 

	A very important patch from Eleanor (WinOF revision 1223), which
	prevents a BSOD upon sudden srpt disconnection, came in after the
	release was closed.

	Eleanor, could you check whether that is the case here?

	 

	Here is some more information, based on the sent minidumps:

	 

	1: kd> !analyze -v

	BAD_POOL_CALLER (c2)
	The current thread is making a bad pool request.  Typically this is at a bad IRQL level or double freeing the same allocation, etc.
	Arguments:
	Arg1: 0000000000000007, Attempt to free pool which was already freed
	Arg2: 000000000000121a, (reserved)
	Arg3: 00000000012b0011, Memory contents of the pool block
	Arg4: fffffadf99483c50, Address of the block of pool being deallocated

	 

	Debugging Details:
	------------------

	 

	
	POOL_ADDRESS:  fffffadf99483c50 

	 

	FREED_POOL_TAG:  priv

	 

	BUGCHECK_STR:  0xc2_7_priv

	 

	CUSTOMER_CRASH_COUNT:  1

	 

	DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

	 

	PROCESS_NAME:  System

	 

	CURRENT_IRQL:  0

	 

	LAST_CONTROL_TRANSFER:  from fffff800011aa769 to fffff8000102e950

	 

	STACK_TEXT:  
	fffffadf`90d7bbc8 fffff800`011aa769 : 00000000`000000c2 00000000`00000007 00000000`0000121a 00000000`012b0011 : nt!KeBugCheckEx
	fffffadf`90d7bbd0 fffffadf`8f554621 : fffffadf`99483c50 00000000`00000080 fffffadf`99483c50 00000000`00000080 : nt!ExFreePoolWithTag+0x401
	fffffadf`90d7bc90 fffffadf`8f51f568 : fffffadf`9c813c00 fffffadf`9bddd3e8 fffffadf`99483c78 fffffadf`9bddd3c8 : ibbus!async_destroy_cb+0x171 [d:\openib-windows-svn\1177\gen1\trunk\core\al\al_common.c @ 686]
	fffffadf`90d7bce0 fffffadf`8f521a1d : fffffadf`9c8764e0 fffffadf`9bddd2b0 fffffadf`9bed0040 fffff800`011b5500 : ibbus!__cl_async_proc_worker+0x98 [d:\openib-windows-svn\1177\gen1\trunk\core\complib\cl_async_proc.c @ 153]
	fffffadf`90d7bd10 fffffadf`8f522108 : 00000000`00000000 fffffadf`9c8764e0 fffffadf`9c8764e0 fffff800`011b5500 : ibbus!__cl_thread_pool_routine+0x4d [d:\openib-windows-svn\1177\gen1\trunk\core\complib\cl_threadpool.c @ 66]
	fffffadf`90d7bd40 fffff800`0124b972 : 00000000`00000000 fffffadf`9beaf040 fffffadf`9beaf040 fffffadf`9c168bf0 : ibbus!__thread_callback+0x28 [d:\openib-windows-svn\1177\gen1\trunk\core\complib\kernel\cl_thread.c @ 49]
	fffffadf`90d7bd70 fffff800`010202d6 : fffff800`011b1180 fffffadf`9bed0040 fffff800`011b5500 fffffadf`9c8b81c0 : nt!PspSystemThreadStartup+0x3e
	fffffadf`90d7bdd0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxStartSystemThread+0x16

	 

	FOLLOWUP_IP: 
	ibbus!async_destroy_cb+171 [d:\openib-windows-svn\1177\gen1\trunk\core\al\al_common.c @ 686]

	SYMBOL_STACK_INDEX:  2

	 

	SYMBOL_NAME:  ibbus!async_destroy_cb+171

	 

		
________________________________


		From: AndInc at aol.com [mailto:AndInc at aol.com] 
		Sent: Friday, June 27, 2008 2:14 AM
		To: sean.hefty at intel.com; Leonid Keller;
ofw at lists.openfabrics.org
		Subject: OFED 1.3/WinOF 1.1/Win2k3R2X64 BSOD

		A simple sequential/random IOMeter script of small-block
		writes produces a BSOD in this environment. The trace is below;
		the failure is very repeatable, and the trace shows two similar
		failures. Any clues about what's causing (a) the error, (b) the
		disconnect, and (c) the BSOD?

		 

		Thanks,

		 

		Mike Anderson

		 

		[15513.043769] local QP operation err (QPN 0c004a, WQE index 39b8, vendor syndrome 6f, opcode = 5e)
		[15513.043777] CQE contents 000c004a 00000000 00000000 00000000 00000000 00000000 39b86f02 0000005e
		[15513.043779] ib_srpt: failed send status= 2
		[15513.043783] ib_srpt: failed send status= 5
		[15513.043786] ib_srpt: failed send status= 5
		[15513.043801] ib_srpt: failed send status= 5
		[15513.043851] ib_srpt: failed send status= 5
		[15513.043855] ib_srpt: failed send status= 5
		[15513.043857] ib_srpt: failed send status= 5
		[15513.043860] ib_srpt: failed send status= 5
		[15513.043873] ib_srpt: QP event 16 on cm_id= ffff8100ba389800 sess_name= 0x0002c9030000a50c0002c9030000a3ec state= 1
		[15513.043877] ib_srpt: Schedule CM_DISCONNECT_WORK
		[15513.043967] ib_srpt: srpt_cm_drep_recv[1636] cm_id= ffff8100ba389800
		[15513.044220] ib_srpt: srpt_release_channel: Release sess= ffff8101c27d3cf0 sess_name= 0x0002c9030000a50c0002c9030000a3ec active_cmd= 7
		[15513.044223] [6160]: scst_unregister_session:4639:Unregistering session ffff8101c27d3cf0 (wait 0)
		[15739.551108] ib_srpt: ASYNC event= 10 on device= mlx4_0
		[15831.623484] ib_srpt: ASYNC event= 17 on device= mlx4_0
		[15831.624195] ib_srpt: ASYNC event= 11 on device= mlx4_0
		[15831.624400] ib_srpt: ASYNC event= 11 on device= mlx4_0
		[15831.636997] ib_srpt: ASYNC event= 9 on device= mlx4_0
		[15833.127349] ib_srpt: Host login i_port_id=0x2c9030000a50c:0x2c9030000a3ec t_port_id=0x2c9030000a50c:0x2c9030000a50c it_iu_len=996
		[15833.128607] ib_srpt: srpt_create_ch_ib[1228] max_cqe= 4095 max_sge= 29 cm_id= ffff8101b38b0a00
		[15833.128927] [6823]: scst: scst_init_session:4509:Using security group "Default" for initiator "0x0002c9030000a50c0002c9030000a3ec"
		[15833.128938] [6823]: scst_init_session:4512:Assigning session ffff810100467c30 to acg Default
		[15833.128951] [6823]: scst_alloc_add_tgt_dev:405:host=9, channel=0, id=0, lun=0, SCST lun=0
		[15833.128958] [6823]: scst_alloc_set_UA:2486:Adding new UA to tgt_dev ffff8101c953de60
		[15833.128980] ib_srpt: Establish connection sess= ffff810100467c30 name= 0x0002c9030000a50c0002c9030000a3ec cm_id= ffff8101b38b0a00
		[15833.132787] [6818]: scst: scst_set_pending_UA:2420:Setting pending UA cmd ffff810100ba66d0
		[15841.612022] ib_srpt: ASYNC event= 11 on device= mlx4_0
		[16046.074918] igb: eth1: igb_watchdog_task: NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
		[16056.648672] eth1: no IPv6 routers present
		[17209.196025] local QP operation err (QPN 0e004a, WQE index 3d40, vendor syndrome 6f, opcode = 5e)
		[17209.196032] CQE contents 000e004a 00000000 00000000 00000000 00000000 00000000 3d406f02 000000de
		[17209.196033] ib_srpt: failed send status= 2
		[17209.196037] ib_srpt: failed send status= 5
		[17209.196040] ib_srpt: failed send status= 5
		[17209.196044] ib_srpt: failed send status= 5
		[17209.196069] ib_srpt: QP event 16 on cm_id= ffff8101b38b0a00 sess_name= 0x0002c9030000a50c0002c9030000a3ec state= 1
		[17209.196074] ib_srpt: Schedule CM_DISCONNECT_WORK
		[17209.196078] ib_srpt: srpt_xmit_response[1960] tag= 10296991 channel in bad state 2
		[17209.196083] ib_srpt: failed send status= 5
		[17209.196089] [6820]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196099] ib_srpt: srpt_xmit_response[1960] tag= 10296992 channel in bad state 2
		[17209.196104] [6819]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196157] ib_srpt: srpt_xmit_response[1960] tag= 10296993 channel in bad state 2
		[17209.196160] [6817]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196173] ib_srpt: srpt_cm_drep_recv[1636] cm_id= ffff8101b38b0a00
		[17209.196179] ib_srpt: srpt_xmit_response[1960] tag= 10296994 channel in bad state 2
		[17209.196182] [6814]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196265] ib_srpt: srpt_xmit_response[1960] tag= 10296995 channel in bad state 2
		[17209.196269] [6818]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196277] ib_srpt: srpt_xmit_response[1960] tag= 10296996 channel in bad state 2
		[17209.196278] [6818]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.196308] ib_srpt: srpt_xmit_response[1960] tag= 10296997 channel in bad state 2
		[17209.196309] [6815]: scst: scst_xmit_response:2590:***ERROR*** Target driver ib_srpt xmit_response() returned fatal error
		[17209.197269] ib_srpt: srpt_release_channel: Release sess= ffff810100467c30 sess_name= 0x0002c9030000a50c0002c9030000a3ec active_cmd= 3
		[17209.197272] [6158]: scst_unregister_session:4639:Unregistering session ffff810100467c30 (wait 0)
		linux-gen24:~ #  

		
		
		

		
