[ofw] winof 2.0.2: crash in ibbus.sys when running whql tests on mlx4 hca

James Yang jyang at xsigo.com
Thu Feb 19 09:01:17 PST 2009


I have seen a similar problem before, and I believe it is caused by the IB code having many functions defined as pageable. It is possible that the disk controller has already been turned off, in which case paged-out code can no longer be brought back in and executed. This is much more likely to happen with SAN boot over IB.
 
I suggested finding this pageable code and making it non-pageable in another email thread, "Support boot device in IB stack?". Maybe we should just make all code non-pageable; for a server, code size may matter less than stability.
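
To make the pageable versus non-pageable distinction concrete, here is a minimal sketch, not taken from the WinOF sources. In a WDK driver a function is normally made pageable with `#pragma alloc_text(PAGE, ...)` and guarded with `PAGED_CODE()`, which in checked builds asserts that the IRQL is at or below APC_LEVEL. The function names, `model_irql`, and the stand-in `PAGED_CODE()` definition below are all hypothetical, added only so the sketch also builds outside the WDK.

```c
#include <assert.h>

/* Stand-ins so this sketch builds in user mode without the WDK.
 * ASSUMPTION: model_irql is a hypothetical substitute for KeGetCurrentIrql(). */
#ifndef PAGED_CODE
#define APC_LEVEL 1
static int model_irql = 0;                        /* 0 models PASSIVE_LEVEL */
#define PAGED_CODE() assert(model_irql <= APC_LEVEL)
#endif

/* Pageable variant: the pragma asks the Microsoft toolchain to place the
 * function in the PAGE section, so its code may be paged out to disk. */
int hypothetical_query_port_count(int requested);
#ifdef _MSC_VER
#pragma alloc_text(PAGE, hypothetical_query_port_count)
#endif
int hypothetical_query_port_count(int requested)
{
    PAGED_CODE();   /* checks the "IRQL <= APC_LEVEL" assumption */
    return requested > 0 ? requested : 0;
}

/* Non-pageable variant: no pragma and no PAGED_CODE(), so the function
 * stays in the default resident code section. */
int hypothetical_query_port_count_resident(int requested)
{
    return requested > 0 ? requested : 0;
}
```

Converting a function is then a matter of dropping the pragma and the `PAGED_CODE()` call, at the cost of a slightly larger resident image.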
 
-James

________________________________

From: ofw-bounces at lists.openfabrics.org on behalf of Anatoly Greenblatt
Sent: Thu 2/19/2009 6:56 AM
To: ofw at lists.openfabrics.org
Subject: [ofw] winof 2.0.2: crash in ibbus.sys when running whql tests on mlx4 hca



Hi All,

Please examine the crash analysis below.

Regards,
Anatoly.

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

IRQL_NOT_LESS_OR_EQUAL (a)

An attempt was made to access a pageable (or completely invalid) address at an

interrupt request level (IRQL) that is too high.  This is usually

caused by drivers using improper addresses.

If a kernel debugger is available get the stack backtrace.

Arguments:

Arg1: 0000000000000088, memory referenced

Arg2: 0000000000000002, IRQL

Arg3: 0000000000000001, bitfield :

          bit 0 : value 0 = read operation, 1 = write operation

          bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)

Arg4: fffff80001675163, address which referenced memory

 

Debugging Details:

------------------

 

 

WRITE_ADDRESS:  0000000000000088 

 

CURRENT_IRQL:  2

 

FAULTING_IP: 

nt!KeAcquireSpinLockRaiseToDpc+13

fffff800`01675163 f0480fba2900    lock bts qword ptr [rcx],0

 

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

 

BUGCHECK_STR:  0xA

 

PROCESS_NAME:  System

 

TRAP_FRAME:  fffffa6001be7400 -- (.trap 0xfffffa6001be7400)

NOTE: The trap frame does not contain all registers.

Some register values may be zeroed or incorrect.

rax=0000000000000002 rbx=fffff8800505d000 rcx=0000000000000088

rdx=fffffa6003e7c060 rsi=fffffa6001be7670 rdi=fffffa60011dc3ca

rip=fffff80001675163 rsp=fffffa6001be7590 rbp=fffffa6003ed8110

 r8=fffffa6001be7740  r9=0000000000000001 r10=0000000000000000

r11=fffffa8034ff1dc0 r12=0000000000000000 r13=0000000000000000

r14=0000000000000000 r15=0000000000000000

iopl=0         nv up ei ng nz na po nc

nt!KeAcquireSpinLockRaiseToDpc+0x13:

fffff800`01675163 f0480fba2900    lock bts qword ptr [rcx],0 ds:7558:0088=????????????????

Resetting default scope

 

LOCK_ADDRESS:  fffff80001810c20 -- (!locks fffff80001810c20)

 

Resource @ nt!PiEngineLock (0xfffff80001810c20)    Exclusively owned

    Contention Count = 8

    NumberOfExclusiveWaiters = 1

     Threads: fffffa803028abb0-01<*> 

     Threads Waiting On Exclusive Access:

              fffffa803028c720       

 

1 total locks, 1 locks currently held

 

PNP_TRIAGE: 

          Lock address  : 0xfffff80001810c20

          Thread Count  : 1

          Thread address: 0xfffffa803028abb0

          Thread wait   : 0x154c

 

LAST_CONTROL_TRANSFER:  from fffff8000166a12e to fffff8000166a390

 

STACK_TEXT:  

fffffa60`01be72b8 fffff800`0166a12e : 00000000`0000000a 00000000`00000088 00000000`00000002 00000000`00000001 : nt!KeBugCheckEx

fffffa60`01be72c0 fffff800`0166900b : 00000000`00000001 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiBugCheckDispatch+0x6e

fffffa60`01be7400 fffff800`01675163 : 6f662045`43495645 206b6361`74732072 4f445020`68746977 ff000a70`25783020 : nt!KiPageFault+0x20b

fffffa60`01be7590 fffffa60`03e9f9a1 : fffff800`01789680 fffffa60`017d8180 fffffa80`3028abb0 00000000`00000001 : nt!KeAcquireSpinLockRaiseToDpc+0x13

fffffa60`01be75c0 fffffa60`03e69213 : fffffa80`32683720 fffffa80`326835d0 fffffa80`3028ac00 fffffa80`3267f590 : ibbus!ib_deregister_ca+0xa1 [c:\work\winof.2.0.2\sources\core\al\kernel\al_mgr.c @ 315]

fffffa60`01be75f0 fffffa60`03e6952e : fffffa80`32683720 00000000`00000000 fffff800`01789cf0 fffffa80`3028abb0 : mlx4_hca!__hca_deregister+0xa3 [c:\work\winof.2.0.2\sources\hw\mlx4\kernel\hca\drv.c @ 1466]

fffffa60`01be7630 fffffa60`03e7c0df : fffffa80`326835d0 445f504f`54535f4e 6f662045`43495645 206b6361`74732072 : mlx4_hca!__hca_release_resources+0x9e [c:\work\winof.2.0.2\sources\hw\mlx4\kernel\hca\drv.c @ 1666]

fffffa60`01be7680 fffffa60`03e7dbbc : fffffa80`326835d0 fffffa80`34ff1c60 fffffa60`01be7740 fffffa60`03ed4960 : mlx4_hca!hca_release_resources+0x7f [c:\work\winof.2.0.2\sources\hw\mlx4\kernel\hca\drv.c @ 1726]

fffffa60`01be76e0 fffffa60`011dc563 : fffffa80`00000001 01c991d6`8794c658 00000000`00000004 ffffffff`76697270 : mlx4_hca!cl_pnp+0xe4 [c:\work\winof.2.0.2\sources\core\complib\kernel\cl_pnp_po.c @ 245]

fffffa60`01be7740 fffffa60`011e209c : fffffa80`3267d830 fffffa80`34ff1c60 fffffa80`32684e90 fffffa80`32684d40 : pnpfiltr+0x1563

fffffa60`01be77a0 fffffa60`011e1c84 : fffffa80`34ff1e08 fffffa60`011db000 fffffa80`32684d40 fffffa80`32684bb0 : pnpfiltr+0x709c

fffffa60`01be77d0 fffffa60`03ede826 : fffffa80`32684bb0 fffffa80`32684a60 fffffa80`34ff1c60 00000000`00000000 : pnpfiltr+0x6c84

fffffa60`01be7800 fffff800`0187f72e : 00000000`00000001 fffffa80`34ff1c60 00000000`00000000 fffffa60`01be7878 : ibbus!cl_pnp+0x5fe [c:\work\winof.2.0.2\sources\core\complib\kernel\cl_pnp_po.c @ 403]

fffffa60`01be7860 fffff800`019af5c9 : 00000000`00000004 00000000`00000000 00000000`00000000 fffffa80`32675950 : nt!IopSynchronousCall+0x10a

fffffa60`01be78d0 fffff800`019b02a3 : fffffa80`32672de0 fffff880`00632a18 fffffa80`3075c960 fffffa60`00bd7c92 : nt!IopQueryReconfiguration+0xa9

fffffa60`01be7960 fffff800`019b02db : fffffa80`32672de0 fffff880`0505d000 fffff780`00000014 00000000`00000000 : nt!PnpStopDeviceNode+0x23

fffffa60`01be7990 fffff800`019b02db : fffffa80`3075a730 fffff780`00000014 00000000`ffffffff fffff880`0505d000 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be79c0 fffff800`019b02db : fffffa80`30754730 00000000`00000001 fffff880`0505d000 00000000`00000000 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be79f0 fffff800`019b02db : fffffa80`3074a7b0 fffff800`0197a28a fffff880`006671a0 fffffa60`01be7b48 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7a20 fffff800`019b02db : fffffa80`307307b0 fffffa60`01be7a68 fffff880`00632a18 fffffa60`00000000 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7a50 fffff800`019b02db : fffffa80`3083bbb0 fffff800`01713a0f 01c991d6`8792651b fffffa60`01be7b48 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7a80 fffff800`019b02db : fffffa80`305ea230 fffff880`00632a48 00000000`00000000 fffff880`0505d000 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7ab0 fffff800`019b02db : fffffa80`3024cbb0 fffff880`0505d000 00000000`00000001 fffffa60`01be7b48 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7ae0 fffff800`019bd24e : fffff880`0505d000 00000000`00000000 00000000`00000000 fffff800`017a81e8 : nt!PnpStopDeviceSubtree+0x1b

fffffa60`01be7b10 fffff800`01a49e36 : fffffa80`32672de0 fffffa80`32672de0 fffffa80`32675900 00000000`00000000 : nt!PnpRebalance+0x10e

fffffa60`01be7ba0 fffff800`01a4b85a : fffffa80`34ec7af0 fffffa80`34d031d0 00000000`00000000 00000000`00000001 : nt!PnpReallocateResources+0x186

fffffa60`01be7c30 fffff800`01742bd7 : fffff800`0180e500 fffffa80`34d031d0 fffffa60`0199ccc0 00000000`32706e50 : nt!PiProcessResourceRequirementsChanged+0x7a

fffffa60`01be7c80 fffff800`01677066 : fffff800`017429a0 fffffa80`3028ab01 fffff800`017a78f8 00000000`00000001 : nt!PnpDeviceActionWorker+0x237

fffffa60`01be7cf0 fffff800`0188dde3 : fffff800`0180e5a0 a4805910`8a996d05 fffffa80`3028abb0 00000000`00000080 : nt!ExpWorkerThread+0x11a

fffffa60`01be7d50 fffff800`016a4536 : fffffa60`01999180 fffffa80`3028abb0 fffffa60`019a2d40 00000000`00000001 : nt!PspSystemThreadStartup+0x57

fffffa60`01be7d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16

 

 

STACK_COMMAND:  kb

 

FOLLOWUP_IP: 

ibbus!ib_deregister_ca+a1 [c:\work\winof.2.0.2\sources\core\al\kernel\al_mgr.c @ 315]

fffffa60`03e9f9a1 888390000000    mov     byte ptr [rbx+90h],al

 

SYMBOL_STACK_INDEX:  4

 

SYMBOL_NAME:  ibbus!ib_deregister_ca+a1

 

FOLLOWUP_NAME:  MachineOwner

 

MODULE_NAME: ibbus

 

IMAGE_NAME:  ibbus.sys

 

DEBUG_FLR_IMAGE_TIMESTAMP:  499bdbde

 

FAILURE_BUCKET_ID:  X64_0xA_W_ibbus!ib_deregister_ca+a1

 

BUCKET_ID:  X64_0xA_W_ibbus!ib_deregister_ca+a1

 

Followup: MachineOwner

---------
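
For anyone reading the Arguments block by hand, the following is a small sketch of how the four IRQL_NOT_LESS_OR_EQUAL (0xA) arguments can be decoded, using only the bit meanings printed in the analysis above (bit 0 for write, bit 3 for execute). The struct and function names are hypothetical and are not part of any debugger API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the four bugcheck 0xA arguments.
 * ASSUMPTION: names are illustrative only. */
struct irql_fault {
    uint64_t address;     /* Arg1: memory referenced */
    unsigned irql;        /* Arg2: IRQL at the time of the access */
    bool     is_write;    /* Arg3 bit 0: 0 = read, 1 = write */
    bool     is_execute;  /* Arg3 bit 3: 1 = execute operation */
    uint64_t ip;          /* Arg4: address which referenced memory */
};

struct irql_fault decode_bugcheck_a(uint64_t arg1, uint64_t arg2,
                                    uint64_t arg3, uint64_t arg4)
{
    struct irql_fault f;
    f.address    = arg1;
    f.irql       = (unsigned)arg2;
    f.is_write   = (arg3 & 0x1) != 0;
    f.is_execute = (arg3 & 0x8) != 0;
    f.ip         = arg4;
    return f;
}
```

Fed the values from this dump, `decode_bugcheck_a(0x88, 2, 1, 0xfffff80001675163ULL)` describes a write (not an execute) to address 0x88 at IRQL 2, which is DISPATCH_LEVEL, issued from nt!KeAcquireSpinLockRaiseToDpc+0x13.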

 
