[ofw] crash on IBBUS disabling while mad traffic
Leonid Keller
leonid at mellanox.co.il
Sat Apr 25 15:04:58 PDT 2009
I got a crash while running the WHQL Disable/Enable test while opensm was
running on another node.
I was running a December version of the driver, and I'm not sure whether
the current one behaves the same way (I'll try).
The test, which disables and re-enables all devices, passes without
opensm.
With opensm running, IBBUS sends SA requests to it.
In this case __process_sweep() crashes, because the per-port IOC PnP
agent seems to have been released already.
That is strange, because __ioc_query_sa takes a reference on the PnP
agent before sending the request. The sequence is:
__ioc_query_sa
__node_rec_cb
__process_query
__process_sweep
Any ideas?
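To make the expectation concrete, here is a minimal single-threaded sketch of the pattern I would expect to hold: the sender takes a reference before the request goes out, and the completion path reads the object before dropping that reference. All names here (svc_t, request_t, svc_ref, svc_deref, send_request, process_completion) are hypothetical stand-ins, not the actual ibbus types or functions, and a real driver would use atomic counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-port service object; not the ibbus type. */
typedef struct svc {
    int  ref_cnt;    /* outstanding references (atomic in real code) */
    bool destroying; /* set once teardown has started */
    bool freed;      /* demo only: set when the last reference is dropped */
} svc_t;

static void svc_ref(svc_t *s) { s->ref_cnt++; }

/* Drop one reference; returns true when this was the last one. */
static bool svc_deref(svc_t *s)
{
    assert(s->ref_cnt > 0);
    if (--s->ref_cnt == 0) {
        s->freed = true; /* real code would free the object here */
        return true;
    }
    return false;
}

/* Context carried by an asynchronous request. */
typedef struct request {
    svc_t *p_svc;
} request_t;

/* Sender: take the reference before the request can complete. */
static void send_request(request_t *r, svc_t *s)
{
    svc_ref(s);
    r->p_svc = s;
}

/* Completion handler: read the object first, release afterwards. */
static bool process_completion(request_t *r)
{
    svc_t *s = r->p_svc;
    bool was_destroying;

    assert(!s->freed);            /* the request's reference keeps it alive */
    was_destroying = s->destroying;
    r->p_svc = NULL;
    svc_deref(s);                 /* last touch of s on this path */
    return was_destroying;
}
```

If the handler (or anything queued from it) still dereferences the object after the matching deref, or if some path drops the reference twice, teardown can free the object underneath it, which would look like the bugcheck below.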
3: kd> !analyze -v
ERROR: FindPlugIns 8007007b
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
DRIVER_PAGE_FAULT_IN_FREED_SPECIAL_POOL (d5)
Memory was referenced after it was freed.
This cannot be protected by try-except.
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: fffff98005b72f84, memory referenced
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation
Arg3: fffffa600400b1d0, if non-zero, the address which referenced memory.
Arg4: 0000000000000000, (reserved)
Debugging Details:
------------------
Matched: ibbus!proxy_ioctl+0x41 (fffffa60`04031d8d)
Matched: ibbus!proxy_ioctl+0xa5 (fffffa60`04031df1)
READ_ADDRESS: fffff98005b72f84 Special pool
FAULTING_IP:
ibbus!__process_sweep+44
[s:\builds\3609\branches\mlnx_winof_2-0\core\al\kernel\al_ioc_pnp.c @ 2315]
fffffa60`0400b1d0 83b8d400000003 cmp dword ptr [rax+0D4h],3
MM_INTERNAL_CODE: 0
IMAGE_NAME: ibbus.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 49401b3e
MODULE_NAME: ibbus
FAULTING_MODULE: fffffa6004002000 ibbus
DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT
BUGCHECK_STR: 0xD5
PROCESS_NAME: System
CURRENT_IRQL: f
TRAP_FRAME: fffffa6003d50b00 -- (.trap 0xfffffa6003d50b00)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffff98005b72eb0 rbx=0000000000000000 rcx=fffffa6004057780
rdx=fffffa6004005e97 rsi=fffffa600199ccc0 rdi=fffff80001cc0304
rip=fffffa600400b1d0 rsp=fffffa6003d50c90 rbp=0000000000000080
r8=0000000000000005 r9=fffffa6004005e97 r10=0000000000000001
r11=fffffa6003d50c50 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
ibbus!__process_sweep+0x44:
fffffa60`0400b1d0 83b8d400000003 cmp dword ptr [rax+0D4h],3 ds:fffff980`05b72f84=????????
Resetting default scope
LAST_CONTROL_TRANSFER: from fffff80001969c42 to fffff800018b0b30
STACK_TEXT:
fffffa60`03d502f8 fffff800`01969c42 : fffffa80`0e0eb290 fffff800`0194893d fffff800`01a55140 00000000`00001000 : nt!RtlpBreakWithStatusInstruction
fffffa60`03d50300 fffff800`0196adb7 : fffff800`00000004 fffff800`01a55140 ffffffff`fffff000 00000000`00000050 : nt!KiBugCheckDebugBreak+0x12
fffffa60`03d50360 fffff800`018b6754 : fffffa80`0dd77480 fffff800`01cc2bb9 00000000`00000000 fffff800`0194c13f : nt!KeBugCheck2+0xaa7
fffffa60`03d509d0 fffff800`018c5671 : 00000000`00000050 fffff980`05b72f84 00000000`00000000 fffffa60`03d50b00 : nt!KeBugCheckEx+0x104
fffffa60`03d50a10 fffff800`018b51d9 : 00000000`00000000 fffff980`0427cf78 fffffa80`0e0ecf00 fffff980`1c27ef40 : nt!MmAccessFault+0x1371
fffffa60`03d50b00 fffffa60`0400b1d0 : fffff980`1c27ef40 fffff980`04318e00 fffffa60`04005eba fffff980`04318e78 : nt!KiPageFault+0x119
fffffa60`03d50c90 fffffa60`04005e9d : fffff980`04318e98 fffff980`043bccb0 fffff980`1b88afd0 fffff980`04318e78 : ibbus!__process_sweep+0x44 [s:\builds\3609\branches\mlnx_winof_2-0\core\al\kernel\al_ioc_pnp.c @ 2315]
fffffa60`03d50cc0 fffffa60`040070d9 : fffff980`04318d60 fffff980`0434afd0 00000000`00000000 fffffa60`0400743c : ibbus!__cl_async_proc_worker+0x61 [s:\builds\3609\branches\mlnx_winof_2-0\core\complib\cl_async_proc.c @ 153]
fffffa60`03d50cf0 fffffa60`04007464 : fffff980`0434afd0 00000000`00000080 fffff980`0434afd0 8b8b8b8b`8b8b8b8b : ibbus!__cl_thread_pool_routine+0x41 [s:\builds\3609\branches\mlnx_winof_2-0\core\complib\cl_threadpool.c @ 66]
fffffa60`03d50d20 fffff800`01adafd3 : 8b8b8b8b`8b8b8b8b 8b8b8b8b`8b8b8b8b 8b8b8b8b`8b8b8b8b 8b8b8b8b`8b8b8b01 : ibbus!__thread_callback+0x28 [s:\builds\3609\branches\mlnx_winof_2-0\core\complib\kernel\cl_thread.c @ 49]
fffffa60`03d50d50 fffff800`018f0816 : fffffa60`01999180 fffffa80`0e0eb290 fffffa60`019a2d40 00000000`00000001 : nt!PspSystemThreadStartup+0x57
fffffa60`03d50d80 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16
STACK_COMMAND: kb
FOLLOWUP_IP:
ibbus!__process_sweep+44
[s:\builds\3609\branches\mlnx_winof_2-0\core\al\kernel\al_ioc_pnp.c @ 2315]
fffffa60`0400b1d0 83b8d400000003 cmp dword ptr [rax+0D4h],3
FAULTING_SOURCE_CODE:
2311:
2312: p_results = PARENT_STRUCT( p_async_item, ioc_sweep_results_t, async_item );
2313: CL_ASSERT( !p_results->p_svc->query_cnt );
2314:
> 2315: if( p_results->p_svc->obj.state == CL_DESTROYING )
2316: {
2317: __put_iou_map( gp_ioc_pnp, &p_results->iou_map );
2318: goto err;
2319: }
2320:
SYMBOL_STACK_INDEX: 6
SYMBOL_NAME: ibbus!__process_sweep+44
FOLLOWUP_NAME: MachineOwner
FAILURE_BUCKET_ID: X64_0xD5_VRF_ibbus!__process_sweep+44
BUCKET_ID: X64_0xD5_VRF_ibbus!__process_sweep+44
Followup: MachineOwner
---------
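As a cross-check on the dump above: the faulting address in Arg1 is exactly rax from the trap frame plus the 0D4h displacement of the cmp instruction. My reading is that rax holds p_results->p_svc and 0D4h is the offset of obj.state inside it, but that is an inference from source line 2315, not something the dump states. A small sketch of the arithmetic (the helper and constant names are mine):

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of `cmp dword ptr [rax+disp], imm`. */
static uint64_t effective_address(uint64_t base, uint64_t disp)
{
    return base + disp;
}

/* Values copied from the trap frame and the bugcheck arguments. */
static const uint64_t TRAP_RAX      = 0xfffff98005b72eb0ULL; /* rax           */
static const uint64_t CMP_DISP      = 0xD4ULL;               /* [rax+0D4h]    */
static const uint64_t BUGCHECK_ARG1 = 0xfffff98005b72f84ULL; /* memory read   */
```

So the read that faulted is the state check itself, on a block that special pool had already marked as freed.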