<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3243" name=GENERATOR></HEAD>
<BODY>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><STRONG>Crash:</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>
BugCheck 18, {bad0b0b0, fffffa800a3f7a90, 2,
ffffffffffffffff}</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>The reference count
of an object is illegal for the current state of the object.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009> </DIV></SPAN></FONT>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><STRONG>Setup:</STRONG> </SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>
Two HCAs, the <STRONG>full </STRONG>IB stack plus the patch that removes the
HCA registration with IBAL.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><U>The problem
doesn't happen without WinVerbs and WinMad.</U></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><STRONG>Reproduce:</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>1. Disable/enable
HCA0.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>[ 2. Disable/enable
HCA0. ]</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>3. Disable/enable
HCA1.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>Quick
Analysis:</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>0: kd> !analyze
-v</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>REFERENCE_BY_POINTER
(18)<BR>Arguments:<BR>Arg1: 00000000bad0b0b0, Object type of the object whose
reference count is being lowered<BR>Arg2: fffffa800a3f7a90, Object whose
reference count is being lowered<BR>Arg3: 0000000000000002, Reserved<BR>Arg4:
ffffffffffffffff, Reserved<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>The
wrongly dereferenced object is a device object of
IBBUS.SYS</STRONG></SPAN></FONT></DIV></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV></DIV></SPAN></FONT>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>0: kd> !devobj
fffffa800a3f7a90<BR>Device object (fffffa800a3f7a90) is for:<BR>
<STRONG>\Driver\ibbus</STRONG> DriverObject
fffffa800a3f65d0<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>The bad reference
count is its PointerCount:</DIV></SPAN></FONT>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><SPAN
class=030050914-05032009>0: kd> !object
fffffa800a3f7a90<BR></SPAN></SPAN></FONT><FONT face=Arial size=2><SPAN
class=030050914-05032009>Object: fffffa800a3f7a90 Type: (bad0b0b0)
<BR> ObjectHeader: fffffa800a3f7a60 (old
version)<BR> HandleCount: 0 PointerCount:
<STRONG>4294967295 /*</STRONG> it's -1<STRONG>
*/</STRONG><BR> Directory Object: fffffa800a4ab740 Name:
<BR></DIV></SPAN></FONT>
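<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>A minimal Python
sketch of why the 4294967295 printed by !object corresponds to -1, assuming the
debugger displays the count as an unsigned 32-bit number. The helper name
as_signed32 is hypothetical and only serves the illustration.</SPAN></FONT></DIV>
<PRE>
```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed 32-bit integer."""
    # Pack as an unsigned 32-bit value, then unpack the same bytes as signed.
    return struct.unpack("=i", struct.pack("=I", value))[0]


# PointerCount as printed by !object for fffffa800a3f7a90.
print(as_signed32(4294967295))  # prints -1
```
</PRE>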
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>More
analysis:</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>I suspect that
either WinVerbs or WinMad references the wrong IBBUS device object. My guess is
WinMad.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>Do the
following.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>Reboot the machine
(with both cards installed), break into the debugger and look at the device
stacks:</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><U>HCA0:</U></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>3: kd> !devstack
0xfffffa800a2cc060<BR> !DevObj
!DrvObj
!DevExt ObjectName<BR> fffffa800a3ede20
\Driver\WinMad fffffa800a3ec390 <BR>
fffffa800a3ebc70 \Driver\WinVerbs fffffa800a3eadb0
<BR> fffffa800a3ea040 \Driver\ibbus
fffffa800a3ea190 <BR> fffffa800a3e9460
\Driver\mlx4_hca fffffa800a3e95b0 <BR>>
fffffa800a2cc060 \Driver\mlx4_bus fffffa800a2caec0
00000055<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>Look at
the PointerCount of IBBUS0 (the IBBUS device for HCA0): it is 2.</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><SPAN
class=030050914-05032009></SPAN></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><SPAN
class=030050914-05032009>3: kd> !object fffffa800a3ea040
</SPAN></DIV></SPAN></FONT>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>Object:
fffffa800a3ea040 Type: (fffffa8006a22840) Device<BR>
ObjectHeader: fffffa800a3ea010 (old version)<BR> HandleCount:
0 <STRONG>PointerCount: 2</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><STRONG></STRONG></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>The
PointerCount of IBBUS1 (the IBBUS device for HCA1), however, is 4.</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><U>HCA1:</U></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>3: kd> !devstack
0xfffffa8008b2b950<BR> !DevObj
!DrvObj
!DevExt ObjectName<BR> fffffa800a3e9e20
\Driver\WinMad fffffa800a3e5390 <BR>
fffffa800a3df800 \Driver\WinVerbs fffffa800a3e7570
<BR> fffffa800a3e3600 \Driver\ibbus
fffffa800a3e3750 ibal<BR> fffffa800a3e4040
\Driver\mlx4_hca fffffa800a3e4190 <BR>>
fffffa8008b2b950 \Driver\mlx4_bus fffffa8008b2b390
00000054<BR></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>3: kd> !object
fffffa800a3e3600<BR>Object: fffffa800a3e3600 Type: (fffffa8006a22840)
Device<BR> ObjectHeader: fffffa800a3e35d0 (old
version)<BR> HandleCount: 0 <STRONG>PointerCount:
4</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009><STRONG></STRONG></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009><STRONG>What happens
while reproducing the crash?</STRONG></SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>When you disable
HCA0, it decrements IBBUS1's PointerCount from 4 to 3.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>When you then
disable HCA1, IBBUS1's PointerCount drops to -1, that is, the object is
dereferenced once more than it was referenced, and you get bugcheck
0x0018.</SPAN></FONT></DIV>
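<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>The sequence above
can be replayed with a toy model. This is only a sketch, not driver code: the
DeviceObject class is hypothetical, the starting counts are the ones printed by
!object, and it assumes that disabling HCA1 releases four references on IBBUS1,
which is what the drop from 3 to -1 implies. The enable half of each
disable/enable step is ignored.</SPAN></FONT></DIV>
<PRE>
```python
class DeviceObject:
    """Toy stand-in for a kernel device object; it only tracks PointerCount."""

    def __init__(self, name, pointer_count):
        self.name = name
        self.pointer_count = pointer_count

    def dereference(self):
        self.pointer_count -= 1
        # The kernel bugchecks with 0x18 when the count becomes illegal.
        if self.pointer_count == -1:
            raise RuntimeError(f"BugCheck 0x18: {self.name} PointerCount is -1")


# Counts observed with !object after boot.
ibbus0 = DeviceObject("IBBUS0", 2)
ibbus1 = DeviceObject("IBBUS1", 4)

history = [ibbus1.pointer_count]

# Disabling HCA0 drops one reference on IBBUS1 (the behaviour described above).
ibbus1.dereference()
history.append(ibbus1.pointer_count)

# Assumption: disabling HCA1 releases four references on IBBUS1.
crash = None
try:
    for _ in range(4):
        ibbus1.dereference()
except RuntimeError as error:
    crash = str(error)
history.append(ibbus1.pointer_count)

print(history)  # prints [4, 3, -1]
print(crash)
```
</PRE>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>In this model
IBBUS0 keeps its count of 2 throughout, which is consistent with the suspicion
that the dereference issued while disabling HCA0 lands on the wrong IBBUS
device object.</SPAN></FONT></DIV>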
<DIV><FONT face=Arial size=2><SPAN
class=030050914-05032009></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>I didn't have time
to continue the investigation.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN class=030050914-05032009>Maybe you can look
into it?</DIV></SPAN></FONT></BODY></HTML>