<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:st1="urn:schemas-microsoft-com:office:smarttags" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]--><o:SmartTagType
namespaceuri="urn:schemas-microsoft-com:office:smarttags" name="place"/>
<o:SmartTagType namespaceuri="urn:schemas-microsoft-com:office:smarttags"
name="State"/>
<!--[if !mso]>
<style>
st1\:*{behavior:url(#default#ieooui) }
</style>
<![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
p.MsoPlainText, li.MsoPlainText, div.MsoPlainText
{margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:Arial;
color:navy;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 77.95pt 1.0in 77.95pt;}
div.Section1
{page:Section1;}
/* List Definitions */
@list l0
{mso-list-id:1847397450;
mso-list-type:hybrid;
mso-list-template-ids:1828340426 67698689 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:1.0in;
mso-level-number-position:left;
margin-left:1.0in;
text-indent:-.25in;
font-family:Symbol;}
@list l0:level2
{mso-level-tab-stop:1.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-tab-stop:1.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level4
{mso-level-tab-stop:2.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-tab-stop:2.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-tab-stop:3.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level7
{mso-level-tab-stop:3.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-tab-stop:4.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-tab-stop:4.5in;
mso-level-number-position:left;
text-indent:-.25in;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>There is also an error message from
WinDbg:<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>[MLX4_HCA] mlnx_query_ca() :***ERROR***
ib_query_device failed (-16)<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>What does this mean? I think the
CA handle we opened was never destroyed after we closed it.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Thanks,<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>James<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'><o:p> </o:p></span></font></p>
<div>
<div class=MsoNormal align=center style='text-align:center'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'>
<hr size=2 width="100%" align=center tabindex=-1>
</span></font></div>
<p class=MsoNormal><b><font size=2 face=Tahoma><span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'>From:</span></font></b><font size=2
face=Tahoma><span style='font-size:10.0pt;font-family:Tahoma'> ofw-bounces@lists.openfabrics.org
[mailto:ofw-bounces@lists.openfabrics.org] <b><span style='font-weight:bold'>On
Behalf Of </span></b>James Yang<br>
<b><span style='font-weight:bold'>Sent:</span></b> Tuesday, October 14, 2008
3:02 PM<br>
<b><span style='font-weight:bold'>To:</span></b> ofw@lists.openfabrics.org<br>
<b><span style='font-weight:bold'>Subject:</span></b> [ofw] Disconnection
problem and <st1:State w:st="on"><st1:place w:st="on">AL</st1:place></st1:State>
reference</span></font><o:p></o:p></p>
</div>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Hi,<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Our driver product is based on WinOF1.1. Recently I saw a problem where
Windows cannot shut down. The procedure and observations are as follows:<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Install the driver, then reboot the system while there is still some
traffic going on.<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>We do the following in our driver, and everything seems to work until
reboot.<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText style='margin-left:1.0in;text-indent:-.25in;mso-list:
l0 level1 lfo2'><![if !supportLists]><font size=2 face=Symbol><span
style='font-size:10.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<font
size=1 face="Times New Roman"><span style='font:7.0pt "Times New Roman"'>
</span></font></span></span></font><![endif]>create_cq() : one
receive queue and one send queue, and set the callback function<o:p></o:p></p>
<p class=MsoPlainText style='margin-left:.5in'><font size=2 face="Courier New"><span
style='font-size:10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText style='margin-left:1.0in;text-indent:-.25in;mso-list:
l0 level1 lfo2'><![if !supportLists]><font size=2 face=Symbol><span
style='font-size:10.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<font
size=1 face="Times New Roman"><span style='font:7.0pt "Times New Roman"'>
</span></font></span></span></font><![endif]>create_qp() with the queues created
above, setting the initial state to IB_QPS_INIT<o:p></o:p></p>
<p class=MsoPlainText style='margin-left:.5in'><font size=2 face="Courier New"><span
style='font-size:10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText style='margin-left:1.0in;text-indent:-.25in;mso-list:
l0 level1 lfo2'><![if !supportLists]><font size=2 face=Symbol><span
style='font-size:10.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<font
size=1 face="Times New Roman"><span style='font:7.0pt "Times New Roman"'>
</span></font></span></span></font><![endif]>cm_req() with the QP and correct
connection path<o:p></o:p></p>
<p class=MsoPlainText style='margin-left:.5in'><font size=2 face="Courier New"><span
style='font-size:10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText style='margin-left:1.0in;text-indent:-.25in;mso-list:
l0 level1 lfo2'><![if !supportLists]><font size=2 face=Symbol><span
style='font-size:10.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<font
size=1 face="Times New Roman"><span style='font:7.0pt "Times New Roman"'>
</span></font></span></span></font><![endif]>post_recv() with 100 packet
buffers for receiving data<o:p></o:p></p>
<p class=MsoPlainText style='margin-left:.5in'><font size=2 face="Courier New"><span
style='font-size:10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText style='margin-left:1.0in;text-indent:-.25in;mso-list:
l0 level1 lfo2'><![if !supportLists]><font size=2 face=Symbol><span
style='font-size:10.0pt;font-family:Symbol'><span style='mso-list:Ignore'>·<font
size=1 face="Times New Roman"><span style='font:7.0pt "Times New Roman"'>
</span></font></span></span></font><![endif]>post_send() when necessary<o:p></o:p></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> <o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Receive and send work fine, with the respective callback invoked
whenever there is data activity.<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>At a certain point during shutdown, when we call cm_dreq() to
initiate a disconnect, the 100 receive work items are never released and the
callback functions are never called. If we go on to destroy the QP, the final
result is that the IB stack cannot do its cleanup work because it
still holds an extra reference count. A message similar to the following line
shows up in the debug version:<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>[<st1:State w:st="on">AL</st1:State>]print_al_obj() !ERROR!: <st1:place
w:st="on"><st1:State w:st="on">AL</st1:State></st1:place> object
fffffadf379c8280(AL_OBJ_TYPE_H_AL),<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>It seems the <st1:place w:st="on"><st1:State w:st="on">AL</st1:State></st1:place>
handle we opened cannot be destroyed. But I suspect we are already in a
bad state before that.<o:p></o:p></span></font></p>
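<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>To make clear what we expect from the disconnect, here is a toy model
in C. It is only a sketch under assumptions: the toy_* names are hypothetical
and are not IBAL calls, and it assumes the disconnect moves the QP to the error
state, so that every posted receive comes back as a flush completion that we
then reap from the CQ.<o:p></o:p></span></font></p>
<pre>
```c
/* Toy model only: none of these names are IBAL APIs. */
enum toy_qp_state { TOY_QPS_RTS, TOY_QPS_ERROR };

struct toy_qp {
    enum toy_qp_state state;
    int posted_recvs;   /* receive work requests still owned by the QP */
    int flushed_in_cq;  /* flush completions waiting in the CQ */
};

/* Post n receive buffers, as the driver does with its 100 buffers. */
void toy_post_recv(struct toy_qp *qp, int n)
{
    qp->posted_recvs += n;
}

/* Assumption: moving the QP to the error state turns every outstanding
 * receive into a flush completion on the CQ. */
void toy_modify_to_error(struct toy_qp *qp)
{
    qp->state = TOY_QPS_ERROR;
    qp->flushed_in_cq += qp->posted_recvs;
    qp->posted_recvs = 0;
}

/* Drain the CQ; returns the number of buffers handed back to the caller. */
int toy_poll_cq(struct toy_qp *qp)
{
    int reaped = qp->flushed_in_cq;
    qp->flushed_in_cq = 0;
    return reaped;
}

/* Returns 1 only when nothing is posted and nothing is left unreaped. */
int toy_qp_is_drained(const struct toy_qp *qp)
{
    return (qp->posted_recvs + qp->flushed_in_cq) == 0;
}
```
</pre>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>In this model the 100 posted receives become 100 flush completions, and
the QP counts as drained only after they have been polled. What we see in
practice is that the buffers never come back at all.<o:p></o:p></span></font></p>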
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>WinDbg stack (this is on x64 Windows Server 2003):<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664e880
fffff800`01027682 nt!KiSwapContext+0x85<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664ea00
fffff800`0102828e nt!KiSwapThread+0x3c9<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664ea60
fffffadf`25ac7a3d nt!KeWaitForSingleObject+0x5a6<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664eae0
fffffadf`25b5fca8 ibbus!cl_event_wait_on+0x11d
[c:\windows-openib\src\winib-1176g\core\complib\kernel\cl_event.c @ 59]<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664eb40
fffffadf`25b0013b ibbus!sync_destroy_obj+0x228
[c:\windows-openib\src\winib-1176g\core\al\al_common.c @ 513]<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664ebb0
fffffadf`25a1f8c7 ibbus!ib_close_al+0x3bb
[c:\windows-openib\src\winib-1176g\core\al\al.c @ 89]<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'> fffffadf`2664ec10
fffffadf`25a1b23f MyDriver!IBAccessLayer::Close+0x77 <o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>The AL handle ref_cnt is 1 here.<o:p></o:p></span></font></p>
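<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>The stack suggests that ib_close_al() is waiting inside
sync_destroy_obj() for outstanding references to go away. A minimal sketch of
that idea follows, assuming a synchronous destroy blocks until the count
reaches zero. The toy_* names are hypothetical and are not the IBAL structures
or functions.<o:p></o:p></span></font></p>
<pre>
```c
/* Toy model only: these are not the IBAL structures or functions. */
struct toy_obj {
    int ref_cnt;             /* references still held on this object */
    struct toy_obj *parent;  /* object this one holds a reference on, or 0 */
};

/* Attach a child (for example a QP) to a parent (for example the AL
 * handle); the child keeps the parent alive by holding one reference. */
void toy_attach(struct toy_obj *child, struct toy_obj *parent)
{
    child->parent = parent;
    parent->ref_cnt++;
}

/* Finish destroying a child: only now is the parent reference dropped. */
void toy_child_cleanup_done(struct toy_obj *child)
{
    if (child->parent) {
        child->parent->ref_cnt--;
        child->parent = 0;
    }
}

/* Returns 1 if a synchronous destroy would have to wait, 0 otherwise. */
int toy_sync_destroy_would_block(const struct toy_obj *obj)
{
    return obj->ref_cnt != 0;
}
```
</pre>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>With one child whose cleanup never finishes, the parent count stays at
1 and the close never returns, which would be consistent with the ref_cnt of 1
reported above.<o:p></o:p></span></font></p>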
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Can anyone shed some light on this? Is this a known issue which is
fixed in WinOF2.0 or is it an unknown problem?<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'><o:p> </o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>Thanks,<o:p></o:p></span></font></p>
<p class=MsoPlainText><font size=2 face="Courier New"><span style='font-size:
10.0pt'>James<o:p></o:p></span></font></p>
</div>
</body>
</html>