<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
{mso-style-priority:99;
mso-style-link:"Balloon Text Char";
margin:0in;
margin-bottom:.0001pt;
font-size:8.0pt;
font-family:"Tahoma","sans-serif";}
span.BalloonTextChar
{mso-style-name:"Balloon Text Char";
mso-style-priority:99;
mso-style-link:"Balloon Text";
font-family:"Tahoma","sans-serif";}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle22
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle23
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle24
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle25
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle26
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle27
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle28
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Leonid Keller [mailto:leonid@mellanox.com]
<br>
<b>Sent:</b> Wednesday, January 18, 2012 12:57 AM<br>
<b>To:</b> Smith, Stan; Uri Habusha; Fab Tillier; Hefty, Sean<br>
<b>Cc:</b> ofw_list; Irena Gannon<br>
<b>Subject:</b> RE: Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Thank you, Stan.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">A quick question after a quick look at the patch:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">I understand that with this patch the MAD receiver thread will exit
<span style="background:yellow;mso-highlight:yellow">after some time</span>.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">The original problem was that umad_read() calls were issued repeatedly during and after the HCA/port close: the for() loop was infinite, and the error handler simply continued the loop.<o:p></o:p></p>
<p class="MsoNormal">Eventually the opensm exit() clobbered the umad_receiver() thread during process tear-down. Your testing exposed a race condition.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The fix allows at most one more umad_read() (0 or 1) and then exits the for() loop, which in turn returns from umad_receiver() and terminates the CL thread.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">I do not understand why the crash can’t happen during this time.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">umad_receiver_stop() is called, which clears the umad_receiver() for() loop execution condition.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Possible umad_receiver() outcomes:<o:p></o:p></p>
<p class="MsoNormal"> umad_receiver() is not in umad_read(); hence the for() loop check fails and the loop exits.<o:p></o:p></p>
<p class="MsoNormal"> umad_receiver() is blocked in umad_read() call; when umad_read() returns there will either be a valid MAD or an error (device closed) – either way the for() loop check will exit.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The point is that umad_read() will not be called after the HCA/port is closed.<o:p></o:p></p>
<p class="MsoNormal">cl_thread_destroy() was unable to interrupt the umad_read() call, or, if it did, another umad_read() was issued before the thread terminated.<o:p></o:p></p>
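<p class="MsoNormal">The two outcomes above can be modelled roughly as follows. This is only a single-threaded sketch, not OpenSM or libibumad code: sim_t, sim_read() and receiver_loop() are hypothetical names, the shutdown is simulated as happening while the receiver is blocked inside one particular read, and max_iter exists only to bound the unpatched loop. A flag that is really shared between threads would normally also need to be atomic or otherwise synchronized.<o:p></o:p></p>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the receiver state; not part of OpenSM. */
typedef struct {
    bool run;               /* plays the role of umad_receiver_run */
    bool port_open;         /* models the HCA/port state */
    int  close_on_read;     /* the read during which shutdown happens */
    int  reads;             /* reads issued so far */
    int  reads_after_close; /* reads issued once the port was closed */
} sim_t;

/* Models a blocking read: 0 means a MAD arrived, -1 means device closed. */
static int sim_read(sim_t *s)
{
    if (!s->port_open)
        s->reads_after_close++;     /* the case the patch tries to avoid */
    s->reads++;
    if (s->reads == s->close_on_read) {
        /* The main thread shuts down while this read is blocked. */
        s->run = false;             /* what the stop routine does */
        s->port_open = false;       /* port close */
    }
    return s->port_open ? 0 : -1;
}

/* Runs the receive loop; returns the number of reads issued after close.
 * honor_flag == false models the old for (;;) loop. */
static int receiver_loop(sim_t *s, bool honor_flag, int max_iter)
{
    int i = 0;

    assert(s && max_iter > 0);
    for (s->run = true; (!honor_flag || s->run) && i < max_iter; i++) {
        if (sim_read(s) < 0)
            continue;               /* error handler just continues */
        /* a real receiver would dispatch the MAD here */
    }
    return s->reads_after_close;
}
```

<p class="MsoNormal">With the flag honored, the loop ends right after the read that was pending at shutdown; without it, every later iteration issues a read against a closed port.<o:p></o:p></p>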
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">I’d expect umad_receiver_stop() to wait until the umad_receiver_run variable gets a value meaning “OK, I’m done, dude”. :-)<o:p></o:p></span></p>
<p class="MsoNormal">That is a potential deadlock, since the caller of umad_receiver_stop() is also the one that eventually closes the HCA/port. For umad_receiver_stop() to receive confirmation that umad_receiver() has stopped without the HCA being closed first, one more MAD would have to be received so that umad_receiver() can exit the for() loop and notify/release umad_receiver_stop().<o:p></o:p></p>
<p class="MsoNormal">I’d vote for keep-it-simple (KIS).<o:p></o:p></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">BTW, does this problem affect only reading MADs?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">I don’t know, yet to be discovered.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Is it possible that one thread is sending a MAD while the main one is closing the port ?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">One would think the MAD driver (winmad) would disallow closing while a Transmit is in progress.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Stan.<o:p></o:p></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Smith, Stan
<a href="mailto:[mailto:stan.smith@intel.com]">[mailto:stan.smith@intel.com]</a> <br>
<b>Sent:</b> Tuesday, January 17, 2012 10:29 PM<br>
<b>To:</b> Leonid Keller; Uri Habusha; Fab Tillier; Hefty, Sean<br>
<b>Cc:</b> ofw_list; Irena Gannon<br>
<b>Subject:</b> RE: Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Leo,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> Please try the enclosed patch which allows the umad_receiver() thread to exit once umad_receiver_stop() has been called [overall the same effect as cl_thread_destroy()].<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">This is not a fix for winmad() but a way of ’not’ calling umad_read() after umad_receiver_stop() has been called.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Stan.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">--- ulp/opensm/user/libvendor/osm_vendor_ibumad.c Thu Jan 12 15:27:32 2012<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+++ ulp/opensm/user/libvendor/osm_vendor_ibumad.c Tue Jan 17 12:15:11 2012<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">@@ -67,6 +67,10 @@<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">#include &lt;opensm/osm_helper.h&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">#include &lt;vendor/osm_vendor_api.h&gt;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+#ifdef __WIN__<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+static int umad_receiver_run;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+#endif<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">/****s* OpenSM: Vendor UMAD/osm_umad_bind_info_t<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> * NAME<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> * osm_umad_bind_info_t<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">@@ -253,7 +257,11 @@<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> OSM_LOG_ENTER(p_ur->p_log);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+#ifdef __WIN__<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+ for (umad_receiver_run=1; umad_receiver_run;) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+#else<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> for (;;) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+#endif<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (!umad &&<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> !(umad = umad_alloc(1, umad_size() + MAD_BLOCK_SIZE))) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> OSM_LOG(p_ur->p_log, OSM_LOG_ERROR, "ERR 5403: "<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">@@ -469,9 +477,7 @@<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> pthread_join(p_ur->tid, NULL);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> p_ur->tid = 0;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">#else<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">- /* XXX hangs current thread - suspect umad_recv() ignoring wakeup.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">- * cl_thread_destroy(&p_ur->tid);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">- */<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">+ umad_receiver_run = 0;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">#endif<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> p_ur->p_vend = NULL;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> p_ur->p_log = NULL;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Leonid Keller
<a href="mailto:[mailto:leonid@mellanox.com]">[mailto:leonid@mellanox.com]</a> <br>
<b>Sent:</b> Tuesday, January 17, 2012 12:57 AM<br>
<b>To:</b> Uri Habusha; Fab Tillier; Smith, Stan; Hefty, Sean<br>
<b>Cc:</b> ofw_list; Irena Gannon<br>
<b>Subject:</b> RE: Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">I believe, there are two questions here.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Why does Opensm close the port while the MAD receiving thread is still running?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">Hello Leo. I suspect the above is the root of the problem. A long time ago (opensm 3.3.6), opensm would hang during shutdown because the MAD receiving thread ignored the destroy-thread call if the reader thread was in the winmad read().<o:p></o:p></p>
<p class="MsoNormal">Consequently, the MAD receiver thread destroy was commented out and opensm shut down cleanly.<o:p></o:p></p>
<p class="MsoNormal">Perhaps changes in IBAL have altered the timing so that shutdown is delayed long enough for the MAD reader thread to run again/further?<o:p></o:p></p>
<p class="MsoNormal">A SWAG at this point – I’ll experiment today and let you know.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Stan.<o:p></o:p></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">[LK] Hello Stan. It looks like a Smart WAG. :-)<o:p></o:p></p>
<p class="MsoNormal">But I believe we all agree with Fab that the problem should be fixed at the WinMad level. We have other WinMad applications besides opensm.<o:p></o:p></p>
<p class="MsoNormal">I hope Sean will have a chance to look at it.<o:p></o:p></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Why doesn’t Winmad expect that WmIoRead and WmProviderCleanup can be called simultaneously from different threads?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Please, look at the code below.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">WmIoRead gets the provider without taking any lock and without checking whether it is valid.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">It then calls WmProviderRead, which doesn’t take any reference but handles the MAD under WdfObjectAcquireLock(pProvider->ReadQueue), and that queue may already have been destroyed (as it was in our case).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">WmFileCleanup removes the provider from the list under a mutex that WmIoRead knows nothing about, and then calls WmProviderCleanup.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">The latter waits on a reference count that WmIoRead also knows nothing about, and then calls WdfIoQueuePurgeSynchronously(pProvider->ReadQueue) without WdfObjectAcquireLock(pProvider->ReadQueue).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">No doubt I’m missing something here; I only did a quick review of the code.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Maybe WinMad presumes that the application first stops I/O, then sends WM_IOCTL_DEREGISTER, and only then closes the file handle, so the right way for us is just to fix the bad boy opensm? :-)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">As Uri stated, we do not know how to reproduce it: opensm gets started and stopped 1000 times during a nightly regression run, so it’s not 1 in 100, it’s 1 in 100000. :-(<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">We’d very much appreciate it if you could find the time to review the code.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">(Of course, the same problem may exist for WmIoWrite)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">TIA<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">static VOID WmIoRead(WDFQUEUE Queue, WDFREQUEST Request, size_t Length)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">{<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WDFFILEOBJECT file;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WM_PROVIDER *prov;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> UNREFERENCED_PARAMETER(Queue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> file = WdfRequestGetFileObject(Request);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> prov = WmProviderGetContext(file);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WmProviderRead(prov, Request);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">void WmProviderRead(WM_PROVIDER *pProvider, WDFREQUEST Request)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">{<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WM_REGISTRATION *reg;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> NTSTATUS status;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WM_IO_MAD *wmad;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> size_t outlen, len = 0;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> status = WdfRequestRetrieveOutputBuffer(Request, sizeof(WM_IO_MAD), &wmad, &outlen);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (!NT_SUCCESS(status)) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> goto out;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfObjectAcquireLock(pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (pProvider->MadHead == NULL) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> status = WdfRequestForwardToIoQueue(Request, pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfObjectReleaseLock(pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (NT_SUCCESS(status)) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> return;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> goto out;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> len = outlen;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> status = WmCopyMad(wmad, pProvider->MadHead, &len);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (status == STATUS_SUCCESS) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> reg = (WM_REGISTRATION *) pProvider->MadHead->context1;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> reg->pDevice->IbInterface.put_mad(__FILE__, __LINE__, WmRemoveMad(pProvider));<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfObjectReleaseLock(pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">out:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfRequestCompleteWithInformation(Request, status, len);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">static VOID WmFileCleanup(WDFFILEOBJECT FileObject)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">{<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WM_PROVIDER *prov = WmProviderGetContext(FileObject);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> KeAcquireGuardedMutex(&Lock);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> RemoveEntryList(&prov->Entry);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> KeReleaseGuardedMutex(&Lock);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WmProviderCleanup(prov);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">void WmProviderCleanup(WM_PROVIDER *pProvider)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">{<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WM_REGISTRATION *reg;<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> while ((reg = IndexListRemoveHead(&pProvider->RegIndex)) != NULL) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WmRegFree(reg);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> if (InterlockedDecrement(&pProvider->Ref) > 0) {<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> KeWaitForSingleObject(&pProvider->Event, Executive, KernelMode, FALSE, NULL);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfIoQueuePurgeSynchronously(pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> WdfObjectDelete(pProvider->ReadQueue);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"> IndexListDestroy(&pProvider->RegIndex);<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">}<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Uri Habusha
<br>
<b>Sent:</b> Tuesday, January 17, 2012 7:16 AM<br>
<b>To:</b> Fab Tillier; Smith, Stan; Leonid Keller; Hefty, Sean<br>
<b>Cc:</b> ofw_list; Irena Gannon<br>
<b>Subject:</b> RE: Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Again you miss the point.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">We do not always have clear repro steps. If we had them, life would be simple.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">This issue happened during a nightly run of our regression. The same test runs every night and this is the first time it has happened, which means it is a race/timing issue.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">In such a case we need to review the code, start thinking about what can cause the issue, and add instrumentation code so that the next time it happens we understand the scenario better.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Uri<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Fab Tillier
<a href="mailto:[mailto:ftillier@microsoft.com]">[mailto:ftillier@microsoft.com]</a>
<br>
<b>Sent:</b> Tuesday, January 17, 2012 2:47 AM<br>
<b>To:</b> Smith, Stan; Leonid Keller; Hefty, Sean<br>
<b>Cc:</b> Uri Habusha; ofw_list; Irena Gannon<br>
<b>Subject:</b> RE: Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">10D is the bugcheck code, WDF_VIOLATION. The next 4 parameters are the bugcheck arguments.<o:p></o:p></span></p>
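<p class="MsoNormal">For readers unfamiliar with the shorthand, the report string can be split into the bugcheck code and its four arguments as sketched below. bugcheck_t and parse_bugcheck() are hypothetical helper names used only for illustration; they are not part of any debugger or driver API, and the sketch assumes every field is written in hexadecimal and fits in 32 bits, as in this report.<o:p></o:p></p>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for a bugcheck code and its four arguments. */
typedef struct {
    unsigned int code;
    unsigned int arg[4];
} bugcheck_t;

/* Parses a string such as "10D, {b, 76157f00, 0, 8811d008}".
 * Returns 0 on success, -1 if the string does not match that shape. */
static int parse_bugcheck(const char *s, bugcheck_t *out)
{
    int n;

    assert(s && out);
    n = sscanf(s, "%x, {%x, %x, %x, %x}", &out->code,
               &out->arg[0], &out->arg[1], &out->arg[2], &out->arg[3]);
    return n == 5 ? 0 : -1;
}
```

<p class="MsoNormal">For the string in this report, code comes out as 0x10D (WDF_VIOLATION) and arg[0] as 0xb.<o:p></o:p></p>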
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Leonid, there’s good detail in this report, but you’re missing a critical ‘repro steps’ section that would allow this to be reproduced and debugged by the bug owner. Crashdump information alone is usually only
enough for simple bugs, and your analysis is probably on the right track. Still, if you want someone else to look into it, it’s best to tell them how they too can see it – debugging things live is more productive than debugging by code inspection. How are
Sean or Stan supposed to test that they have indeed fixed a bug, assuming they find something suspect and make code changes, when they have no idea how you triggered it? Is it a consistent repro? 1 in 100?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Thanks,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">-Fab<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">
<a href="mailto:ofw-bounces@lists.openfabrics.org">ofw-bounces@lists.openfabrics.org</a>
<a href="mailto:[mailto:ofw-bounces@lists.openfabrics.org]">[mailto:ofw-bounces@lists.openfabrics.org]</a>
<b>On Behalf Of </b>Smith, Stan<br>
<b>Sent:</b> Monday, January 16, 2012 4:37 PM<br>
<b>To:</b> Leonid Keller; Hefty, Sean<br>
<b>Cc:</b> Uri Habusha; ofw_list; Irena Gannon<br>
<b>Subject:</b> Re: [ofw] Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">What are the conditions required to generate the BSOD?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">From the terse description it appears you are shutting down OpenSM. How is this accomplished?
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal">Opensm - 10D, {b, 76157f00, 0, 8811d008}. Not clear what this statement implies? Please explain.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="color:#1F497D">Stan.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Leonid Keller
<a href="mailto:[mailto:leonid@mellanox.com]">[mailto:leonid@mellanox.com]</a> <br>
<b>Sent:</b> Sunday, January 15, 2012 3:09 AM<br>
<b>To:</b> Smith, Stan; Hefty, Sean<br>
<b>Cc:</b> Uri Habusha; Irena Gannon; ofw_list<br>
<b>Subject:</b> Opensm & WinMad: a race, causing BSOD722<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi guys,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We got a BSOD in Opensm - 10D, {b, 76157f00, 0, 8811d008}.<o:p></o:p></p>
<p class="MsoNormal">Could you take a look ?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><u>My analysis:<o:p></o:p></u></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It seems the BSOD was caused by a race between the main thread and the MAD reading thread of Opensm.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The main thread has already closed the port and is now in osm_subn_destroy():<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">opensm_main<o:p></o:p></p>
<p class="MsoNormal"> …<o:p></o:p></p>
<p class="MsoNormal"> osm_mad_pool_destroy(&p_osm->mad_pool);<o:p></o:p></p>
<p class="MsoNormal"> osm_vendor_delete(&p_osm->p_vendor); // port release<o:p></o:p></p>
<p class="MsoNormal"> <span style="background:aqua;mso-highlight:aqua">
osm_subn_destroy(&p_osm->subn);</span> // the main thread is here now<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The reading thread is still in action:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">opensm!umad_receiver<o:p></o:p></p>
<p class="MsoNormal"> libibumad!umad_recv<o:p></o:p></p>
<p class="MsoNormal"> …<o:p></o:p></p>
<p class="MsoNormal"> winmad!WmIoRead<o:p></o:p></p>
<p class="MsoNormal"> winmad!WmProviderRead<o:p></o:p></p>
<p class="MsoNormal" style="margin-left:1.0in;text-indent:.5in"><span style="background:yellow;mso-highlight:yellow">WdfObjectAcquireLock(pProvider->ReadQueue);</span>
<span style="background:yellow;mso-highlight:yellow">// BSOD</span><o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">An attempt to dump ReadQueue with !wdfqueue fails: the queue is reported as Deleted and Disposing.<o:p></o:p></p>
<p class="MsoNormal">It seems that <i>pProvider</i> has already been released, but WmProviderRead() does not check its validity.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><i>Possible solution:<o:p></o:p></i></p>
<p class="MsoNormal">Maybe WmIoRead() should check that the Provider is not being released and take a reference on it, while WmProviderRemoveHandler() should wait for that reference to be dropped?<o:p></o:p></p>
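<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">As a rough illustration of that idea, here is a user-space sketch that uses pthreads rather than KMDF primitives. All names in it (provider_t, provider_get, provider_put, provider_remove, provider_read) are hypothetical and are not the winmad code; a kernel version would presumably use the driver's own lock and event objects instead. It only shows the pattern: a reader takes a reference unless removal has started, and removal waits until all references are dropped.<o:p></o:p></p>

```c
/* User-space sketch of the "take a reference, drain on remove" pattern.
 * Hypothetical names throughout; this is NOT the winmad implementation. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct provider {
    pthread_mutex_t lock;
    pthread_cond_t  drained;   /* signaled when refs drops to zero */
    int             refs;      /* number of in-flight requests */
    bool            removing;  /* set once teardown has begun */
} provider_t;

static void provider_init(provider_t *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->drained, NULL);
    p->refs = 0;
    p->removing = false;
}

/* Take a reference; fails once removal has started. */
static bool provider_get(provider_t *p)
{
    bool ok;

    pthread_mutex_lock(&p->lock);
    ok = !p->removing;
    if (ok)
        p->refs++;
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Drop a reference and wake the remover when the last one goes away. */
static void provider_put(provider_t *p)
{
    pthread_mutex_lock(&p->lock);
    assert(p->refs > 0);
    if (--p->refs == 0)
        pthread_cond_broadcast(&p->drained);
    pthread_mutex_unlock(&p->lock);
}

/* Remove path: refuse new readers, then wait for current ones to finish. */
static void provider_remove(provider_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->removing = true;
    while (p->refs > 0)
        pthread_cond_wait(&p->drained, &p->lock);
    pthread_mutex_unlock(&p->lock);
    /* Only at this point would it be safe to delete the read queue. */
}

/* Read path: returns 0 on success, -1 if the provider is going away. */
static int provider_read(provider_t *p)
{
    if (!provider_get(p))
        return -1;
    /* ... the read queue would be locked and used here ... */
    provider_put(p);
    return 0;
}
```

<p class="MsoNormal">With this shape, a read issued before provider_remove() returns 0, and a read issued after it returns -1 instead of touching a deleted queue. The WM_PROVIDER dump below does show Ref and Event fields, but whether they are intended for this purpose is an assumption that would need to be checked against the driver source.<o:p></o:p></p>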
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b><u>Technical info:<o:p></o:p></u></b></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// Bugcheck<o:p></o:p></p>
<p class="MsoNormal">WDF_VIOLATION (10d)<o:p></o:p></p>
<p class="MsoNormal">The Kernel-Mode Driver Framework has detected that Windows detected an error<o:p></o:p></p>
<p class="MsoNormal">in a framework-based driver. In general, the dump file will yield additional<o:p></o:p></p>
<p class="MsoNormal">information about the driver that caused this bug check.<o:p></o:p></p>
<p class="MsoNormal">Arguments:<o:p></o:p></p>
<p class="MsoNormal">Arg1: 0000000b, An attempt to acquire or release a lock was invalid. In this<o:p></o:p></p>
<p class="MsoNormal"> case, Parameter 3 further specifies the error that has been<o:p></o:p></p>
<p class="MsoNormal"> made.<o:p></o:p></p>
<p class="MsoNormal">Arg2: 76157f00, The handle value.<o:p></o:p></p>
<p class="MsoNormal">Arg3: 00000000, A handle passed to either WdfObjectAcquireLock or<o:p></o:p></p>
<p class="MsoNormal"> WdfObjectReleaseLock represents an object that does not<o:p></o:p></p>
<p class="MsoNormal"> support synchronization locks.<o:p></o:p></p>
<p class="MsoNormal">Arg4: 8811d008, Reserved.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// crash line<o:p></o:p></p>
<p class="MsoNormal">void WmProviderRead(WM_PROVIDER *pProvider, WDFREQUEST Request)<o:p></o:p></p>
<p class="MsoNormal">{<o:p></o:p></p>
<p class="MsoNormal"> WM_REGISTRATION *reg;<o:p></o:p></p>
<p class="MsoNormal"> NTSTATUS status;<o:p></o:p></p>
<p class="MsoNormal"> WM_IO_MAD *wmad;<o:p></o:p></p>
<p class="MsoNormal"> size_t outlen, len = 0;<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> status = WdfRequestRetrieveOutputBuffer(Request, sizeof(WM_IO_MAD), &wmad, &outlen);<o:p></o:p></p>
<p class="MsoNormal"> if (!NT_SUCCESS(status)) {<o:p></o:p></p>
<p class="MsoNormal"> goto out;<o:p></o:p></p>
<p class="MsoNormal"> }<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> <span style="background:yellow;mso-highlight:yellow">
WdfObjectAcquireLock(pProvider->ReadQueue);</span><o:p></o:p></p>
<p class="MsoNormal"> if (pProvider->MadHead == NULL) {<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// Local parameters<o:p></o:p></p>
<p class="MsoNormal">3: kd> ?? pProvider<o:p></o:p></p>
<p class="MsoNormal">struct _WM_PROVIDER * 0x894761c0<o:p></o:p></p>
<p class="MsoNormal"> +0x000 Entry : _LIST_ENTRY [ 0x8e3ea070 - 0x8e3ea070 ]<o:p></o:p></p>
<p class="MsoNormal"> +0x008 RegIndex : _INDEX_LIST<o:p></o:p></p>
<p class="MsoNormal"> <span style="background:yellow;mso-highlight:yellow">+0x014 ReadQueue : 0x76157f00 WDFQUEUE__</span><o:p></o:p></p>
<p class="MsoNormal"> +0x018 MadHead : (null) <o:p></o:p></p>
<p class="MsoNormal"> +0x01c MadTail : 0xbc67ccc8 _ib_mad_element<o:p></o:p></p>
<p class="MsoNormal"> +0x020 Lock : _KGUARDED_MUTEX<o:p></o:p></p>
<p class="MsoNormal"> +0x040 Ref : 0<o:p></o:p></p>
<p class="MsoNormal"> +0x044 Event : _KEVENT<o:p></o:p></p>
<p class="MsoNormal"> +0x054 Pending : 0<o:p></o:p></p>
<p class="MsoNormal"> +0x058 Active : 0<o:p></o:p></p>
<p class="MsoNormal"> +0x05c SharedEvent : _KEVENT<o:p></o:p></p>
<p class="MsoNormal"> +0x06c Exclusive : 0<o:p></o:p></p>
<p class="MsoNormal"> +0x070 ExclusiveEvent : _KEVENT<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// ReadQueue<o:p></o:p></p>
<p class="MsoNormal">3: kd> !wdfqueue 0x76157f00<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Dumping WDFQUEUE <span style="background:yellow;mso-highlight:yellow">
0x76157f00</span><o:p></o:p></p>
<p class="MsoNormal">=========================<o:p></o:p></p>
<p class="MsoNormal">Manualcouldn't read 00000044<o:p></o:p></p>
<p class="MsoNormal">, Deleted, Disposing, Not power-managed, PowerOn, Cannot accept, Can dispatch, Dispatching, ExecutionLevelDispatch, SynchronizationScopeNone<o:p></o:p></p>
<p class="MsoNormal"> Number of driver owned requests: 0<o:p></o:p></p>
<p class="MsoNormal"> Number of waiting requests: 0<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">// opensm <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">3: kd> !PROCESS a14ba6c0<o:p></o:p></p>
<p class="MsoNormal">PROCESS a14ba6c0 SessionId: 1 Cid: 13fc Peb: 7ffd9000 ParentCid: 10fc<o:p></o:p></p>
<p class="MsoNormal"> DirBase: bd5feee0 ObjectTable: bf7fa7b8 HandleCount: 41.<o:p></o:p></p>
<p class="MsoNormal"> Image: opensm.exe<o:p></o:p></p>
<p class="MsoNormal"> VadRoot 89c8ca38 Vads 51 Clone 0 Private 271. Modified 3. Locked 0.<o:p></o:p></p>
<p class="MsoNormal"> DeviceMap 81f41008<o:p></o:p></p>
<p class="MsoNormal"> Token a6e81030<o:p></o:p></p>
<p class="MsoNormal"> ElapsedTime 00:00:00.890<o:p></o:p></p>
<p class="MsoNormal"> UserTime 00:00:00.015<o:p></o:p></p>
<p class="MsoNormal"> KernelTime 00:00:00.062<o:p></o:p></p>
<p class="MsoNormal"> QuotaPoolUsage[PagedPool] 0<o:p></o:p></p>
<p class="MsoNormal"> QuotaPoolUsage[NonPagedPool] 0<o:p></o:p></p>
<p class="MsoNormal"> Working Set Sizes (now,min,max) (913, 50, 345) (3652KB, 200KB, 1380KB)<o:p></o:p></p>
<p class="MsoNormal"> PeakWorkingSetSize 930<o:p></o:p></p>
<p class="MsoNormal"> VirtualSize 76 Mb<o:p></o:p></p>
<p class="MsoNormal"> PeakVirtualSize 78 Mb<o:p></o:p></p>
<p class="MsoNormal"> PageFaultCount 1047<o:p></o:p></p>
<p class="MsoNormal"> MemoryPriority BACKGROUND<o:p></o:p></p>
<p class="MsoNormal"> BasePriority 8<o:p></o:p></p>
<p class="MsoNormal"> CommitCharge 8768<o:p></o:p></p>
<p class="MsoNormal"> Job 89e04368<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> THREAD 89e05d48 Cid 13fc.1758 Teb: 7ffdf000 Win32Thread: fe333b18 RUNNING on processor 2<o:p></o:p></p>
<p class="MsoNormal"> Not impersonating<o:p></o:p></p>
<p class="MsoNormal"> DeviceMap 81f41008<o:p></o:p></p>
<p class="MsoNormal"> Owning Process 0 Image: <Unknown><o:p></o:p></p>
<p class="MsoNormal"> Attached Process a14ba6c0 Image: opensm.exe<o:p></o:p></p>
<p class="MsoNormal"> Wait Start TickCount 5900630 Ticks: 1 (0:00:00:00.015)<o:p></o:p></p>
<p class="MsoNormal"> Context Switch Count 454 <o:p></o:p></p>
<p class="MsoNormal"> UserTime 00:00:00.015<o:p></o:p></p>
<p class="MsoNormal"> KernelTime 00:00:00.578<o:p></o:p></p>
<p class="MsoNormal"> Win32 Start Address opensm!mainCRTStartup (0x0042aaf5)<o:p></o:p></p>
<p class="MsoNormal"> Stack Init adf03fd0 Current adf038d0 Base adf04000 Limit adf01000 Call 0<o:p></o:p></p>
<p class="MsoNormal"> Priority 11 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5<o:p></o:p></p>
<p class="MsoNormal"> ChildEBP RetAddr <o:p></o:p></p>
<p class="MsoNormal"> adf03cec 826be3a4 nt!KdPollBreakIn+0xea<o:p></o:p></p>
<p class="MsoNormal"> adf03cf0 826be381 nt!KdCheckForDebugBreak+0x17 (FPO: [0,0,0])<o:p></o:p></p>
<p class="MsoNormal"> adf03d24 8261f430 nt!KeUpdateRunTime+0x164<o:p></o:p></p>
<p class="MsoNormal"> adf03d24 003fd34a hal!HalpClockInterruptPn+0x158 (FPO: [0,2] TrapFrame @ adf03d34)<o:p></o:p></p>
<p class="MsoNormal"> 000b2918 003d6174 opensm!osm_subn_destroy+0x1da (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\opensm\user\opensm\osm_subnet.c @ 501]<o:p></o:p></p>
<p class="MsoNormal"> 000b2924 003ba03e opensm!osm_opensm_destroy+0x104 (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\opensm\user\opensm\osm_opensm.c @ 313]<o:p></o:p></p>
<p class="MsoNormal"> 000cfda8 003bab21 opensm!opensm_main+0x1c3e (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\opensm\user\opensm\main.c @ 1264]<o:p></o:p></p>
<p class="MsoNormal"> 000cfdf4 0042a9c4 opensm!main+0x191 (FPO: [Non-Fpo]) (CONV: cdecl) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\opensm\user\opensm\main.c @ 1305]<o:p></o:p></p>
<p class="MsoNormal"> 000cfe38 771c1194 opensm!__mainCRTStartup+0x102 (FPO: [Non-Fpo]) (CONV: cdecl) [d:\5359\minkernel\crts\crtw32\dllstuff\crtexe.c @ 695]<o:p></o:p></p>
<p class="MsoNormal"> 000cfe44 7753b3f5 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 000cfe84 7753b3c8 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 000cfe9c 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"> THREAD c8974030 Cid 13fc.16f4 Teb: 7ffd6000 Win32Thread: 00000000 RUNNING on processor 3<o:p></o:p></p>
<p class="MsoNormal"> IRP List:<o:p></o:p></p>
<p class="MsoNormal"> 89e505b0: (0006,0094) Flags: 00060970 Mdl: 00000000<o:p></o:p></p>
<p class="MsoNormal"> Not impersonating<o:p></o:p></p>
<p class="MsoNormal"> DeviceMap 81f41008<o:p></o:p></p>
<p class="MsoNormal"> Owning Process 0 Image: <Unknown><o:p></o:p></p>
<p class="MsoNormal"> Attached Process a14ba6c0 Image: opensm.exe<o:p></o:p></p>
<p class="MsoNormal"> Wait Start TickCount 5900629 Ticks: 2 (0:00:00:00.031)<o:p></o:p></p>
<p class="MsoNormal"> Context Switch Count 24 <o:p></o:p></p>
<p class="MsoNormal"> UserTime 00:00:00.000<o:p></o:p></p>
<p class="MsoNormal"> KernelTime 00:00:00.015<o:p></o:p></p>
<p class="MsoNormal"> Win32 Start Address complib!cl_thread_callback (0x72f409f0)<o:p></o:p></p>
<p class="MsoNormal"> Stack Init adf57fd0 Current adf57b44 Base adf58000 Limit adf55000 Call 0<o:p></o:p></p>
<p class="MsoNormal"> Priority 10 BasePriority 8 PriorityDecrement 2 IoPriority 2 PagePriority 5<o:p></o:p></p>
<p class="MsoNormal"> ChildEBP RetAddr <o:p></o:p></p>
<p class="MsoNormal"> adf576d4 8272fe71 nt!RtlpBreakWithStatusInstruction (FPO: [1,0,0])<o:p></o:p></p>
<p class="MsoNormal"> adf57724 8273096d nt!KiBugCheckDebugBreak+0x1c<o:p></o:p></p>
<p class="MsoNormal"> adf57ae8 8272fd10 nt!KeBugCheck2+0x68b<o:p></o:p></p>
<p class="MsoNormal"> adf57b08 8d956f9e nt!KeBugCheckEx+0x1e<o:p></o:p></p>
<p class="MsoNormal"> adf57b24 8d955829 Wdf01000!FxVerifierBugCheck+0x24 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57b4c 8e3e6626 Wdf01000!imp_WdfObjectAcquireLock+0x26 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57b5c 8e3e6a3c winmad!WdfObjectAcquireLock+0x16 (FPO: [Non-Fpo]) (CONV: stdcall) [c:\winddk\7600.16385.1\inc\wdf\kmdf\1.9\wdfsync.h @ 61]<o:p></o:p></p>
<p class="MsoNormal"> adf57b7c 8e3e5cdd winmad!WmProviderRead+0x3c (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\core\winmad\kernel\wm_provider.c @ 278]<o:p></o:p></p>
<p class="MsoNormal"> adf57b94 8d94502a winmad!WmIoRead+0x2d (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\core\winmad\kernel\wm_driver.c @ 128]<o:p></o:p></p>
<p class="MsoNormal"> adf57bb0 8d946256 Wdf01000!FxIoQueueIoRead::Invoke+0x2a (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57bd8 8d9489ac Wdf01000!FxIoQueue::DispatchRequestToDriver+0x1a3 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57bf4 8d949a36 Wdf01000!FxIoQueue::DispatchEvents+0x3be (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57c14 8d94b824 Wdf01000!FxIoQueue::QueueRequest+0x1ec (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57c38 8d93aa3f Wdf01000!FxPkgIo::Dispatch+0x27d (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57c44 8268f4bc Wdf01000!FxDevice::Dispatch+0x7f (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> adf57c5c 82890f6e nt!IofCallDriver+0x63<o:p></o:p></p>
<p class="MsoNormal"> adf57c7c 828a2f32 nt!IopSynchronousServiceTail+0x1f8<o:p></o:p></p>
<p class="MsoNormal"> adf57d08 8269644a nt!NtReadFile+0x644<o:p></o:p></p>
<p class="MsoNormal"> adf57d08 775264f4 nt!KiFastCallEntry+0x12a (FPO: [0,3] TrapFrame @ adf57d34)<o:p></o:p></p>
<p class="MsoNormal"> 007cfb84 7752570c ntdll!KiFastSystemCallRet (FPO: [0,0,0])<o:p></o:p></p>
<p class="MsoNormal"> 007cfb88 757af249 ntdll!ZwReadFile+0xc (FPO: [9,0,0])<o:p></o:p></p>
<p class="MsoNormal"> 007cfbec 771bdafd KERNELBASE!ReadFile+0xaa (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 007cfc34 72fd1a25 kernel32!ReadFileImplementation+0xf0 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 007cfc58 74272910 winmad_72fd0000!CWMProvider::Receive+0xa5 (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\core\winmad\user\wm_provider.cpp @ 227]<o:p></o:p></p>
<p class="MsoNormal"> 007cfc7c 0042d76a libibumad!umad_recv+0x50 (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\libibumad\src\umad.cpp @ 706]<o:p></o:p></p>
<p class="MsoNormal"> 007cfd0c 72f40a04 opensm!umad_receiver+0xba (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\ulp\opensm\user\libvendor\osm_vendor_ibumad.c @ 266]<o:p></o:p></p>
<p class="MsoNormal"> 007cfd18 771c1194 complib!cl_thread_callback+0x14 (FPO: [Non-Fpo]) (CONV: stdcall) [a:\builds\9565\branches\mlnx_winof-3_0_0\core\complib\user\cl_thread.c @ 49]<o:p></o:p></p>
<p class="MsoNormal"> 007cfd24 7753b3f5 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 007cfd64 7753b3c8 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])<o:p></o:p></p>
<p class="MsoNormal"> 007cfd7c 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])<o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</body>
</html>