[ofw] RE: ibbus - Control Device Object - bugzilla #1367

Leonid Keller leonid at mellanox.co.il
Thu Apr 23 01:31:06 PDT 2009


Thanks a lot.
I'll test the patches.
Some questions.
I understand that you are closing IBAL on power down, as the low-level
driver does with the HCA.
Are you opening it again on power up?
Have you tried the patched driver in standby/hibernate scenarios?
That is a must for WHQL.
We now have problems with some clients who are running WHQL system
tests on our December-released version,
especially in the Common_Scenario_Stress and Disable_Enable tests, which
perform disable/enable and power down/up sequences.
Have you ever run them on the driver?
For WHQL they are normally launched from DTM, but they are also
found in the WDK - \WinDDK\6001.18001\tools\WDTF\amd64fre\SampleScripts -
and can be run manually. The instructions for running them are in the
scripts themselves.
In short, one first has to install WDTF:
   \WinDDK\6001.18001\tools\WDTF\amd64fre\InstallWDTF.cmd
and then invoke
   cscript Common_Scenario_Stress_With_IO.wsf
or
   cscript Disable_Enable_With_IO.wsf 
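
For anyone who wants to script the two steps above, here is a minimal
Python sketch that only assembles the command lines in order. The paths
and script names are the ones given in this message; the function name
build_wdtf_commands, the idea of returning (working directory, argv)
pairs, and the assumption that cscript is started from the SampleScripts
directory are mine and are not taken from the WDK documentation.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS

# Paths and script names are copied from the message above.
WDTF_DIR = r"\WinDDK\6001.18001\tools\WDTF\amd64fre"
SCRIPTS = ("Common_Scenario_Stress_With_IO.wsf", "Disable_Enable_With_IO.wsf")


def build_wdtf_commands(wdtf_dir=WDTF_DIR, scripts=SCRIPTS):
    """Return (working_directory, argv) pairs in the order they should run.

    Hypothetical helper: it builds the commands but does not execute them.
    """
    # Assumption: the .wsf scripts are run from the SampleScripts directory.
    sample_dir = ntpath.join(wdtf_dir, "SampleScripts")
    # Step 1: install WDTF.
    steps = [(wdtf_dir, [ntpath.join(wdtf_dir, "InstallWDTF.cmd")])]
    # Step 2: one cscript invocation per test script.
    for script in scripts:
        steps.append((sample_dir, ["cscript", script]))
    return steps


if __name__ == "__main__":
    for cwd, argv in build_wdtf_commands():
        print(f"cd {cwd} && {' '.join(argv)}")
```

Each pair could be handed to subprocess.run(argv, cwd=cwd) on the test
machine; that step is left out here because the scripts themselves
describe any further setup they need.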

It would be great if you could test the patches in some of the ways
described above.
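
Since the failure reported in the quoted message below depends on the
order in which the two HCAs are disabled and enabled, it may help to walk
through every ordering by hand as well. A minimal Python sketch that
enumerates them follows; the function name enable_disable_orderings and
the ("disable"/"enable", HCA number) tuples are illustrative only, and
the sketch does not touch any device.

```python
from itertools import permutations


def enable_disable_orderings(hcas=(1, 2)):
    """List every 'disable all, then enable all' sequence for the given HCAs.

    Hypothetical helper: it only produces the sequences to try manually.
    """
    sequences = []
    for disable_order in permutations(hcas):
        for enable_order in permutations(hcas):
            seq = [("disable", h) for h in disable_order]
            seq += [("enable", h) for h in enable_order]
            sequences.append(seq)
    return sequences


if __name__ == "__main__":
    for seq in enable_disable_orderings():
        print(" -> ".join(f"{action} #{hca}" for action, hca in seq))
```

For two HCAs this gives four sequences. One of them starts with disable
#2, disable #1, enable #2, which is the case reported as failing, and
another is disable #2, #1, then enable #1, #2, which is reported as
working.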
 
> -----Original Message-----
> From: Smith, Stan [mailto:stan.smith at intel.com] 
> Sent: Monday, April 20, 2009 10:41 PM
> To: Leonid Keller
> Cc: ofw at lists.openfabrics.org
> Subject: ibbus - Control Device Object - bugzilla #1367
> 
> 
> Hello,
>   Please review/test-drive these files and see how they work for you.
> My testing shows the Control Device Object implementation 
> works well over multiple enable/disable cycles (WSD needs to 
> be removed 1st in order to prevent mandatory reboot).
> Additionally System shutdown now removes the HCA devices and 
> then shuts down IBAL so IBAL-async threads no longer attempt 
> to send/forward MADs on a shutdown HCA.
> 
> The only problem I find is the ordering of enabling devices 
> exposes what may be a bug in AL MAD pool cleanup.
> HCAs are numbered 1 & 2, HCA 1 is loaded/enabled 1st, then #2.
> If you disable #2, then #1 and then enable #2, the AL MAD 
> layer blows up; corrupted LookAsideList, see enclosed file fail.txt.
> 
> If you disable HCA #2, #1, then enable #1, #2, everything 
> works as expected.
> Failure currently under investigation.
> 
> The CDO code is what you forwarded to me last week with the 
> additions of global dos_name & dev_nam plus removing HCAs on 
> system shutdown; bus_driver.c & bus_pnp.c handle this.
> 
> The bus_port_mgr.c changes are white-space alignment and the 
> use of BUS_TRACE instead of BUS_PRINT.
> 
> Thanks,
> 
> Stan.
> 
> 
> 
> 
> 


