[ofw][patch][IBBUS] Fix for a bug from Bugzilla 1367

Smith, Stan stan.smith at intel.com
Wed Apr 15 17:39:42 PDT 2009


Hi Leo,
  Here's some work-in-progress starting with your ControlDeviceObject changes plus additions to manage the dos_name symbolic link to the CDO. The symlink is created in fdo_start() and deleted in fdo_release_resources() when the last HCA is removed.
IBAL startup is the same (in fdo_start) and stopped basically the same in fdo_release_resources.
The usage of gh_al as an indicator if IBAL has been initialized turned out to be flawed, it's never reset to NULL so the enable device start after a device disable was never re-initializing IBAL; now using gp_async_proc_mgr which is reset to NULL in al_cleanup() calls.

On Server 2003/x86 with InfiniHost HCAs (x2) the enable/disable cycles work reliably with vstat getting the expected error when no HCAs are enabled.

The only problem I'm seeing is that of the system shutdown bug described in previous email where the IBAL async threads attempt to send on a powered down HCA.

stan.

________________________________
From: Leonid Keller [mailto:leonid at mellanox.co.il]
Sent: Saturday, April 11, 2009 2:41 PM
To: Smith, Stan
Cc: ofw at lists.openfabrics.org
Subject: RE: [ofw][patch][IBBUS] Fix for a bug from Bugzilla 1367

This solution is based on the assumption, that applications work with IBAL and not with concrete HCAs.
The Control Device has to convey an IOCTL to IBAL and it will find the way to pass it down to the appropriate device.
I tried VSTAT with one and two cards, it worked correctly.
Could you bring an example of IOCTL, that breaks this assumption ?


________________________________
From: Smith, Stan [mailto:stan.smith at intel.com]
Sent: Tuesday, April 07, 2009 7:19 PM
To: Leonid Keller
Cc: ofw at lists.openfabrics.org
Subject: RE: [ofw][patch][IBBUS] Fix for a bug from Bugzilla 1367

Hi Leo,
  This patch does not apply, per chance could you send the patched files: bus_driver.c & bus_pnp.c  so I can continue testing?

I'm also concerned about tracking of a 'valid' PDO in ioctl handling.
The CDO gets
+    cl_init_pnp_po_ext( g_ControlDeviceObject, NULL,
+     NULL, bus_globals.dbg_lvl, NULL, NULL );
How can the follow bus_driver.c bus_ioctl.c code function without a valid PDO?
My experiments showed vstat quit working unless the CDO 'always' contained a valid p_next_do?
Which exposed the issue of when removing and HCA, one needed to make sure the HCA/PDO being removed was not the one used by the CDO. If so, then the CDO.p_next_do needed to be refreshed with a 'valid' PDO.
If the HCA being removed is the 'last' HCA, then vstat should indeed fail.

 /* Only pass down if not handled and not PDO device. */
 if( status == STATUS_INVALID_DEVICE_REQUEST && p_ext->cl_ext.p_next_do )
 {
  IoSkipCurrentIrpStackLocation( p_irp );
  status = IoCallDriver( p_ext->cl_ext.p_next_do, p_irp );
 }


thanks,

Stan.

________________________________
From: Leonid Keller [mailto:leonid at mellanox.co.il]
Sent: Sunday, April 05, 2009 6:47 AM
To: Smith, Stan
Cc: ofw at lists.openfabrics.org
Subject: [ofw][patch][IBBUS] Fix for a bug from Bugzilla 1367

Bugzilla 1367: If N cards are installed on a machine, a disabling of the first one will
prevent the work of all applications with the rest cards.

Explanation:
   IBBUS driver creates only first instance of IBAL with a name. All the rest
are unnamed. After removing the first card, the named instance is gone and
applications, which are binding to IBAL by name, can't find it and can't work.

Solution: The fix creates IBBUS Control Device of "\\DosDevices\\Global\\ibal<file://\\DosDevices\\Global\\ibal>" name from IBBUS Driver Entry.
All other devices are created unnamed.



Index: bus/kernel/bus_driver.c
===================================================================
--- bus/kernel/bus_driver.c (revision 2076)
+++ bus/kernel/bus_driver.c (working copy)
@@ -60,6 +60,7 @@


 char node_desc[IB_NODE_DESCRIPTION_SIZE];
+PDEVICE_OBJECT g_ControlDeviceObject=NULL;

 bus_globals_t bus_globals = {
  BUS_DBG_ERROR,
@@ -981,6 +982,11 @@

  RtlInitUnicodeString( &dos_name, L"\\DosDevices\\Global\\ibal<file://\\DosDevices\\Global\\ibal>" );
  IoDeleteSymbolicLink( &dos_name );
+ if (g_ControlDeviceObject)
+ {
+  IoDeleteDevice(g_ControlDeviceObject);
+  g_ControlDeviceObject = NULL;
+ }
  cur_conf = bus_globals.p_pkey_conf;
  while(cur_conf)
  {
@@ -1055,7 +1061,46 @@
  g_bfi_InstanceCount = 0;
  memset(  __out_bcount(sizeof(g_bus_filters)) (void*)g_bus_filters, 0,
    sizeof(g_bus_filters) );
+
+ {
+  UNICODE_STRING dev_name, dos_name;
+  bus_fdo_ext_t *p_ext=NULL;

+  RtlInitUnicodeString( &dev_name, AL_DEVICE_NAME );
+  RtlInitUnicodeString( &dos_name, L"\\DosDevices\\Global\\ibal<file://\\DosDevices\\Global\\ibal>" );
+
+  status = IoCreateDevice( p_driver_obj, sizeof(bus_fdo_ext_t),
+   &dev_name, FILE_DEVICE_BUS_EXTENDER,
+   FILE_DEVICE_SECURE_OPEN, FALSE, &g_ControlDeviceObject );
+  if( !NT_SUCCESS(status) )
+  {
+   g_ControlDeviceObject = NULL;
+   BUS_PRINT( BUS_DBG_ERROR,
+    ("Failed to create ControlDeviceObject, status %x.\n",status) );
+  }
+  else {
+   IoDeleteSymbolicLink( &dos_name );
+   status = IoCreateSymbolicLink( &dos_name, &dev_name );
+   if( !NT_SUCCESS(status) )
+   {
+    IoDeleteDevice( g_ControlDeviceObject );
+    g_ControlDeviceObject = NULL;
+    BUS_PRINT( BUS_DBG_ERROR,
+     ("Failed to create symlink for dos name.\n") );
+   }
+   else {
+    BUS_PRINT( BUS_DBG_ERROR,
+     ("SUCCEEDED to create symlink for ControlDeviceObject.\n") );
+
+    p_ext = g_ControlDeviceObject->DeviceExtension;
+    RtlZeroMemory(p_ext, sizeof *p_ext);
+    cl_init_pnp_po_ext( g_ControlDeviceObject, NULL,
+     NULL, bus_globals.dbg_lvl, NULL, NULL );
+    IoInitializeRemoveLock( &p_ext->cl_ext.stop_lock, 'dtci', 0, 1000 );
+   }
+  }
+ }
+
  BUS_EXIT( BUS_DBG_DRV );
  return STATUS_SUCCESS;
 }
Index: bus/kernel/bus_pnp.c
===================================================================
--- bus/kernel/bus_pnp.c (revision 2077)
+++ bus/kernel/bus_pnp.c (working copy)
@@ -60,6 +60,7 @@
 bus_filter_t   g_bus_filters[MAX_BUS_FILTERS];

 extern ib_al_handle_t gh_al; // NULL if AL needs init.
+extern PDEVICE_OBJECT g_ControlDeviceObject;


 static NTSTATUS
@@ -197,7 +198,7 @@
  /* Create the FDO device object to attach to the stack. */

  /* if 1st Bus Filter Instance, then create device names for user ioctl */
- if ( ic == 1 )
+ if ( ic == 1  && !g_ControlDeviceObject)
  {
   RtlInitUnicodeString( &dev_name, AL_DEVICE_NAME );
   RtlInitUnicodeString( &dos_name, L"\\DosDevices\\Global\\ibal<file://\\DosDevices\\Global\\ibal>" );
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090415/bee10465/attachment.html>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bus_port_mgr.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090415/bee10465/attachment.c>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bus_driver.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090415/bee10465/attachment-0001.c>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bus_iou_mgr.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090415/bee10465/attachment-0002.c>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: bus_pnp.c
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20090415/bee10465/attachment-0003.c>


More information about the ofw mailing list