[Openib-windows] RE: Some more additions questions about the patch:

Fab Tillier ftillier at silverstorm.com
Mon Sep 19 10:01:12 PDT 2005


> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
> Sent: Monday, September 19, 2005 12:58 AM
> 
> >-----Original Message-----
> >From: Fab Tillier [mailto:ftillier at silverstorm.com]
> >Sent: Monday, September 19, 2005 7:51 AM
> >To: 'Tzachi Dar'; openib-windows at openib.org
> >Subject: RE: Some more additions questions about the patch:
> >
> >For the kernel, I'm considering going a step further and eliminating the
> >memory tracker - driver verifier has memory leak detection built in, and
> >we might as well just use that instead.  The memory leak detection
> >provided by driver verifier doesn't require any sort of recompile of the
> >drivers, which is an extra benefit.
>
> Does the memory leak detection of the driver verifier also tell where the leak
> was? If not, then there is an advantage to our mechanism; if yes, then our
> mechanism is not really needed.

It does, but not immediately.  When the driver unloads, verifier will issue a
bugcheck with bugcheck code 0xC4.  The first parameter is a subcode which tells
you the reason for the bugcheck.  Reasons 0x60 and 0x62 tell you a driver
was unloaded without freeing all its memory.  The DDK docs describe how to get
the detailed memory information.  Here's what it looks like (when putting in
place the patch to use the unload handler to call CL_DEINIT, I left in two calls
to CL_INIT, so the memory tracker was never freed):

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck C4, {62, 8231de18, 8231ddc0, 20}

Probably caused by : ipoib.sys

Followup: MachineOwner
---------

nt!RtlpBreakWithStatusInstruction:
80871648 cc               int     3
0: kd> dp ViBadDriver L1; dS @$p
808ab138  8231ddd0
8231de18  "ipoib.sys"
0: kd> !verifier 3 ipoib.sys

Verify Level fb ... enabled options are:
	special pool
	special irql
	all pool allocations checked on unload
	Io subsystem checking enabled
	Deadlock detection enabled
	Enhanced Io checking enabled
	DMA checking enabled

Summary of All Verifier Statistics

RaiseIrqls                             0x0
AcquireSpinLocks                       0x955
Synch Executions                       0x0
Trims                                  0x39

Pool Allocations Attempted             0x34
Pool Allocations Succeeded             0x34
Pool Allocations Succeeded SpecialPool 0x34
Pool Allocations With NO TAG           0x0
Pool Allocations Failed                0x0
Resource Allocations Failed Deliberately   0x0

Current paged pool allocations         0x0 for 00000000 bytes
Peak paged pool allocations            0x0 for 00000000 bytes
Current nonpaged pool allocations      0x20 for 00000E40 bytes
Peak nonpaged pool allocations         0x30 for 0044B300 bytes

Driver Verification List

Entry     State           NonPagedPool   PagedPool   Module

8231ddc0 Loaded           00000e40       00000000    ipoib.sys

Current Pool Allocations  00000000    00000020
Current Pool Bytes        00000000    00000e40
Peak Pool Allocations     00000000    00000030
Peak Pool Bytes           00000000    0044b300

PoolAddress  SizeInBytes    Tag       CallersAddress
82638f80     0x00000080     Ddk       b84ff776
82df4f80     0x00000080     Ddk       b84ff776
82612f80     0x00000080     Ddk       b84ff776
82e20f80     0x00000080     Ddk       b84ff776
82de2f80     0x00000080     Ddk       b84ff776
828f2f80     0x00000080     Ddk       b84ff776
82736f80     0x00000080     Ddk       b84ff776
82902f80     0x00000080     Ddk       b84ff776
827e4f80     0x00000080     Ddk       b84ff776
830e2f80     0x00000080     Ddk       b84ff776
82664f80     0x00000080     Ddk       b84ff776
82880f80     0x00000080     Ddk       b84ff776
8250af80     0x00000080     Ddk       b84ff776
82474f80     0x00000080     Ddk       b84ff776
82642f80     0x00000080     Ddk       b84ff776
831eef80     0x00000080     Ddk       b84ff776
831ecff8     0x00000008     Ddk       b84ff776
831eaf80     0x00000080     Ddk       b84ff776
831e8fe0     0x00000020     Ddk       b84ff776
831e6f80     0x00000080     Ddk       b84ff776
831e4fe0     0x00000020     Ddk       b84ff776
831e2f80     0x00000080     Ddk       b84ff776
831e0fe0     0x00000020     Ddk       b84ff776
831dcf80     0x00000080     Ddk       b84ff776
831dafe0     0x00000020     Ddk       b84ff776
831d8f80     0x00000080     Ddk       b84ff776
831d6fa0     0x0000005c     Ddk       b84ff776
831d4f80     0x00000080     Ddk       b84ff776
831d2ff8     0x00000008     Ddk       b84ff776
831d0f80     0x00000080     Ddk       b84ff776
831ceec0     0x0000013c     Ddk       b84ff776
831ccf68     0x00000098     Ddk       b84ff776

Using the address, you can show the disassembly using the 'u' command:

0: kd> u b84ff776
ipoib!__cl_malloc_priv+0x96
[c:\dev\openib\openib\base\core\complib\kernel\cl_memory_osd.c @ 51]:

This actually highlights an issue in complib's memory tracker - it obscures the
actual caller by providing a wrapper to the system allocation calls, so all
allocations point to the internal complib allocator, rather than the caller.
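
One common way around this kind of problem is to have the wrapper record the
call site itself, by hiding the real function behind a macro that passes
__FILE__ and __LINE__.  The following is only a minimal user-mode sketch of that
idea, not complib's actual code: the names TRACKED_MALLOC, tracked_malloc_at,
tracked_free and tracked_report are hypothetical, and a kernel version would
need the pool allocation routines and proper locking instead of malloc and a
fixed-size table.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 64   /* illustrative limit, not taken from complib */

/* One record per outstanding allocation, including the real call site. */
typedef struct {
    void       *ptr;
    size_t      size;
    const char *file;
    int         line;
} track_entry_t;

static track_entry_t g_tracked[MAX_TRACKED];
static size_t        g_tracked_count;

/* Hypothetical tracked allocator: the caller's location is passed in
 * explicitly, so a leak report points at the caller, not at this wrapper. */
static void *tracked_malloc_at(size_t size, const char *file, int line)
{
    void *p;

    assert(file != NULL);
    p = malloc(size);
    if (p != NULL && g_tracked_count < MAX_TRACKED) {
        track_entry_t *e = &g_tracked[g_tracked_count++];
        e->ptr  = p;
        e->size = size;
        e->file = file;
        e->line = line;
    }
    return p;
}

/* The macro expands at the call site, so __FILE__ and __LINE__ identify it. */
#define TRACKED_MALLOC(size) tracked_malloc_at((size), __FILE__, __LINE__)

/* Drop the record (if any) for p, then release the memory. */
static void tracked_free(void *p)
{
    size_t i;

    for (i = 0; i < g_tracked_count; i++) {
        if (g_tracked[i].ptr == p) {
            /* Overwrite the freed slot with the last record. */
            g_tracked[i] = g_tracked[--g_tracked_count];
            break;
        }
    }
    free(p);
}

/* Print every outstanding allocation and return how many there are. */
static size_t tracked_report(FILE *out)
{
    size_t i;

    for (i = 0; i < g_tracked_count; i++) {
        fprintf(out, "leak: %lu bytes at %p, allocated at %s:%d\n",
                (unsigned long)g_tracked[i].size, g_tracked[i].ptr,
                g_tracked[i].file, g_tracked[i].line);
    }
    return g_tracked_count;
}
```

Calling tracked_report(stderr) just before unload would then list each leaked
block with the file and line that requested it.  Note that this only helps the
tracker's own report; a tool that records the return address of the system
allocation call would still see the wrapper as the caller.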

> >> 2)     I have moved the place in which the adapter is being created. It
> >> is better to create it when the number of adapters in the list goes
> >> from 0 to 1. This covers the case in which someone is holding the
> >> device while the IPoIB ports are disabled and re-enabled again.
> >
> >You mean when you register the device with NDIS for IOCTL access, or the
> >IPoIB adapter object?  Can't we register the device from DriverEntry, and
> >deregister it from the unload handler?  The docs for the calls seem to
> >indicate that's the common usage model.  I don't think the system will
> >invoke the unload handler until all file handles have been closed.  This
> >would simplify things a great deal, wouldn't it?
>
> From what I saw, DriverUnload will not be called until we close the
> device, so although I agree that this is simpler, it doesn't work. (You can
> play with it of course, maybe you will find something.)

Ok, I'll play with this and let you know if I can get it to work.
 
> >> 3)     Please add the case of IRP_MJ_INTERNAL_DEVICE_CONTROL to the
> >> function __ipoib_dispatch. It should be handled just like
> >> IRP_MJ_DEVICE_CONTROL.
> >
> >Why don't kernel drivers use IRP_MJ_DEVICE_CONTROL?  Why use
> >IRP_MJ_INTERNAL_DEVICE_CONTROL?
>
> I'm not sure why IRP_MJ_DEVICE_CONTROL doesn't work for drivers (I believe
> that it should). I'm still trying to understand if I should use
> IoGetDeviceObjectPointer to open the device or ZwOpenFile. This might be the
> cause of the problem (I'm currently using IoGetDeviceObjectPointer).

I think IoGetDeviceObjectPointer is the right one.  Are you building the IRP
yourself, or using IoBuildDeviceIoControlRequest?  I don't think ZwOpenFile will
work, as I haven't seen any docs to describe how to send IOCTL requests to a
file in the kernel.  I think you have to use IoCallDriver on a device object,
which ZwOpenFile doesn't give you.

- Fab



