[Openib-windows] RE: Some more additions questions about the patch:

Tzachi Dar tzachid at mellanox.co.il
Mon Sep 19 13:24:01 PDT 2005



>-----Original Message-----
>From: Fab Tillier [mailto:ftillier at silverstorm.com]
>Sent: Monday, September 19, 2005 8:01 PM
>To: 'Tzachi Dar'; openib-windows at openib.org
>Subject: RE: Some more additions questions about the patch:
>
>> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
>> Sent: Monday, September 19, 2005 12:58 AM
>>
>> >-----Original Message-----
>> >From: Fab Tillier [mailto:ftillier at silverstorm.com]
>> >Sent: Monday, September 19, 2005 7:51 AM
>> >To: 'Tzachi Dar'; openib-windows at openib.org
>> >Subject: RE: Some more additions questions about the patch:
>> >
>> >For the kernel, I'm considering going a step further and eliminating the
>> >memory tracker - driver verifier has memory leak detection built in, and
>> >we might as well just use that instead.  The memory leak detection
>> >provided by driver verifier doesn't require any sort of recompile of the
>> >drivers, which is an extra benefit.
>>
>> Does the memory leak detection of the driver verifier also tell where the leak
>> was? If not, then there is an advantage to our mechanism; if yes, then our
>> mechanism is not really needed.
>
>It does, but not immediately.  When the driver unloads, verifier will issue a
>bugcheck with bugcheck code 0xC4.  The first parameter is a subcode which tells
>you the reason for the bugcheck.  Reasons 0x60 and 0x62 tell you a driver
>was unloaded without freeing all its memory.  The DDK docs describe how to get
>the detailed memory information.  Here's what it looks like (when putting in
>place the patch to use the unload handler to call CL_DEINIT, I left in two calls
>to CL_INIT, so the memory tracker was never freed):
>
>*******************************************************************************
>*                                                                             *
>*                        Bugcheck Analysis                                    *
>*                                                                             *
>*******************************************************************************
>
>Use !analyze -v to get detailed debugging information.
>
>BugCheck C4, {62, 8231de18, 8231ddc0, 20}
>
>Probably caused by : ipoib.sys
>
>Followup: MachineOwner
>---------
>
>nt!RtlpBreakWithStatusInstruction:
>80871648 cc               int     3
>0: kd> dp ViBadDriver L1; dS @$p
>808ab138  8231ddd0
>8231de18  "ipoib.sys"
>0: kd> !verifier 3 ipoib.sys
>
>Verify Level fb ... enabled options are:
>	special pool
>	special irql
>	all pool allocations checked on unload
>	Io subsystem checking enabled
>	Deadlock detection enabled
>	Enhanced Io checking enabled
>	DMA checking enabled
>
>Summary of All Verifier Statistics
>
>RaiseIrqls                             0x0
>AcquireSpinLocks                       0x955
>Synch Executions                       0x0
>Trims                                  0x39
>
>Pool Allocations Attempted             0x34
>Pool Allocations Succeeded             0x34
>Pool Allocations Succeeded SpecialPool 0x34
>Pool Allocations With NO TAG           0x0
>Pool Allocations Failed                0x0
>Resource Allocations Failed Deliberately   0x0
>
>Current paged pool allocations         0x0 for 00000000 bytes
>Peak paged pool allocations            0x0 for 00000000 bytes
>Current nonpaged pool allocations      0x20 for 00000E40 bytes
>Peak nonpaged pool allocations         0x30 for 0044B300 bytes
>
>Driver Verification List
>
>Entry     State           NonPagedPool   PagedPool   Module
>
>8231ddc0 Loaded           00000e40       00000000    ipoib.sys
>
>Current Pool Allocations  00000000    00000020
>Current Pool Bytes        00000000    00000e40
>Peak Pool Allocations     00000000    00000030
>Peak Pool Bytes           00000000    0044b300
>
>PoolAddress  SizeInBytes    Tag       CallersAddress
>82638f80     0x00000080     Ddk       b84ff776
>82df4f80     0x00000080     Ddk       b84ff776
>82612f80     0x00000080     Ddk       b84ff776
>82e20f80     0x00000080     Ddk       b84ff776
>82de2f80     0x00000080     Ddk       b84ff776
>828f2f80     0x00000080     Ddk       b84ff776
>82736f80     0x00000080     Ddk       b84ff776
>82902f80     0x00000080     Ddk       b84ff776
>827e4f80     0x00000080     Ddk       b84ff776
>830e2f80     0x00000080     Ddk       b84ff776
>82664f80     0x00000080     Ddk       b84ff776
>82880f80     0x00000080     Ddk       b84ff776
>8250af80     0x00000080     Ddk       b84ff776
>82474f80     0x00000080     Ddk       b84ff776
>82642f80     0x00000080     Ddk       b84ff776
>831eef80     0x00000080     Ddk       b84ff776
>831ecff8     0x00000008     Ddk       b84ff776
>831eaf80     0x00000080     Ddk       b84ff776
>831e8fe0     0x00000020     Ddk       b84ff776
>831e6f80     0x00000080     Ddk       b84ff776
>831e4fe0     0x00000020     Ddk       b84ff776
>831e2f80     0x00000080     Ddk       b84ff776
>831e0fe0     0x00000020     Ddk       b84ff776
>831dcf80     0x00000080     Ddk       b84ff776
>831dafe0     0x00000020     Ddk       b84ff776
>831d8f80     0x00000080     Ddk       b84ff776
>831d6fa0     0x0000005c     Ddk       b84ff776
>831d4f80     0x00000080     Ddk       b84ff776
>831d2ff8     0x00000008     Ddk       b84ff776
>831d0f80     0x00000080     Ddk       b84ff776
>831ceec0     0x0000013c     Ddk       b84ff776
>831ccf68     0x00000098     Ddk       b84ff776
>
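As a sanity check on the listing above, the per-allocation sizes should add up to the totals that verifier reports for ipoib.sys (0x20 current nonpaged allocations for 0xe40 bytes). The small user-mode C sketch below does that arithmetic. The sizes are copied by hand from the SizeInBytes column; the names leaked_sizes, leaked_count and total_leaked_bytes are made up for this illustration and are not part of any debugger or DDK interface.

```c
#include <assert.h>
#include <stddef.h>

/* SizeInBytes column of the "!verifier 3 ipoib.sys" listing, in listing
 * order.  Copied by hand from the debugger output; illustration only. */
const unsigned long leaked_sizes[] = {
    0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
    0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
    0x08, 0x80, 0x20, 0x80, 0x20, 0x80, 0x20, 0x80,
    0x20, 0x80, 0x5c, 0x80, 0x08, 0x80, 0x13c, 0x98
};

/* Number of rows in the listing. */
const size_t leaked_count = sizeof(leaked_sizes) / sizeof(leaked_sizes[0]);

/* Add up the sizes of all outstanding allocations. */
unsigned long total_leaked_bytes(const unsigned long *sizes, size_t count)
{
    unsigned long total = 0;
    size_t i;

    assert(sizes != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += sizes[i];
    return total;
}
```

With the values above, leaked_count comes out to 0x20 and total_leaked_bytes(leaked_sizes, leaked_count) to 0xe40, matching the "Current Pool Allocations" and "Current Pool Bytes" lines. The 0x20 also matches the last bugcheck parameter in "BugCheck C4, {62, 8231de18, 8231ddc0, 20}", which, if I read the 0xC4 documentation correctly, is the number of allocations that were not freed.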
>Using the address, you can show the disassembly using the 'u' command:
>
>0: kd> u b84ff776
>ipoib!__cl_malloc_priv+0x96 [c:\dev\openib\openib\base\core\complib\kernel\cl_memory_osd.c @ 51]:
>
>This actually highlights an issue in complib's memory tracker - it obscures the
>actual caller by providing a wrapper to the system allocation calls, so all
>allocations point to the internal complib allocator, rather than the caller.
OK, so I guess this can be safely removed.
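
To illustrate the wrapper problem described above: when every allocation goes through one wrapper function, the "caller" that a tool records is always the wrapper itself. One common way around that is to capture the call site in a macro and pass it down. The sketch below is plain user-mode C and is not complib's code; all names (tracked_malloc_at, TRACKED_MALLOC, tracked_line, and so on) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every tracked allocation.  The union with
 * max_align_t keeps the user pointer suitably aligned (C11). */
typedef union tracked_header {
    struct {
        const char *file;   /* source file of the real caller */
        int         line;   /* source line of the real caller */
        size_t      size;   /* size requested by the caller   */
    } info;
    max_align_t align;
} tracked_header;

/* The wrapper receives the call site explicitly instead of letting the
 * allocator see only the wrapper's own address. */
void *tracked_malloc_at(size_t size, const char *file, int line)
{
    tracked_header *h = malloc(sizeof(*h) + size);

    if (h == NULL)
        return NULL;
    h->info.file = file;
    h->info.line = line;
    h->info.size = size;
    return h + 1;           /* user memory starts after the header */
}

/* Expands at the call site, so __FILE__/__LINE__ name the real caller. */
#define TRACKED_MALLOC(size) tracked_malloc_at((size), __FILE__, __LINE__)

/* Accessors that step back from the user pointer to the header. */
const char *tracked_file(const void *p)
{
    assert(p != NULL);
    return ((const tracked_header *)p - 1)->info.file;
}

int tracked_line(const void *p)
{
    assert(p != NULL);
    return ((const tracked_header *)p - 1)->info.line;
}

size_t tracked_size(const void *p)
{
    assert(p != NULL);
    return ((const tracked_header *)p - 1)->info.size;
}

void tracked_free(void *p)
{
    if (p != NULL)
        free((tracked_header *)p - 1);
}
```

A caller would write p = TRACKED_MALLOC(32) and later query tracked_file(p) and tracked_line(p) to see where the block came from. This only helps tools that read the tracker's own records; the CallersAddress that driver verifier reports would still be inside the wrapper, which is the argument for dropping the wrapper and relying on verifier alone.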
>
>> >> 2)     I have moved the place in which the adapter is being created. It
>> >> is better to create it when the number of adapters in the list goes
>> >> from 0 to 1. This handles the case in which someone is holding the
>> >> device while the IPOIB ports are disabled and re-enabled again.
>> >
>> >You mean when you register the device with NDIS for IOCTL access, or the
>> >IPoIB adapter object?  Can't we register the device from DriverEntry, and
>> >deregister it from the unload handler?  The docs for the calls seem to
>> >indicate that's the common usage model.  I don't think the system will
>> >invoke the unload handler until all file handles have been closed.  This
>> >would simplify things a great deal, wouldn't it?
>>
>> From what I saw, DriverUnload will not be called until we close the
>> device, so although I agree that this is simpler, it doesn't work. (You can
>> play with it of course; maybe you will find something.)
>
>Ok, I'll play with this and let you know if I can get it to work.
Good luck.
>
>> >> 3)     Please add the case of IRP_MJ_INTERNAL_DEVICE_CONTROL to the
>> >> function __ipoib_dispatch. It should be handled just like
>> >> IRP_MJ_DEVICE_CONTROL.
>> >
>> >Why don't kernel drivers use IRP_MJ_DEVICE_CONTROL?  Why use
>> >IRP_MJ_INTERNAL_DEVICE_CONTROL?
>>
>> I'm not sure why IRP_MJ_DEVICE_CONTROL doesn't work for drivers (I believe
>> that it should). I'm still trying to understand if I should use
>> IoGetDeviceObjectPointer to open the device or ZwOpenFile. This might be the
>> cause of the problem (I'm currently using IoGetDeviceObjectPointer).
>
>I think IoGetDeviceObjectPointer is the right one.  Are you building the IRP
>yourself, or using IoBuildDeviceIoControlRequest?  I don't think ZwOpenFile will
>work, as I haven't seen any docs to describe how to send IOCTL requests to a
>file in the kernel.  I think you have to use IoCallDriver on a device object,
>which ZwOpenFile doesn't give you.
>
I have done some more searching and found an example in the DDK
that says you should use ZwOpenFile.
Look in the file WINDDK\3790.1830\src\network\ndis\ndiswdm\init.c
It also shows how to get the device object in order to send IOCTLs.
So it seems to me that this problem is solved, although I still have to find out
how to close the device when IPOIB is shutting down. I have some ideas
and I'll try them tomorrow, if I have time.

>- Fab