[Openib-windows] RE: Some more additions questions about the patch:
Tzachi Dar
tzachid at mellanox.co.il
Mon Sep 19 13:24:01 PDT 2005
>-----Original Message-----
>From: Fab Tillier [mailto:ftillier at silverstorm.com]
>Sent: Monday, September 19, 2005 8:01 PM
>To: 'Tzachi Dar'; openib-windows at openib.org
>Subject: RE: Some more additions questions about the patch:
>
>> From: Tzachi Dar [mailto:tzachid at mellanox.co.il]
>> Sent: Monday, September 19, 2005 12:58 AM
>>
>> >-----Original Message-----
>> >From: Fab Tillier [mailto:ftillier at silverstorm.com]
>> >Sent: Monday, September 19, 2005 7:51 AM
>> >To: 'Tzachi Dar'; openib-windows at openib.org
>> >Subject: RE: Some more additions questions about the patch:
>> >
>> >For the kernel, I'm considering going a step further and eliminating the
>> >memory tracker - driver verifier has memory leak detection built in, and
>> >we might as well just use that instead. The memory leak detection
>> >provided by driver verifier doesn't require any sort of recompile of the
>> >drivers, which is an extra benefit.
>>
>> Does the memory leak detection of the driver verifier also tell where the
>> leak was? If not, then there is an advantage to our mechanism; if yes, then
>> our mechanism is not really needed.
>
>It does, but not immediately. When the driver unloads, verifier will issue a
>bugcheck with bugcheck code 0xC4. The first parameter is a subcode which tells
>you the reason for the bugcheck. Reasons 0x60 and 0x62 tell you a driver
>was unloaded without freeing all its memory. The DDK docs describe how to get
>the detailed memory information. Here's what it looks like (when putting in
>place the patch to use the unload handler to call CL_DEINIT, I left in two
>calls to CL_INIT, so the memory tracker was never freed):
>
>*******************************************************************************
>*                                                                             *
>*                        Bugcheck Analysis                                    *
>*                                                                             *
>*******************************************************************************
>
>Use !analyze -v to get detailed debugging information.
>
>BugCheck C4, {62, 8231de18, 8231ddc0, 20}
>
>Probably caused by : ipoib.sys
>
>Followup: MachineOwner
>---------
>
>nt!RtlpBreakWithStatusInstruction:
>80871648 cc int 3
>0: kd> dp ViBadDriver L1; dS @$p
>808ab138 8231ddd0
>8231de18 "ipoib.sys"
>0: kd> !verifier 3 ipoib.sys
>
>Verify Level fb ... enabled options are:
> special pool
> special irql
> all pool allocations checked on unload
> Io subsystem checking enabled
> Deadlock detection enabled
> Enhanced Io checking enabled
> DMA checking enabled
>
>Summary of All Verifier Statistics
>
>RaiseIrqls 0x0
>AcquireSpinLocks 0x955
>Synch Executions 0x0
>Trims 0x39
>
>Pool Allocations Attempted 0x34
>Pool Allocations Succeeded 0x34
>Pool Allocations Succeeded SpecialPool 0x34
>Pool Allocations With NO TAG 0x0
>Pool Allocations Failed 0x0
>Resource Allocations Failed Deliberately 0x0
>
>Current paged pool allocations 0x0 for 00000000 bytes
>Peak paged pool allocations 0x0 for 00000000 bytes
>Current nonpaged pool allocations 0x20 for 00000E40 bytes
>Peak nonpaged pool allocations 0x30 for 0044B300 bytes
>
>Driver Verification List
>
>Entry State NonPagedPool PagedPool Module
>
>8231ddc0 Loaded 00000e40 00000000 ipoib.sys
>
>Current Pool Allocations 00000000 00000020
>Current Pool Bytes 00000000 00000e40
>Peak Pool Allocations 00000000 00000030
>Peak Pool Bytes 00000000 0044b300
>
>PoolAddress SizeInBytes Tag CallersAddress
>82638f80 0x00000080 Ddk b84ff776
>82df4f80 0x00000080 Ddk b84ff776
>82612f80 0x00000080 Ddk b84ff776
>82e20f80 0x00000080 Ddk b84ff776
>82de2f80 0x00000080 Ddk b84ff776
>828f2f80 0x00000080 Ddk b84ff776
>82736f80 0x00000080 Ddk b84ff776
>82902f80 0x00000080 Ddk b84ff776
>827e4f80 0x00000080 Ddk b84ff776
>830e2f80 0x00000080 Ddk b84ff776
>82664f80 0x00000080 Ddk b84ff776
>82880f80 0x00000080 Ddk b84ff776
>8250af80 0x00000080 Ddk b84ff776
>82474f80 0x00000080 Ddk b84ff776
>82642f80 0x00000080 Ddk b84ff776
>831eef80 0x00000080 Ddk b84ff776
>831ecff8 0x00000008 Ddk b84ff776
>831eaf80 0x00000080 Ddk b84ff776
>831e8fe0 0x00000020 Ddk b84ff776
>831e6f80 0x00000080 Ddk b84ff776
>831e4fe0 0x00000020 Ddk b84ff776
>831e2f80 0x00000080 Ddk b84ff776
>831e0fe0 0x00000020 Ddk b84ff776
>831dcf80 0x00000080 Ddk b84ff776
>831dafe0 0x00000020 Ddk b84ff776
>831d8f80 0x00000080 Ddk b84ff776
>831d6fa0 0x0000005c Ddk b84ff776
>831d4f80 0x00000080 Ddk b84ff776
>831d2ff8 0x00000008 Ddk b84ff776
>831d0f80 0x00000080 Ddk b84ff776
>831ceec0 0x0000013c Ddk b84ff776
>831ccf68 0x00000098 Ddk b84ff776
>
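As a cross-check on how to read the dump above: in this output the fourth bugcheck parameter (0x20) equals the "Current nonpaged pool allocations" count, and the per-allocation sizes add up to the 0xe40 bytes reported for ipoib.sys. A minimal user-mode sketch that verifies this, with the sizes copied from the table; the names leaked_sizes, leak_count, leak_bytes and check_against_summary are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SizeInBytes column of the leaked-allocation table, in listing order. */
static const uint32_t leaked_sizes[] = {
    0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
    0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
    0x08, 0x80, 0x20, 0x80, 0x20, 0x80, 0x20, 0x80,
    0x20, 0x80, 0x5c, 0x80, 0x08, 0x80, 0x13c, 0x98,
};

/* Number of allocations still outstanding at unload. */
static size_t leak_count(void)
{
    return sizeof leaked_sizes / sizeof leaked_sizes[0];
}

/* Total bytes still outstanding at unload. */
static uint32_t leak_bytes(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < leak_count(); i++)
        total += leaked_sizes[i];
    return total;
}

/* Compare the table against the summary lines of the same dump. */
static void check_against_summary(void)
{
    assert(leak_count() == 0x20);   /* BugCheck C4, {62, ..., 20} */
    assert(leak_bytes() == 0xe40);  /* NonPagedPool column for ipoib.sys */
}
```

Calling check_against_summary() from a small test program passes for this dump, which suggests the table lists every allocation that the summary counts.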
>Using the address, you can show the disassembly using the 'u' command:
>
>0: kd> u b84ff776
>ipoib!__cl_malloc_priv+0x96
>[c:\dev\openib\openib\base\core\complib\kernel\cl_memory_osd.c @ 51]:
>
>This actually highlights an issue in complib's memory tracker - it obscures
>the actual caller by providing a wrapper to the system allocation calls, so all
>allocations point to the internal complib allocator, rather than the caller.
OK, so I guess this can be safely removed.
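
For illustration, the usual way for a wrapper-based tracker to avoid hiding the caller is to capture the call site in a macro and pass it down to the wrapper. The following is a user-mode sketch of that idea only, not complib's actual code: trk_hdr, trk_malloc_at, TRK_MALLOC, trk_free and trk_report are hypothetical names, and malloc/free stand in for the kernel pool calls.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tracking header stored in front of every allocation. */
struct trk_hdr {
    const char     *file;   /* call site, captured by the macro */
    int             line;
    size_t          size;
    struct trk_hdr *next;
};

static struct trk_hdr *trk_list;    /* outstanding allocations */

/* The wrapper receives the call site instead of recording its own address. */
static void *trk_malloc_at(size_t size, const char *file, int line)
{
    struct trk_hdr *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->file = file;
    h->line = line;
    h->size = size;
    h->next = trk_list;
    trk_list = h;
    return h + 1;           /* user memory starts after the header */
}

/* __FILE__ and __LINE__ expand where the macro is used, i.e. in the caller. */
#define TRK_MALLOC(size) trk_malloc_at((size), __FILE__, __LINE__)

static void trk_free(void *p)
{
    if (p == NULL)
        return;
    struct trk_hdr *h = (struct trk_hdr *)p - 1;
    struct trk_hdr **pp = &trk_list;
    while (*pp != NULL && *pp != h)
        pp = &(*pp)->next;
    assert(*pp != NULL);    /* freeing an untracked pointer is a bug */
    *pp = h->next;
    free(h);
}

/* Print every outstanding allocation with its call site; return the count. */
static size_t trk_report(FILE *out)
{
    size_t n = 0;
    for (struct trk_hdr *h = trk_list; h != NULL; h = h->next, n++)
        fprintf(out, "leak: %zu bytes allocated at %s:%d\n",
                h->size, h->file, h->line);
    return n;
}
```

With this shape, a leak report names the file and line of the code that asked for the memory. It does not change what driver verifier sees: verifier would still record the wrapper's address as CallersAddress.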
>
>> >> 2) I have moved the place in which the adapter is being created. It
>> >> is better to create it when the number of adapters in the list goes
>> >> from 0 to 1. This covers the case in which someone is holding the
>> >> device while the IPoIB ports are disabled and then re-enabled.
>> >
>> >You mean when you register the device with NDIS for IOCTL access, or the
>> >IPoIB adapter object? Can't we register the device from DriverEntry, and
>> >deregister it from the unload handler? The docs for the calls seem to
>> >indicate that's the common usage model. I don't think the system will
>> >invoke the unload handler until all file handles have been closed. This
>> >would simplify things a great deal, wouldn't it?
>>
>> From what I saw, DriverUnload will not be called until we close the
>> device, so although I agree that this is simpler, it doesn't work. (You can
>> play with it of course; maybe you will find something.)
>
>Ok, I'll play with this and let you know if I can get it to work.
Good luck.
>
>> >> 3) Please add the case of IRP_MJ_INTERNAL_DEVICE_CONTROL to the
>> >> function __ipoib_dispatch. It should be handled just like
>> >> IRP_MJ_DEVICE_CONTROL.
>> >
>> >Why don't kernel drivers use IRP_MJ_DEVICE_CONTROL? Why use
>> >IRP_MJ_INTERNAL_DEVICE_CONTROL?
>>
>> I'm not sure why IRP_MJ_DEVICE_CONTROL doesn't work for drivers (I believe
>> that it should). I'm still trying to understand if I should use
>> IoGetDeviceObjectPointer to open the device or ZwOpenFile. This might be the
>> cause of the problem (I'm currently using IoGetDeviceObjectPointer).
>
>I think IoGetDeviceObjectPointer is the right one. Are you building the IRP
>yourself, or using IoBuildDeviceIoControlRequest? I don't think ZwOpenFile will
>work, as I haven't seen any docs to describe how to send IOCTL requests to a
>file in the kernel. I think you have to use IoCallDriver on a device object,
>which ZwOpenFile doesn't give you.
>
I did some more searching and found an example in the DDK that says you
should use ZwOpenFile.
Look in the file WINDDK\3790.1830\src\network\ndis\ndiswdm\init.c
It also shows how to get the device object in order to send IOCTLs.
So it seems to me that this problem is solved, although I still have to find
out how to close the device when IPoIB is going down. I have some ideas
and I'll try them tomorrow, if I have time.
>- Fab