[ofw] [RFC] [PATCH 1/2] work_queue: new abstraction for queuing work

Thu May 21 11:47:51 PDT 2009

>> While IO_WORKITEMs can't be embedded in structures pre-Vista, you can
>> allocate a work item and store the pointer to it pre-Vista.
> 
> Long story short, I think a WORK_ENTRY type structure is still needed.
> 
> The IO_WORKITEM is associated with a WDFREQUEST.  A structure is still
> needed to wrap around the WDFREQUEST and the IO_WORKITEM, because the
> WorkItem() callback only provides a single context pre-Vista.  (Any work
> item 'Ex' functions are essentially unusable for our purposes, as they
> require Vista.)

What the ND proxy does is store the additional context values in the IRP, and then use the IRP as the context for the call.  For the Disconnect case, what you want in the work item callback is the EP and QP handles so you can do a reference check.  Those should already be part of the IRP parameters.  Since you can't unqueue a work item, there's no need to store a reference to it in the IRP.  The only thing you need to do is prevent IRP cancellation while the work item is queued, since the callback will reference the IRP when it runs.  However, the time it takes from queuing the work item to running it should be short.
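
A minimal user-mode sketch of that idea, assuming hypothetical names throughout (`request`, `disconnect_params`, `queue_work` and `disconnect_work` are illustrative stand-ins, not WDK or WinVerbs types).  It only shows how a callback that receives a single context pointer can still reach both handles when the request itself is the context:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the disconnect IOCTL input buffer. */
typedef struct disconnect_params {
    unsigned long long ep_id;   /* endpoint handle supplied by the caller */
    unsigned long long qp_id;   /* queue pair handle supplied by the caller */
} disconnect_params;

/* Hypothetical stand-in for an IRP or WDFREQUEST. */
typedef struct request {
    disconnect_params params;
    int status;                 /* -1 while pending, 0 once processed */
} request;

/* Work routine with a single opaque context pointer. */
typedef void (*work_routine)(void *context);

/* Handles observed by the work routine, kept only for illustration. */
static unsigned long long seen_ep, seen_qp;

static void disconnect_work(void *context)
{
    request *req = context;     /* the request is the only context */

    seen_ep = req->params.ep_id;
    seen_qp = req->params.qp_id;
    req->status = 0;
}

/* Models queuing a work item; in this sketch the routine runs inline. */
static void queue_work(work_routine fn, void *context)
{
    fn(context);
}
```

No separate wrapper structure is allocated here; everything the routine needs travels inside the request.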

> The processing of the WDFREQUEST in the system thread provides the
> necessary synchronization to handle object destruction and device
> removal.  This involves acquiring and releasing the WV_ENDPOINT or
> WV_QUEUE_PAIR, which must be done at passive level.  The same checks
> cannot be done from the CM callback at dispatch without completely
> reworking all of the locking.  And because of device removal handling,
> I'm not even sure that's doable.

From the CM callback, you know that the EP is still valid and you can check to see if you have a disconnect request queued.  If so, you can remove it from the queue (so it can't be cancelled), and queue the work item using the request as context.  From the work item callback, reference the EP and QP (since they could have been destroyed by now) based on the IOCTL parameters (just like you did when you first received the IRP), and assuming they're valid, move the QP.  You might not actually need to reference the EP; the QP alone might do.
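
The flow above can be modeled in plain user-mode C.  This is only a sketch under stated assumptions: `endpoint`, `queue_pair`, `qp_table`, `qp_lookup`, `cm_callback` and `disconnect_work` are hypothetical names, there is no locking, and the lookup table stands in for whatever handle validation the driver already does when it first receives the IRP:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum qp_state { QP_RTS, QP_ERROR };

typedef struct queue_pair {
    unsigned id;
    bool valid;                  /* false once the QP has been destroyed */
    enum qp_state state;
} queue_pair;

typedef struct request {
    unsigned qp_id;              /* taken from the IOCTL parameters */
    bool cancelable;             /* true only while queued on the EP */
    int status;                  /* -1 pending, 0 done, 1 QP was gone */
} request;

typedef struct endpoint {
    request *pending_disconnect; /* NULL when nothing is queued */
} endpoint;

#define MAX_QP 4
static queue_pair qp_table[MAX_QP];

/* Looks the QP up again by id, as if the request had just arrived. */
static queue_pair *qp_lookup(unsigned id)
{
    for (size_t i = 0; i < MAX_QP; i++)
        if (qp_table[i].valid && qp_table[i].id == id)
            return &qp_table[i];
    return NULL;
}

/* CM callback: dequeue the request so it can no longer be cancelled.
 * The caller would queue a work item with the result as its context. */
static request *cm_callback(endpoint *ep)
{
    request *req = ep->pending_disconnect;

    if (req == NULL)
        return NULL;             /* already cancelled or never queued */
    ep->pending_disconnect = NULL;
    req->cancelable = false;
    return req;
}

/* Work item callback: revalidate the QP, then move it to error. */
static void disconnect_work(void *context)
{
    request *req = context;
    queue_pair *qp = qp_lookup(req->qp_id);

    if (qp != NULL) {
        qp->state = QP_ERROR;
        req->status = 0;
    } else {
        req->status = 1;         /* QP destroyed in the meantime */
    }
}
```

The point of the model is that nothing holds a reference between `cm_callback` and `disconnect_work`; the second lookup is what copes with a QP that disappeared in between.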

> By the time the CM callback is invoked, the EP and/or QP could be in the
> process of being destroyed.  I want to allow this because of how long it
> could take before the CM callback fires in the case of a DREQ timeout,
> which could be on the order of a couple of minutes.

That shouldn't be a problem to do if you use the IRP as the context for the work item callback.

>> Just store the work items (by reference) in the objects that need
>> them.
> 
> This requires taking the reference on the object during the Disconnect
> down call.  Any attempt to destroy the endpoint or handle a device
> removal would need to delay until a DREP was received or the DREQ timed
> out.  I don't like that the disconnect processing ends up taking a
> reference on the EP asynchronously at all, since it can result in EP
> destroy failing.

No, you could have a pool of work items that you would pull from when you get the CM callback and need to queue one.  If the pool is empty, you could allocate new work items dynamically, since that is a valid operation at DISPATCH_LEVEL.
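
As a rough illustration of such a pool, here is a user-mode sketch in which `malloc` stands in for the real work item allocation.  All names (`work_item`, `work_pool`, `pool_get`, and so on) are hypothetical, and the locking a driver would need around the free list (a spin lock or an interlocked list) is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct work_item {
    struct work_item *next;     /* free-list link */
    void *context;              /* set by the caller before queuing */
} work_item;

typedef struct work_pool {
    work_item *free_list;       /* idle items, most recently returned first */
    size_t allocated;           /* total items owned by this pool */
} work_pool;

/* Returns every idle item to the allocator. */
static void pool_destroy(work_pool *p)
{
    while (p->free_list != NULL) {
        work_item *w = p->free_list;
        p->free_list = w->next;
        free(w);
        p->allocated--;
    }
}

/* Preallocates count items; on failure the pool is left empty. */
static int pool_init(work_pool *p, size_t count)
{
    p->free_list = NULL;
    p->allocated = 0;
    for (size_t i = 0; i < count; i++) {
        work_item *w = malloc(sizeof *w);
        if (w == NULL) {
            pool_destroy(p);
            return -1;
        }
        w->next = p->free_list;
        p->free_list = w;
        p->allocated++;
    }
    return 0;
}

/* Takes an idle item, or allocates a new one if the pool is empty. */
static work_item *pool_get(work_pool *p)
{
    work_item *w = p->free_list;

    if (w != NULL) {
        p->free_list = w->next;
    } else {
        w = malloc(sizeof *w);  /* dynamic fallback */
        if (w == NULL)
            return NULL;
        p->allocated++;
    }
    w->next = NULL;
    w->context = NULL;
    return w;
}

/* Returns an item to the pool once its callback has finished. */
static void pool_put(work_pool *p, work_item *w)
{
    w->next = p->free_list;
    p->free_list = w;
}
```

Items that were allocated on the fallback path are simply kept when they come back, so the pool grows to match the peak demand it has seen.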

No need to hold the reference on the EP, as long as you keep the IRP queued in the EP.  If the EP gets destroyed, the IRP will be cancelled.  If the CM callback checks for the presence of the IRP and dequeues it, the IRP can no longer be cancelled.  Set the IRP as context for the callback, queue the work item, and let things go on.  No extra references needed, and you can take new references once the work item callback runs.

> I will see if I can extend the EP connect state information to simplify
> things, while still avoiding holding a reference on the EP while the
> disconnect is in progress.

All you need to know is that when you receive a DREQ or DREP, you check for the disconnect IRP.  If you find it, you queue the work item to move the QP to error.  Am I missing something?

>> The IO_WORKITEM documentation warns against infinite loops in a work
>> item handler.  See the section "System Worker Threads" in the WDK docs,
>> the last paragraph: "If one of these routines runs for too long (if it
>> contains an indefinite loop, for example), the system can deadlock."
> 
> From the WDK: KMDF documentation "Using Framework Work Items":
> 
> 	Creating and Deleting Work Items
> 	Drivers can use one of the following two techniques to
> 	create and delete work items:
> 
> 	* Use each work item once: create the work item when you need it
> 	  and delete it immediately after it is used.  This technique is
> 	  useful for drivers that require infrequent use (less often than
> 	  once per minute) of a small number of work items...
> 
> 	* Create one or more work items that your driver requeues as
> 	  necessary. This technique is useful for drivers that use work
> 	  items frequently (more often than once per minute), or if your
> 	  driver's EvtInterruptDpc callback function cannot easily handle
> 	  a STATUS_INSUFFICIENT_RESOURCES return value from
> 	  WdfWorkItemCreate.
> 
> See WdfWorkItemCreate as a starting point and follow the links from
> that.  I'm guessing the Wdf* are simply wrappers around the IO work
> items calls. I didn't use the Wdf* calls, because the work item ones
> looked simpler and more efficient to me.

The Wdf* calls make extra checks and have more overhead than calling IoAllocateWorkItem directly.  Nothing prevents you from keeping a pool of work items, though, without layering your own abstraction over them.

-Fab