[ofw] CM ref counting issues...

Sean Hefty sean.hefty at intel.com
Thu Dec 10 10:34:25 PST 2009


These pseudo-traces appear to report a deadlock condition.
This also appears to be a separate problem from the IB CM hang.

cm callback (already done, but listed for details):
winverbs!WorkQueueInsert (called after WdfIoQueueRetrieveNextRequest returns, not from within it)
winverbs!WdfIoQueueRetrieveNextRequest
winverbs!WvEpCompleteDisconnect
winverbs!WvEpIbCmHandler

wq thread:
winverbs!WvProviderDisableRemove+0x4a
winverbs!WvQpAcquire+0x2a
winverbs!WvEpDisconnectQp+0x31
winverbs!WvEpDisconnectHandler+0xcd
winverbs!WorkQueueHandler+0x4b
nt!IopProcessWorkItem+0x27

cleanup thread:
winverbs!WdfIoQueuePurgeSynchronously
winverbs!WvEpFree+0x74
winverbs!WvProviderCleanup+0x76
Wdf01000!FxPkgGeneral::OnCleanup+0x82
Wdf01000!FxPkgGeneral::Dispatch+0x1ce
Wdf01000!FxDevice::Dispatch+0xa9
nt!IopCloseFile+0x184

The cleanup thread holds an exclusive 'lock' for the duration of WvProviderCleanup.
This prevents the wq thread from acquiring it for shared access in
WvProviderDisableRemove.  It appears that even though the cm callback removed the
WdfRequest from the endpoint queue, the endpoint queue is still aware of it.  The
cleanup thread blocks in WdfIoQueuePurgeSynchronously, and my guess is that it's
waiting for the missing WdfRequest to complete, which it can't because the cleanup
thread is blocking the wq thread from making progress.
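
To make the suspected cycle explicit, here is a minimal sketch in plain C that
models the three parties as a wait-for graph and checks whether a chain of
waits leads back to its starting point.  Everything in it (the enum, the two
arrays, in_wait_cycle) is hypothetical and written only for illustration; none
of it is winverbs or KMDF code, and the edges simply encode the guess described
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model only: these are not winverbs or KMDF types. */
enum actor { CLEANUP_THREAD, WQ_THREAD, PENDING_REQUEST, ACTOR_COUNT };
#define NOBODY (-1)

/* graph[a] names the party that must make progress before a can continue. */
static const int reported_graph[ACTOR_COUNT] = {
    PENDING_REQUEST, /* cleanup thread: blocked in WdfIoQueuePurgeSynchronously */
    CLEANUP_THREAD,  /* wq thread: needs shared access that is held exclusively */
    WQ_THREAD,       /* request: completed only once the wq thread can run */
};

/* Same graph, assuming the cleanup thread no longer waits on the request. */
static const int manual_purge_graph[ACTOR_COUNT] = {
    NOBODY,          /* cleanup thread: does not wait for anything */
    CLEANUP_THREAD,  /* wq thread: still waits for the 'lock' to be released */
    WQ_THREAD,       /* request: still completed by the wq thread */
};

/* Follow the chain of waits from start; true if it leads back to start. */
bool in_wait_cycle(const int *graph, int start)
{
    assert(start >= 0 && start < ACTOR_COUNT);
    int cur = graph[start];
    for (int steps = 0; steps < ACTOR_COUNT && cur != NOBODY; steps++) {
        if (cur == start)
            return true;
        cur = graph[cur];
    }
    return false;
}
```

Calling in_wait_cycle(reported_graph, CLEANUP_THREAD) follows cleanup thread,
request, wq thread and arrives back at the cleanup thread, which is the
deadlock.  With manual_purge_graph the chain ends at the cleanup thread, so no
party is stuck in a cycle, provided the assumption about what
WdfIoQueuePurgeSynchronously waits for is correct.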

If this is the case, I can manually purge the queue instead of calling
WdfIoQueuePurgeSynchronously.  But does anyone know if this is indeed the case?
Or is the entire usage model invalid: removing a WdfRequest from a queue without
explicitly completing, forwarding, canceling, or requeuing it?
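
The difference between the two purge strategies can be sketched with a toy
queue in plain C.  This is a model under a stated assumption, not KMDF: it
assumes that a request retrieved from a queue still counts against that queue
until it is completed, and that a synchronous purge waits for that count to
reach zero.  The struct and all toy_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not KMDF: a queue that also counts requests it has handed out. */
struct toy_queue {
    int queued;        /* requests still sitting in the queue */
    int driver_owned;  /* retrieved by the driver but not yet completed */
};

/* Models retrieving the next request: ownership moves to the driver. */
bool toy_retrieve(struct toy_queue *q)
{
    if (q->queued == 0)
        return false;
    q->queued--;
    q->driver_owned++;
    return true;
}

/* Models completing one request that the driver currently owns. */
void toy_complete(struct toy_queue *q)
{
    assert(q->driver_owned > 0);
    q->driver_owned--;
}

/* Manual purge: drain and complete what is queued; never waits on anything. */
int toy_manual_purge(struct toy_queue *q)
{
    int drained = 0;
    while (toy_retrieve(q)) {
        toy_complete(q);
        drained++;
    }
    return drained;
}

/* Assumed rule: a synchronous purge waits while any request is outstanding. */
bool toy_sync_purge_would_wait(const struct toy_queue *q)
{
    return q->driver_owned > 0;
}
```

Starting with three queued requests and retrieving one (as the cm callback
does), toy_sync_purge_would_wait reports that a synchronous purge would wait,
while toy_manual_purge drains the remaining two and returns immediately.  The
retrieved request stays outstanding in both cases, so it still has to be
completed by whoever holds it.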



