[ofw] RE: NetworkDirect over WinVerbs

Fab Tillier ftillier at windows.microsoft.com
Tue Feb 10 22:08:54 PST 2009


>> Yes, ND::Disconnect does the equivalent of
>> WV::NotifyDisconnectAndWhenThatHappensModifyQpToError followed by
>> WV::Disconnect.
>  nit: WV:Disconnect() is part of the initial operation, since it sends
> the disconnection message.  Although, I'm still not sure that this is
> really what ND:Disconnect() is supposed to do, which is part of the
> problem.

ND::Disconnect is supposed to disconnect and flush all outstanding requests.  msmpi won't call it unless the close handshake has already been performed.

With ND, the following logic works on both the active and passive sides if the app has a close handshake (msmpi does, which is why it doesn't use NotifyDisconnect):

1. user handshake to disconnect
2. user calls Disconnect()

ND::NotifyDisconnect is optional; it is not necessary for a graceful disconnect.
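
To make the two steps concrete, here is a toy model of the sequence, not real ND code.  Endpoint, send_close, and disconnect are names I made up for illustration; the only things taken from the discussion are that the app does its own close handshake first and that Disconnect then flushes whatever is still outstanding, identically on both sides.

```python
# Toy model only: Endpoint, send_close, and disconnect are hypothetical
# names, not the NetworkDirect or WinVerbs API.

class Endpoint:
    def __init__(self, name):
        self.name = name
        self.peer = None
        self.outstanding = []       # requests posted but not yet completed
        self.flushed = []           # requests completed as flushed
        self.close_sent = False
        self.close_received = False
        self.connected = True

    def post(self, request):
        self.outstanding.append(request)

    def send_close(self):
        # Step 1: application-level close message (the user handshake).
        self.close_sent = True
        self.peer.close_received = True

    def handshake_done(self):
        return self.close_sent and self.close_received

    def disconnect(self):
        # Step 2: only called once the handshake is complete, mirroring
        # the rule that msmpi never calls Disconnect before that point.
        if not self.handshake_done():
            raise RuntimeError("close handshake not performed")
        self.flushed, self.outstanding = self.outstanding, []
        self.connected = False


def connect(a, b):
    a.peer, b.peer = b, a


active, passive = Endpoint("active"), Endpoint("passive")
connect(active, passive)
active.post("recv-0")
passive.post("recv-0")

# Same two steps on both sides; no NotifyDisconnect equivalent is needed.
active.send_close()
passive.send_close()
for ep in (active, passive):
    ep.disconnect()
```

Neither side waits for a disconnect notification here; the handshake already told each side that the peer is done.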

> Additionally, winverbs always calls Modify() in the context of the
> user's thread.

ND::Disconnect is always called in the context of the user's thread.  In any case, it doesn't matter - the output buffer will always be provided in the context of the user's thread, but there's no requirement for the modification to be done in the context of the user's thread.  I'll even assert that requiring QP modification to be done in the user's thread context limits scalability and is a lousy design.

> Besides avoiding assumptions about the implementation, this is huge when
> trying to ensure proper synchronize.  Otherwise, you can end up trying
> to synchronize CM upcalls with QP downcalls, like destruction, and
> device removal handling. I'm *very* reluctant to changes in this part of
> the kernel code.  I do not want to get into the QPs and CEPs referencing
> each other or adding complex synchronization (that likely won't work).

You don't need them to reference one another at all.  In fact you should avoid it.  But there's no reason both the QP handle and the CEP handle can't be provided in the same IOCTL, one that first performs the CEP operation and, when that completes, performs the QP operation.  The two objects are still independent; the IOCTL just has multiple phases, with each phase operating on a different object.
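
A rough sketch of what I mean by a multi-phase request, written as a user-mode toy rather than kernel code.  Cep, Qp, and disconnect_ioctl are hypothetical names, and the CEP operation completes synchronously here only to keep the example short; in a real driver it would complete asynchronously.

```python
# Hypothetical sketch: Cep, Qp, and disconnect_ioctl are illustrative
# names, not actual WinVerbs kernel structures.

class Cep:
    def __init__(self):
        self.state = "connected"

    def disconnect(self, on_complete):
        # Phase 1: the CM operation.  A real CEP would complete this
        # asynchronously; here the callback fires immediately.
        self.state = "disconnected"
        on_complete()


class Qp:
    def __init__(self):
        self.state = "rts"

    def modify_to_error(self):
        # Phase 2: the QP operation, which flushes outstanding work.
        self.state = "error"


def disconnect_ioctl(cep, qp, log):
    """One request carrying both handles; each phase touches one object."""
    def phase_two():
        log.append("cep done")
        qp.modify_to_error()
        log.append("qp done")

    cep.disconnect(on_complete=phase_two)


cep, qp, log = Cep(), Qp(), []
disconnect_ioctl(cep, qp, log)
```

Note that Cep never stores a Qp and Qp never stores a Cep.  Only the request knows about both, and only for its own lifetime.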

> ND seems to want ND:NotifyDisconnect to complete only when receiving a
> DREQ, and

ND::NotifyDisconnect is not required in order to disconnect.  If both sides of the connection have negotiated the disconnect, both sides can call Disconnect once their outstanding sends are complete.

> ND:Disconnect to complete after modify QP completes after receiving a
> DREP or DREQ timeout.

Right, ND::Disconnect sends either a DREQ or a DREP (depending on whether a DREQ has already been received), and then moves the QP to error once the DREP is received, the DREQ times out, or the DREP has been sent.
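
As a sanity check on that sequence, here is a small state model of it.  This is only a sketch: Connection and its method names are hypothetical and do not correspond to the ND or WinVerbs interfaces, and the timeout is modeled as an explicit event rather than a real timer.

```python
# Hypothetical model of the Disconnect sequence; the class and method
# names are illustrative, not the ND or WinVerbs API.

class Connection:
    def __init__(self):
        self.dreq_received = False
        self.waiting_for_drep = False
        self.sent = []              # CM messages sent by disconnect()
        self.qp_state = "rts"

    def on_dreq(self):
        # The peer started the disconnect first.
        self.dreq_received = True

    def disconnect(self):
        if self.dreq_received:
            # DREQ already seen: reply with DREP, then flush the QP.
            self.sent.append("DREP")
            self._to_error()
        else:
            # No DREQ seen: send DREQ and wait for DREP or a timeout.
            self.sent.append("DREQ")
            self.waiting_for_drep = True

    def on_drep(self):
        if self.waiting_for_drep:
            self._to_error()

    def on_dreq_timeout(self):
        if self.waiting_for_drep:
            self._to_error()

    def _to_error(self):
        self.waiting_for_drep = False
        self.qp_state = "error"


# Case 1: we send DREQ and the peer answers with DREP.
a = Connection()
a.disconnect()
state_before_drep = a.qp_state
a.on_drep()

# Case 2: we send DREQ and it times out.
b = Connection()
b.disconnect()
b.on_dreq_timeout()

# Case 3: the peer's DREQ arrived first, so we send DREP.
c = Connection()
c.on_dreq()
c.disconnect()
```

In all three cases the QP ends up in error, and in the first two it stays untouched until the DREP or the timeout arrives.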


