[ofw] RE: NetworkDirect over WinVerbs

Sean Hefty sean.hefty at intel.com
Tue Feb 10 16:55:48 PST 2009


>> The error code can change, but the timeout of the DREQ does indicate
>> that a disconnect message was not received from the remote side.
>
>TIMEOUT to me means try again, you might have better luck.  I think in this
>case, a DREQ timeout means the other side is toast, and is really more like a
>successful disconnect.

To me, if the operation includes retries, a timeout indicates a failure.  This is
not the same as a successful disconnect, since you can't determine what has
failed.  It could be a switch or router that's causing the problem and not the
remote system.
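
A minimal sketch of that distinction, in Python for brevity.  The names here
(DisconnectResult, send_dreq_with_retries, send_and_wait) are hypothetical and
not part of WinVerbs or NetworkDirect; the point is only that exhausting the
retries yields a result distinct from a confirmed disconnect:

```python
from enum import Enum
from typing import Callable


class DisconnectResult(Enum):
    CONFIRMED = "confirmed"  # a DREP arrived: the peer acknowledged the disconnect
    TIMED_OUT = "timed_out"  # retries exhausted: the cause of the failure is unknown


def send_dreq_with_retries(send_and_wait: Callable[[], bool],
                           max_retries: int) -> DisconnectResult:
    """Send a DREQ once, then retry up to max_retries more times.

    send_and_wait is a hypothetical callback that sends one DREQ and
    returns True if a DREP arrived before the per-attempt timeout.
    """
    for _ in range(1 + max_retries):
        if send_and_wait():
            return DisconnectResult.CONFIRMED
    # No DREP after every attempt.  The remote system, a switch, or a router
    # could be at fault, so this is reported as a failure, not as success.
    return DisconnectResult.TIMED_OUT
```

A caller that gets TIMED_OUT still tears down its local resources, but it
cannot assume the remote side processed the disconnect.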

>In the case where you have both sides handshaking to disconnect, you *will* run
>into issues where one side receives the last message, processes it, and moves
>the QP to error before the sender's HW has received the ACK from the receiver's
>HW.  When you move the QP to error, the HW stops generating ACKs/retries/etc.
>for that QP, leaving the sender to time out (which it eventually does) and the
>send completes in error (retry exceeded).
>
>Not at all what an app would expect, but that's what the HW does.
>
>So the QP transition to error needs to happen after the IB-level disconnection
>happens. Sure, you could add an arbitrary delay and hope things would work, but
>the disconnect handshake at the HW level allows things to tear down politely -
>the client on each side can be expected to not call Disconnect until all sends
>have completed locally.  This can delay the DREP (and thus the QP transition at
>the sender) properly.

See the two-army problem.  How would you expect this to operate over TCP, where
disconnect messages are exchanged in-band?

- Sean