[ofw] RE: NetworkDirect over WinVerbs
Sean Hefty
sean.hefty at intel.com
Tue Feb 10 16:55:48 PST 2009
>> The error code can change, but the timeout of the DREQ does indicate
>> that a disconnect message was not received from the remote side.
>
>TIMEOUT to me means try again, you might have better luck. I think in this
>case, a DREQ timeout means the other side is toast, and is really more like a
>successful disconnect.
To me, if the operation includes retries, timeout indicates a failure. This is
not the same as a successful disconnect, since you can't determine what has
failed. It could be a switch or router that's causing the problem and not the
remote system.
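
A minimal sketch of the distinction above, in Python rather than the WinVerbs API. The names DisconnectResult, send_dreq_with_retries and the send_and_wait callback are hypothetical, made up for illustration only. The point is that once retries are built in, exhausting them is reported as its own outcome and not folded into a successful disconnect:

```python
from enum import Enum
from typing import Callable


class DisconnectResult(Enum):
    DISCONNECTED = "disconnected"  # a DREP arrived: the remote side answered
    TIMED_OUT = "timed_out"        # every attempt timed out: cause unknown


def send_dreq_with_retries(send_and_wait: Callable[[], bool],
                           max_retries: int) -> DisconnectResult:
    """Send a DREQ, retrying up to max_retries times.

    send_and_wait is a hypothetical callback that sends one DREQ and
    returns True if a DREP arrived before its timeout expired.
    """
    for _ in range(1 + max_retries):
        if send_and_wait():
            return DisconnectResult.DISCONNECTED
    # No DREP after all attempts. The remote system may be down, or a
    # switch or router in between may be dropping packets; the caller
    # cannot tell which, so this is a failure, not a clean disconnect.
    return DisconnectResult.TIMED_OUT
```

A caller can then decide for itself whether TIMED_OUT is acceptable, instead of having the lower layer report it as success.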
>In the case where you have both sides handshaking to disconnect, you *will* run
>into issues where one side receives the last message, processes it, and moves
>the QP to error before the sender's HW has received the ACK from the receiver's
>HW. When you move the QP to error, the HW stops generating ACKs/retries/etc
>for that QP, leaving the sender to time out (which it eventually does) and the
>send completes in error (retry exceeded).
>
>Not at all what an app would expect, but that's what the HW does.
>
>So the QP transition to error needs to happen after the IB-level disconnection
>happens. Sure, you could add an arbitrary delay and hope things would work, but
>the disconnect handshake at the HW level allows things to tear down politely -
>the client on each side can be expected to not call Disconnect until all sends
>have completed locally. This can delay the DREP (and thus the QP transition at
>the sender) properly.
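
The race described in the quoted text can be put into a toy model. This is only a sketch under stated assumptions: time is counted in abstract ticks, the function name last_send_status and the two policy names are hypothetical, and real HW behaviour is far more involved. It shows why the final send fails when the receiver moves its QP to error before the ACK has gone out, and why the quoted argument expects it to succeed when the transition waits for the disconnect exchange.

```python
def last_send_status(receiver_policy: str, ack_tick: int = 3) -> str:
    """Completion status of the sender's last send in a toy tick model.

    ack_tick is the (assumed) tick at which the receiver's HW would
    emit the ACK for the last message. Once the receiver's QP is in
    error, no ACK is generated for that QP.
    """
    if receiver_policy == "immediate":
        # The app processes the last message and moves the QP to error
        # right away, at tick 1.
        qp_error_tick = 1
    elif receiver_policy == "after_disconnect":
        # Assumption from the quoted text: the sender calls Disconnect
        # only after its sends completed locally (after the ACK), so the
        # receiver's QP transition necessarily follows the ACK.
        qp_error_tick = ack_tick + 1
    else:
        raise ValueError(f"unknown policy: {receiver_policy}")

    if ack_tick < qp_error_tick:
        return "success"
    # The ACK was never sent; the sender retries until it gives up.
    return "retry exceeded"
```

With the default ack_tick, "immediate" yields "retry exceeded" and "after_disconnect" yields "success". The model assumes the disconnect messages themselves are delivered.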
See the two-army problem (also known as the Two Generals' Problem). How would
you expect this to operate over TCP, where disconnect messages are exchanged
in-band?
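
For reference, a small sketch of the two-army problem, assuming a strictly alternating handshake between sides A and B in which each message is a reply to the previous one. The function names are hypothetical. It checks that whoever sends the final message of the handshake observes exactly the same thing whether that message was delivered or lost, no matter how many messages the handshake has:

```python
from typing import List, Optional, Tuple


def observations(num_messages: int,
                 lost_index: Optional[int]) -> Tuple[List[int], List[int]]:
    """Return the message indices received by A and by B.

    Message 0 goes A -> B, message 1 goes B -> A, and so on. If message
    lost_index is dropped, no later message is sent, since each one is
    a reply to the one before it.
    """
    seen = {"A": [], "B": []}
    for i in range(num_messages):
        if i == lost_index:
            break
        receiver = "B" if i % 2 == 0 else "A"
        seen[receiver].append(i)
    return seen["A"], seen["B"]


def final_sender_can_tell(num_messages: int) -> bool:
    """Can the sender of the last message tell whether it arrived?"""
    last = num_messages - 1
    sender_slot = 0 if last % 2 == 0 else 1  # 0 = A, 1 = B
    delivered = observations(num_messages, None)
    lost = observations(num_messages, last)
    return delivered[sender_slot] != lost[sender_slot]
```

In this model final_sender_can_tell returns False for every handshake length, so adding one more message only moves the uncertainty to the other side.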
- Sean