[openib-general] RFC on CM error handling
Libor Michalek
libor at topspin.com
Fri Jan 21 11:43:04 PST 2005
On Fri, Jan 21, 2005 at 11:28:13AM -0800, Sean Hefty wrote:
>
> An example issue that I'm thinking of is a user gets a reply callback.
> A reject is then received by the CM, and a second callback to the
> user is initiated. If the user tries to send an RTU, the call will
> fail since the cm_id is in an invalid state. If the user then returns
> -1 from the callback, the CM will destroy the cm_id. The destruction
> will block while the reject callback completes. Since the user
> returned -1 from the reply callback, they may not be ready to handle
> another callback.
>
> The fix that I'm working on should still allow multithreaded operation
> inside the CM, but callbacks to the user will be serialized. If a user
> returns a non-zero value from a callback, no additional callbacks will
> be generated.
OK, that's the behaviour I would expect. However, in the example, even
if the user returns 0 from the REP callback, I wouldn't expect to see
the REJ after the REP has been processed (or after the RTU has been sent).
The CM state updates for a connection and the resulting callbacks would be
serialized, so the REJ after the REP would be discarded since it was
received in a CM state which does not allow rejects. Or is this incorrect?
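
To make the question concrete, here is a minimal sketch of the behaviour
described above. Every name in it (sketch_cm_id, sketch_recv,
sketch_send_rtu, the SK_* constants) is hypothetical and is not the actual
CM API. It assumes a per-connection lock held across the state check and
the callback, and it assumes a REJ is only accepted before the REP has been
processed, which is exactly the premise being asked about.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the real CM interface. */
enum sketch_state { SK_REQ_SENT, SK_REP_RCVD, SK_ESTABLISHED, SK_DEAD };
enum sketch_event { SK_EV_REP, SK_EV_REJ };

struct sketch_cm_id {
	pthread_mutex_t lock;    /* serializes state updates and callbacks */
	enum sketch_state state;
	bool muted;              /* set once a callback returns non-zero */
	int callbacks;           /* number of callbacks delivered so far */
	int (*handler)(struct sketch_cm_id *id, enum sketch_event ev);
};

void sketch_init(struct sketch_cm_id *id,
		 int (*handler)(struct sketch_cm_id *, enum sketch_event))
{
	pthread_mutex_init(&id->lock, NULL);
	id->state = SK_REQ_SENT;
	id->muted = false;
	id->callbacks = 0;
	id->handler = handler;
}

/* Assumption: both REP and REJ are only valid while waiting for a reply. */
static bool sketch_event_allowed(enum sketch_state s, enum sketch_event ev)
{
	(void)ev;
	return s == SK_REQ_SENT;
}

/* Call only from inside a callback, where the lock is already held. */
int sketch_send_rtu(struct sketch_cm_id *id)
{
	if (id->state != SK_REP_RCVD)
		return -1;       /* invalid state, as in Sean's example */
	id->state = SK_ESTABLISHED;
	return 0;
}

/* Returns true if the event reached the user, false if it was discarded. */
bool sketch_recv(struct sketch_cm_id *id, enum sketch_event ev)
{
	bool delivered = false;

	assert(id != NULL && id->handler != NULL);
	pthread_mutex_lock(&id->lock);
	if (!id->muted && sketch_event_allowed(id->state, ev)) {
		id->state = (ev == SK_EV_REP) ? SK_REP_RCVD : SK_DEAD;
		id->callbacks++;
		delivered = true;
		if (id->handler(id, ev) != 0) {
			/* Non-zero return: no further callbacks. */
			id->muted = true;
			id->state = SK_DEAD;
		}
	}
	pthread_mutex_unlock(&id->lock);
	return delivered;
}

/* Example user callbacks. */
int sketch_accept_handler(struct sketch_cm_id *id, enum sketch_event ev)
{
	return ev == SK_EV_REP ? sketch_send_rtu(id) : 0;
}

int sketch_failing_handler(struct sketch_cm_id *id, enum sketch_event ev)
{
	(void)id;
	(void)ev;
	return -1;
}
```

With sketch_accept_handler, a REP followed by a REJ produces one callback
and the REJ is dropped because the connection is already established. With
sketch_failing_handler, the REJ is dropped because the earlier callback
returned non-zero. Holding the lock across the callback is a simplification
for the sketch only.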
-Libor