[openib-general] RFC on CM error handling

Fri Jan 21 11:43:04 PST 2005

On Fri, Jan 21, 2005 at 11:28:13AM -0800, Sean Hefty wrote:
> 
> An example issue that I'm thinking of is a user gets a reply callback.
> A reject is then received by the CM, and a second callback to the
> user is initiated.  If the user tries to send an RTU, the call will
> fail since the cm_id is in an invalid state.  If the user then returns
> -1 from the callback, the CM will destroy the cm_id.  The destruction
> will block while the reject callback completes.  Since the user
> returned -1 from the reply callback, they may not be ready to handle
> another callback.
>
> The fix that I'm working on should still allow multithreaded operation 
> inside the CM, but callbacks to the user will be serialized.  If a user 
> returns a non-zero value from a callback, no additional callbacks will 
> be generated.

  OK, that's the behaviour I would expect. However, in the example, even
if the user returns 0 from the REP callback, I wouldn't expect to see
the REJ after the REP has been processed (or after the RTU has been sent).
The CM state updates for a connection, and the resulting callbacks, would
be serialized, so a REJ arriving after the REP would be discarded because
it was received in a CM state that does not allow rejects. Or is this
incorrect?
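
To make the model above concrete, here is a minimal sketch of per-connection
event handling as I picture it. Everything in it is hypothetical: the state
and event names, struct conn, conn_handle_event() and conn_send_rtu() are
made up for illustration and are not the CM's actual API. In particular, the
rule in event_allowed() that a REJ is only accepted while the REQ is
outstanding is my assumption, and it is exactly the point in question. The
sketch is single-threaded; a real implementation would need a per-connection
lock around the state check and update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states and events; not the real CM definitions. */
enum conn_state { ST_REQ_SENT, ST_REP_RCVD, ST_ESTABLISHED, ST_DEAD };
enum conn_event { EV_REP, EV_REJ };

typedef int (*event_cb)(enum conn_event ev, void *ctx);

struct conn {
	enum conn_state state;
	bool callbacks_stopped;	/* set once a callback returns non-zero */
	event_cb cb;
	void *ctx;
};

/* Assumed policy: both REP and REJ are accepted only while the REQ is
 * outstanding. Whether a REJ must also be accepted later is the open
 * question. */
static bool event_allowed(enum conn_state st, enum conn_event ev)
{
	switch (ev) {
	case EV_REP:
		return st == ST_REQ_SENT;
	case EV_REJ:
		return st == ST_REQ_SENT;
	}
	return false;
}

/* Process one received event. The state is checked and updated before the
 * user callback runs, so a later event is judged against the new state.
 * Returns true if the event was delivered to the user, false if discarded. */
static bool conn_handle_event(struct conn *c, enum conn_event ev)
{
	assert(c != NULL && c->cb != NULL);

	if (c->callbacks_stopped || !event_allowed(c->state, ev))
		return false;

	c->state = (ev == EV_REP) ? ST_REP_RCVD : ST_DEAD;

	if (c->cb(ev, c->ctx) != 0) {
		/* Non-zero return: no further callbacks for this connection. */
		c->callbacks_stopped = true;
		c->state = ST_DEAD;
	}
	return true;
}

/* Sending an RTU is only valid after a REP; returns 0 on success, -1 if the
 * connection is in the wrong state. */
static int conn_send_rtu(struct conn *c)
{
	if (c->state != ST_REP_RCVD)
		return -1;
	c->state = ST_ESTABLISHED;
	return 0;
}

/* Example user callback: counts invocations and returns a preset value. */
struct cb_stats {
	int calls;
	int ret;
};

static int counting_cb(enum conn_event ev, void *ctx)
{
	struct cb_stats *s = ctx;

	(void)ev;
	s->calls++;
	return s->ret;
}
```

With this ordering, a REJ that arrives after the REP has been processed never
reaches the user, whether the REP callback returned 0 or -1, and the RTU send
that follows a successful REP callback does not fail on a state change made
behind the user's back.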

-Libor