[openib-general] RFC on CM error handling

Fri Jan 21 10:25:41 PST 2005

On Thu, Jan 20, 2005 at 08:31:12PM -0800, Sean Hefty wrote:
> 
> Some thoughts about destroying cm_id's based on a callback return value.
> 
> Given the current code structure, it's possible for a user to receive
> multiple simultaneous callbacks.  For example, a user could be notified of a
> reply and reject (sent as a result of an rtu timeout) at the same time.  The
> CM will handle the state transitions properly to fail calls into the API.
> But I can see where it would be easy for a user to end up trying to destroy
> the cm_id multiple times, or receiving the second callback after they had
> already destroyed their context.
>
> I'm not seeing an easy way to handle this condition.  I think that I need to
> serialize all callbacks for a given cm_id.

  I would think a lack of cm_id serialization would be a bigger problem
for the CM itself, since it needs to update the cm_id state in a sane
manner. The consumer can serialize callbacks (I know I do), but it
would be a bigger problem if the state transitions didn't make sense.
The older CM used a model where the connection identifier was checked out
of the table, and only one thread of control had it at any given time. I
think this would solve a lot of issues.
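
  A rough sketch of the check-out idea, for illustration only. This is
not the older CM's code: every name here (conn_entry, conn_checkout,
conn_checkin, the fixed-size table) is hypothetical, and a user-space
pthread mutex stands in for whatever locking the kernel code would use.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define CONN_TABLE_SIZE 16

/* Hypothetical per-connection entry; not a real CM structure. */
struct conn_entry {
	int  state;        /* connection state, owned by the holder */
	bool in_use;       /* slot holds a live connection */
	bool checked_out;  /* some thread currently owns this entry */
};

static struct conn_entry conn_table[CONN_TABLE_SIZE];
static pthread_mutex_t conn_table_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Check an entry out of the table.  Returns the entry if the caller now
 * owns it, or NULL if the id is invalid, destroyed, or already held by
 * another thread.
 */
static struct conn_entry *conn_checkout(unsigned int id)
{
	struct conn_entry *e = NULL;

	pthread_mutex_lock(&conn_table_lock);
	if (id < CONN_TABLE_SIZE && conn_table[id].in_use &&
	    !conn_table[id].checked_out) {
		conn_table[id].checked_out = true;
		e = &conn_table[id];
	}
	pthread_mutex_unlock(&conn_table_lock);
	return e;
}

/*
 * Return an entry to the table.  If destroy is set, the slot is retired
 * so that later checkouts of the same id fail.
 */
static void conn_checkin(struct conn_entry *e, bool destroy)
{
	pthread_mutex_lock(&conn_table_lock);
	if (destroy)
		e->in_use = false;
	e->checked_out = false;
	pthread_mutex_unlock(&conn_table_lock);
}
```

  With something like this, an event handler that fails to check out the
entry knows another thread owns it (or that it is gone), so it can defer
or drop the event. A second callback can then never race with a destroy,
and the state is only ever touched by the current holder.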

-Libor