[openib-general] RFC on CM error handling
Libor Michalek
libor at topspin.com
Thu Jan 20 13:02:12 PST 2005
On Thu, Jan 20, 2005 at 11:21:49AM -0800, Sean Hefty wrote:
> Libor Michalek wrote:
>
> > What sort of failure did you have in mind, synchronous or async? I
> > would think that if a call that is going to change-state, like send_rep
> > fails, the reject is going to fail as well. If it's an async failure
> > then it is reported as a transition into idle, right?
>
> I was referring to synchronous errors. And, you're right, if a call
> like send_rep fails, then the reject is likely to fail as well. The
> difference is that reject will force the state to idle/timewait.

I suppose it makes sense to allow the consumer to make the REJ request
or wait and retry the REP.

> > Another question, should I expect a REJ to be generated if the REQ or
> > REP handler returns an error? From the code it doesn't look like that
> > behaviour currently exists.
>
> This is part of the reason why I brought this up. Currently, the code
> will not generate a reject on a failure sending a REP or RTU. But the
> communication identifier state doesn't change. This allows the user to
> reject the connection. Note that if the user simply destroys the
> communication identifier at this point, the CM will reject the connection.

Actually, I was thinking about returning an error from the REP/REQ
handler. I believe this is supposed to destroy the communication identifier,
which I presume will generate a REJ? This is at least the behaviour I would
like to see. In this case, since the communication identifier is destroyed,
there wouldn't be any other state transitions.
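
To make the behaviour I'm asking for concrete, here is a rough sketch.
None of these names come from the CM code: the sketch_ types, the state
names and the rej_sent counter are stand-ins I made up for illustration.
The only point is the control flow: a nonzero return from the REQ/REP
handler destroys the identifier, and destroying an identifier that is
still mid-handshake produces the REJ.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the openib CM types. */
enum sketch_cm_state {
    SKETCH_REQ_RCVD,
    SKETCH_REP_RCVD,
    SKETCH_ESTABLISHED,
    SKETCH_DESTROYED
};

struct sketch_cm_id {
    enum sketch_cm_state state;
    int rej_sent;               /* counts REJ messages "sent" */
};

typedef int (*sketch_cm_handler)(struct sketch_cm_id *id, void *context);

/* Destroying an id that is still mid-handshake rejects the connection. */
void sketch_cm_destroy_id(struct sketch_cm_id *id)
{
    if (id->state == SKETCH_REQ_RCVD || id->state == SKETCH_REP_RCVD)
        id->rej_sent++;
    id->state = SKETCH_DESTROYED;
}

/*
 * Deliver a REQ/REP event to the consumer. A nonzero return from the
 * handler destroys the id, so no further state transitions can occur.
 */
int sketch_cm_dispatch(struct sketch_cm_id *id, sketch_cm_handler handler,
                       void *context)
{
    int ret = handler(id, context);

    if (ret)
        sketch_cm_destroy_id(id);
    return ret;
}

/* Example consumer handlers: one refuses the connection, one accepts. */
int sketch_failing_handler(struct sketch_cm_id *id, void *context)
{
    (void)id;
    (void)context;
    return -1;
}

int sketch_ok_handler(struct sketch_cm_id *id, void *context)
{
    (void)id;
    (void)context;
    return 0;
}
```

With this shape the consumer never has to issue the REJ itself after a
handler failure; returning the error is enough.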

Yet another issue. Saving the info from the incoming CM message,
specifically REQ and REP, for later QP transitions is something that's
going to be duplicated quite a bit. I would really like to see this
data in the communication identifier itself, so it can be plucked out
later for the QP modifies. To take it one step further, Roland suggested
that it would be really nice to have CM convenience functions, to which
you pass a cm_id and qp_attr structure, and which assign all the
correct values and masks for a given transition. For example:

int ib_cm_prep_rtr(struct ib_cm_id *, struct ib_qp_attr *);
int ib_cm_prep_rts(struct ib_cm_id *, struct ib_qp_attr *);

This would remove a lot of redundant code in CM consumers, and still
provide the flexibility of allowing the CM user to customize the QP
attr before making the transition. What do you think?
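
As a rough illustration of what such helpers might do, here is a
self-contained sketch. Everything in it is an assumption for the sake
of discussion: the sketch_ structures, the field names, the mask bits
and the extra mask out-parameter are invented here, and only a small
subset of the attributes a real transition needs is filled in. The idea
is that the values saved from the REQ/REP live in the identifier and
each helper copies the ones relevant to its transition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute mask bits; not the real verbs definitions. */
enum {
    SKETCH_QP_STATE    = 1 << 0,
    SKETCH_QP_DEST_QPN = 1 << 1,
    SKETCH_QP_RQ_PSN   = 1 << 2,
    SKETCH_QP_PATH_MTU = 1 << 3,
    SKETCH_QP_SQ_PSN   = 1 << 4
};

enum sketch_qp_state { SKETCH_QPS_INIT, SKETCH_QPS_RTR, SKETCH_QPS_RTS };

/* Simplified stand-in for a QP attribute structure. */
struct sketch_qp_attr {
    enum sketch_qp_state qp_state;
    uint32_t dest_qp_num;
    uint32_t rq_psn;
    uint32_t sq_psn;
    uint32_t path_mtu;
};

/* Stand-in for a cm_id that keeps the values learned from REQ/REP. */
struct sketch_saved_cm_id {
    uint32_t remote_qpn;   /* peer QP number from the CM message */
    uint32_t remote_psn;   /* peer's starting PSN */
    uint32_t local_psn;    /* our own starting PSN */
    uint32_t path_mtu;     /* MTU in bytes, simplified */
};

/* Fill in the attributes and mask for the transition to RTR. */
int sketch_cm_prep_rtr(const struct sketch_saved_cm_id *id,
                       struct sketch_qp_attr *attr, int *mask)
{
    if (!id || !attr || !mask)
        return -1;

    attr->qp_state    = SKETCH_QPS_RTR;
    attr->dest_qp_num = id->remote_qpn;
    attr->rq_psn      = id->remote_psn;
    attr->path_mtu    = id->path_mtu;
    *mask = SKETCH_QP_STATE | SKETCH_QP_DEST_QPN |
            SKETCH_QP_RQ_PSN | SKETCH_QP_PATH_MTU;
    return 0;
}

/* Fill in the attributes and mask for the transition to RTS. */
int sketch_cm_prep_rts(const struct sketch_saved_cm_id *id,
                       struct sketch_qp_attr *attr, int *mask)
{
    if (!id || !attr || !mask)
        return -1;

    attr->qp_state = SKETCH_QPS_RTS;
    attr->sq_psn   = id->local_psn;
    *mask = SKETCH_QP_STATE | SKETCH_QP_SQ_PSN;
    return 0;
}
```

A consumer would call the prep function, adjust any attribute it wants
to override, and then pass the attr and mask to its QP modify call.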
-Libor