[openib-general] [RFC] [uCM] proposed API changes

Fab Tillier ftillier at silverstorm.com
Thu Aug 11 12:31:32 PDT 2005


> From: Sean Hefty [mailto:sean.hefty at intel.com]
> Sent: Thursday, August 11, 2005 12:10 PM
> 
> There is an issue with adding context.  When a connection REQ is received, a
> new kernel cm_id is created.  This cm_id doesn't have any context associated
> with it.  For kernel clients, this isn't a big deal, since all events
> associated with a single cm_id are serialized.  A kernel app can set the
> context as part of their REQ handling.

You could serialize events for user-mode cm_ids, and allow the user client to
set the context from its REQ handler.  The latter is probably pretty easy to
do, but on its own it doesn't solve the problem of out-of-order events or the
race between setting the context and receiving an event.

> Userspace clients will run into the same situation, where no context is
> defined.  But events for the same cm_id are not serialized for userspace
> clients.  An app can receive a REJ event for a newly created cm_id that does
> not have a context.  (They can even process the REJ event before the REQ
> event is seen.)  Searching in this case is unavoidable.  I'm not even sure of
> the right way to handle this situation.

A search on a REJ isn't a big deal - it should be a rare case as it will only
occur if the remote side times out or aborts.  A client could ignore the REJ
because sending the REP will fail if a REJ was received.
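
To illustrate, here is a toy Python model of that behavior.  It is only a
sketch: the Connection class, its method names, and its state strings are made
up for illustration and are not the CM API.  It assumes the CM records the REJ
against the cm_id whether or not the client ever looks at the event.

```python
class Connection:
    """Hypothetical stand-in for the state kept behind one cm_id."""

    def __init__(self):
        self.state = "REQ_RCVD"
        self.rejected = False

    def on_rej(self):
        # The REJ is recorded even if the client never processes the event.
        self.rejected = True
        self.state = "IDLE"

    def send_rep(self):
        # Assumed behavior: a REP cannot be sent once a REJ has arrived.
        if self.rejected:
            raise ConnectionError("REP refused: REJ already received")
        self.state = "REP_SENT"


conn = Connection()
conn.on_rej()              # REJ arrives; the client ignores the event itself

rep_failed = False
try:
    conn.send_rep()        # the client learns of the REJ from this failure
except ConnectionError:
    rep_failed = True
```

With this behavior the client never has to search for the cm_id on a REJ; it
only has to handle the failed REP.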

> In a more generic sense, userspace clients need to be able to handle out of
> order events if they use multiple threads for event handling.  For example,
> MRA to a REQ, REP received, and REJ received events could all occur at the
> same time.  (In this case, a userspace context would be valid.)

If you allow the user to target a get_event call to a specific cm_id this
problem goes away.  If the user issues multiple requests against the same cm_id,
they need to be ready to deal with out-of-order event reporting.  This also
solves the context issue, since the REJ won't be reported until the user
requests an event from that specific cm_id.
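
A rough Python model of what I mean, as a sketch only: EventQueues, deliver,
set_context and this form of get_event are hypothetical names, not the uCM
interface.  It assumes the REQ is reported on the listening cm_id while the
REJ is queued on the newly created one.

```python
from collections import defaultdict, deque


class EventQueues:
    """Toy model: one FIFO of pending CM events per cm_id."""

    def __init__(self):
        self._pending = defaultdict(deque)
        self._context = {}

    def deliver(self, cm_id, event):
        # Kernel side: queue the event on the cm_id it belongs to.
        self._pending[cm_id].append(event)

    def set_context(self, cm_id, context):
        self._context[cm_id] = context

    def get_event(self, cm_id):
        # User side: only events for the requested cm_id are returned,
        # in arrival order, along with whatever context is set by then.
        queue = self._pending[cm_id]
        if not queue:
            return None
        return queue.popleft(), self._context.get(cm_id)


queues = EventQueues()
LISTEN_ID, NEW_ID = 1, 2

queues.deliver(LISTEN_ID, "REQ")        # REQ reported on the listening cm_id
queues.deliver(NEW_ID, "REJ")           # REJ for the new cm_id arrives at once

first = queues.get_event(LISTEN_ID)     # the app sees the REQ first
queues.set_context(NEW_ID, "conn-A")    # and sets the context in its handler
second = queues.get_event(NEW_ID)       # only now is the REJ reported
```

The REJ sits in the queue for the new cm_id until the app asks for it, so by
the time it is reported the context is already in place.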
 
- Fab



