[ofa-general] ib_cm question

Terry Greeniaus tgree at relay.phys.ualberta.ca
Mon Sep 8 10:14:46 PDT 2008


On Sat, 6 Sep 2008, Hal Rosenstock wrote:
> On Fri, Sep 5, 2008 at 3:38 PM, Terry Greeniaus
> <tgree at relay.phys.ualberta.ca> wrote:
> 
> The CM maintainer is currently on sabbatical for a little while. FWIW,
> I'll provide my take on this.

Thanks for your response Hal!

> > Client                  Server
> > REQ ------------------->
> > w/ bad key
> >
> >    <------------------- REJ
> >                     w/ good key
> >
> > REQ ------------------->
> > w/ good key
> >
> >                         REP/etc.
> 
> Are the keys in the private data ?

Yes.

> Out of curiosity, what REJ code is used ?

28 - consumer reject.

> ib_cm.h states:
>  * ib_cm_handler - User-defined callback to process communication events.
>  * @cm_id: Communication identifier associated with the reported event.
>  * @event: Information about the communication event.
>  *
>  * IB_CM_REQ_RECEIVED and IB_CM_SIDR_REQ_RECEIVED communication events
>  * generated as a result of listen requests result in the allocation of a
>  * new @cm_id.  The new @cm_id is returned to the user through this callback.
> 
> Although some other CMs may have reused the same "cm id" on the
> passive side, I don't think that there's a requirement to do so. I
> think it's valid either way per the spec. IMO the unit test/protocol
> should not depend on implementation-specific behavior which is what I
> think this amounts to.

You may be right here.  The passive side of the CM state machine diagram
(Fig 132, p 688 in my copy of the IBA spec) has an arc from "REJ Sent" to
"REQ Rcvd" labelled "(retry) Rcv REQ".  It also has an arc labelled "(no
retry)" which essentially frees up the cm id.  Unfortunately the spec
doesn't say which arc to follow.  We had interpreted this as the passive
side waiting for a retried REQ as long as the number of CM retries
specified in the original REQ packet had not yet been exhausted.
However, since CM messages travel over UD and can be dropped, this could
result in the passive cm id never being freed - so the OFED
interpretation of freeing it immediately after sending the REJ and using
a new cm id for subsequent REQs may be the more sensible one.
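
To make the two readings concrete, here is a toy model in Python.  It is
only a sketch: it is not ib_cm code and not our CM, and every name in it
(PassiveCM, wait_for_retry, run_test, the "good"/"bad" keys) is made up
for illustration.  The only things taken from the thread are the two
arcs out of "REJ Sent" and REJ reason 28.

```python
import itertools

GOOD_KEY = "good"      # made-up key carried in the private data
REJ_CONSUMER = 28      # REJ reason code 28, consumer reject


class PassiveCM:
    """Toy passive side of a CM, modelling only the two arcs out of "REJ Sent"."""

    def __init__(self, wait_for_retry):
        # True: follow the "(retry) Rcv REQ" arc; False: follow "(no retry)".
        self.wait_for_retry = wait_for_retry
        self.next_id = itertools.count(1)
        self.rej_sent = {}                 # client -> cm id parked in "REJ Sent"

    def on_req(self, client, key):
        """Handle one REQ and return (cm_id, reply, private_data)."""
        # "(retry) Rcv REQ": reuse the cm id parked for this client, if any.
        cm_id = self.rej_sent.pop(client, None)
        if cm_id is None:
            cm_id = next(self.next_id)     # otherwise allocate a fresh cm id
        if key == GOOD_KEY:
            return cm_id, "REP", None
        if self.wait_for_retry:
            # The cm id stays allocated until a retried REQ shows up.
            self.rej_sent[client] = cm_id
        return cm_id, ("REJ", REJ_CONSUMER), GOOD_KEY


def run_test(server):
    """Client side of the unit test: bad key first, then the key from the REJ."""
    first_id, reply, key = server.on_req("client", "bad")
    assert reply == ("REJ", REJ_CONSUMER)
    second_id, reply, _ = server.on_req("client", key)
    return first_id, second_id, reply
```

With wait_for_retry=True both REQs land on the same passive cm id; with
wait_for_retry=False the second REQ gets a new one.  Either way the
exchange ends in a REP.  In the first case the parked entry in rej_sent
only goes away when a retried REQ actually arrives, which is the leak
described above.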

> I don't sufficiently understand the details of your protocol (as to
> why the initial connection need be rejected) as opposed to passing the
> key back in the REP. There may also be other possibilities if a
> protocol change for your application is feasible.

The protocol is completely contrived for this particular unit test - it 
isn't used anywhere in our application and was meant to test these 
particular state transitions in our CM implementation.  It's comforting 
to know that they did their job and found this difference in how the two 
CMs work, but I will argue that it shouldn't be run against the OFED 
stack or that it should be modified to take the OFED interpretation of 
the spec into consideration.

Thanks for your time,
TG


