[ofa-general] ib_cm question

Sat Sep 6 06:33:55 PDT 2008

Hi,

On Fri, Sep 5, 2008 at 3:38 PM, Terry Greeniaus
<tgree at relay.phys.ualberta.ca> wrote:
> Hello all,
>
> We are porting our application to run on the OFED stack. As part of the
> porting process, I have a series of CM unit tests that I need to get to
> run.  I am having trouble with one in particular.

The CM maintainer is currently on sabbatical for a little while. FWIW,
I'll provide my take on this.

> At a high level, the unit test implements a simple protocol for
> establishing a connection between a client and a server to test basic CM
> functionality.  The protocol uses the private data field of the CM
> packets to exchange a key that is generated randomly by the server on a
> per-connection basis.  Essentially, the client sends a REQ with a
> randomly chosen key which will not match the server's.  When the server
> initially receives a REQ for a particular connection, it generates a
> random key and compares it against the key stored in the REQ.  Since
> they don't match, the server sends a REJ back to the client, and the REJ
> contains the correct key in the private data field.  Finally, the client
> resends the REQ, this time with the correct key:
>
> Client                  Server
> REQ ------------------->
> w/ bad key
>
>    <------------------- REJ
>                     w/ good key
>
> REQ ------------------->
> w/ good key
>
>                         REP/etc.

Are the keys in the private data?

Out of curiosity, what REJ code is used?

> Everything works well until the second REQ is received at the server.
> It appears that instead of reusing the previous ib_cm_id, the OFED CM
> generates a new ib_cm_id to handle the second REQ.  The unit test thinks
> that a new connection attempt is being requested instead of a retry of
> the original attempt and so it generates a new random key, resulting in
> the protocol being unable to establish a connection.
>
> Is something like I have described above supported by the OFED CM?

ib_cm.h states:
 * ib_cm_handler - User-defined callback to process communication events.
 * @cm_id: Communication identifier associated with the reported event.
 * @event: Information about the communication event.
 *
 * IB_CM_REQ_RECEIVED and IB_CM_SIDR_REQ_RECEIVED communication events
 * generated as a result of listen requests result in the allocation of a
 * new @cm_id.  The new @cm_id is returned to the user through this callback.

Although some other CMs may have reused the same "cm id" on the
passive side, I don't think that there's a requirement to do so. I
think it's valid either way per the spec. IMO the unit test/protocol
should not depend on implementation-specific behavior, which is what I
think reusing the cm id amounts to.
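
To make that concrete, here is a rough user-space sketch of one way the
passive side could recognize a retried REQ without relying on the cm id.
It is a simulation only and is not ib_cm code: every name in it
(sim_server, sim_handle_req, the client_token field in the private
data, and so on) is made up for illustration, and it assumes the client
can put a token in the private data that stays the same across its
retries. The key generator is a fixed arithmetic sequence standing in
for a real random source.

```c
/* User-space simulation only; none of these names come from ib_cm.h. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_MAX_PENDING 16

/* Hypothetical private-data layout carried in each REQ. */
struct sim_req {
    uint32_t client_token;  /* same value on every retry by one client */
    uint32_t key;           /* key the client currently believes in */
};

enum sim_reply { SIM_REJ, SIM_REP };

struct sim_result {
    enum sim_reply reply;
    uint32_t key;           /* key returned in the reply's private data */
    int cm_id;              /* simulated id, fresh for every REQ */
};

struct sim_pending {
    int in_use;
    uint32_t client_token;
    uint32_t key;
};

struct sim_server {
    struct sim_pending table[SIM_MAX_PENDING];
    int next_cm_id;
    uint32_t next_key;      /* stand-in for a random key generator */
};

void sim_server_init(struct sim_server *s, uint32_t seed)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < SIM_MAX_PENDING; i++)
        s->table[i].in_use = 0;
    s->next_cm_id = 0;
    s->next_key = seed;
}

/* Look up pending state by token; create it (with a new key) if absent. */
static struct sim_pending *sim_find_or_add(struct sim_server *s,
                                           uint32_t token)
{
    struct sim_pending *free_slot = NULL;
    size_t i;

    for (i = 0; i < SIM_MAX_PENDING; i++) {
        struct sim_pending *p = &s->table[i];

        if (p->in_use && p->client_token == token)
            return p;
        if (!p->in_use && free_slot == NULL)
            free_slot = p;
    }
    if (free_slot == NULL)
        return NULL;

    s->next_key = s->next_key * 1103515245u + 12345u;
    free_slot->in_use = 1;
    free_slot->client_token = token;
    free_slot->key = s->next_key;
    return free_slot;
}

/* Every REQ gets a new simulated cm_id; the key is found via the token. */
struct sim_result sim_handle_req(struct sim_server *s,
                                 const struct sim_req *req)
{
    struct sim_result res;
    struct sim_pending *p;

    res.cm_id = s->next_cm_id++;
    res.reply = SIM_REJ;
    res.key = 0;

    p = sim_find_or_add(s, req->client_token);
    if (p == NULL)
        return res;             /* table full: reject without a key */

    res.key = p->key;
    if (req->key == p->key) {
        res.reply = SIM_REP;    /* retry carried the right key */
        p->in_use = 0;          /* pending state is no longer needed */
    }
    return res;
}
```

Calling sim_handle_req() twice with the same client_token, first with a
wrong key and then with the key taken from the REJ, gives a REJ followed
by a REP even though the two calls see different cm_id values.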

I don't understand your protocol well enough to see why the initial
connection needs to be rejected rather than passing the key back in
the REP. There may also be other possibilities if a protocol change
for your application is feasible.
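
As a rough illustration of the key-in-REP idea, the sketch below is
again a user-space simulation with made-up names (rep_server,
rep_handle_req, rep_server_check), not ib_cm code. It assumes the
server generates the key when the first REQ arrives, returns it in the
REP private data, and remembers it against the id allocated for that
REQ, so no REJ and retry are needed. Whether that fits your application
is something only you can judge.

```c
/* User-space simulation only; none of these names come from ib_cm.h. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REP_MAX_CONN 16

/* Hypothetical contents of the REP as seen by the client. */
struct rep_msg {
    int cm_id;          /* simulated id allocated for this REQ */
    uint32_t key;       /* key carried in the REP private data */
};

struct rep_server {
    uint32_t keys[REP_MAX_CONN];  /* key issued per simulated cm_id */
    int next_cm_id;
    uint32_t next_key;            /* stand-in for a random generator */
};

void rep_server_init(struct rep_server *s, uint32_t seed)
{
    assert(s != NULL);
    s->next_cm_id = 0;
    s->next_key = seed;
}

/* Accept the REQ at once and hand the key back in the REP.
 * Returns 0 on success, -1 if no more connections can be tracked. */
int rep_handle_req(struct rep_server *s, struct rep_msg *rep)
{
    if (s->next_cm_id >= REP_MAX_CONN)
        return -1;

    s->next_key = s->next_key * 1103515245u + 12345u;
    rep->cm_id = s->next_cm_id++;
    rep->key = s->next_key;
    s->keys[rep->cm_id] = rep->key;
    return 0;
}

/* Later traffic on the connection can be validated against the key. */
int rep_server_check(const struct rep_server *s, int cm_id, uint32_t key)
{
    return cm_id >= 0 && cm_id < s->next_cm_id && s->keys[cm_id] == key;
}
```

With this arrangement the key is tied to the id created for the one and
only REQ, so the question of whether a retry maps to the same id does
not arise.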

-- Hal

> I can try and distill this down to a fairly short code example if that
> would make things clearer.
>
> Thanks,
> TG