[openib-general] RFC userspace CMA

Tom Tucker tom at opengridcomputing.com
Wed Oct 26 10:47:32 PDT 2005


On Wed, 2005-10-26 at 08:41 -0700, Sean Hefty wrote:
> Tom Tucker wrote:
> > FYI, I've started writing the iw_cm that sits below the rdma_cm. Here's
> > the general picture I have in mind.
> > 
> >         +---------+
> >         | RDMA CM |
> >         +-+-----+-+
> >           |     |
> >      +----+     +----+
> >      |               |
> > +---------+     +----+----+
> > | IB CM   |     | IW CM   |
> > +----+----+     +----+----+
> >      |               |
> >  ____+_____      ____+_____
> > +---------+|    +---------+|
> > | IB devs ||    | IW devs ||
> > +---------+     +---------+
> 
> This is what I was envisioning as well.
> 
> > I am also migrating the current iw_cm.h file to match the interfaces in
> > the rdma_cm more closely. 
> 
> Note that there are still some changes occurring to the rdma_cm to support 
> userspace.  I'm concerned about how well these changes map to iWarp, since the 
> changes expose the three-way CM handshake used by IB.
> 
> > In general, the IW CM methods look very much like sockets connect,
> > listen, and accept. There is an iw_cm_id like the ib_cm_id that
> > encapsulates the 5-tuple, a callback for IW CM events and a "provider
> > handle" that represents the adapter "connection cookie". The iw_cm_id is
> > passed to connect, accept, etc...
> 
> Something that didn't make sense for the kernel rdma_cm running over IB was 
> adding a backlog parameter to the listen request.  (The IB CM is callback 
> driven, so there's not really a backlog.)  I will probably add this to the 
> userspace API.  Does iWarp need a backlog parameter in the kernel?

It is needed by some adapters. For the AMSO1100, it's passed down to the
adapter to reserve SYN cache entries for incoming connections. 
> 
> > depending on the model. This means that calls like listen with a local
> > port wildcard can't return until the "listen_reply" comes back from the
> > adapter.
> 
> I didn't quite follow this.  Right now, the rdma_cm only tries to support 
> wildcard IP addresses.  Are you wanting to support listening on any port as 
> well?  What is a listen_reply?
> 

Yes, it's funky. Basically, the listen in this context is the combination
of a 'bind' and a 'listen'. If you specify 0 for the port number on bind,
the stack will allocate one for you. 
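
As a rough illustration of the port-0 behaviour, here is a minimal sketch
using ordinary TCP sockets in Python rather than the iw_cm interface. The
helper name bind_and_listen and its defaults are made up for this example;
only the standard socket calls are real.

```python
import socket


def bind_and_listen(host="127.0.0.1", port=0, backlog=8):
    """Bind and listen in one step (hypothetical helper).

    Passing port=0 asks the stack to choose a free port. The chosen
    port is read back with getsockname() and returned to the caller.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        # Report failure to the caller without leaking the socket.
        sock.close()
        raise
    chosen_port = sock.getsockname()[1]
    return sock, chosen_port


if __name__ == "__main__":
    listener, port = bind_and_listen()
    print("listening on port", port)
    listener.close()
```

With plain sockets the chosen port is available as soon as bind returns;
with an offload adapter the equivalent answer only arrives in the reply
from the adapter.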

MPI uses this to allocate a port and then advertises this port to a
central server (the node of rank 0), which tells the other servers how to
contact each other.  This avoids having to allocate a well-known port
for each node in the MPI cluster and allows multiple apps to run
concurrently without allocating additional well-known ports.
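
The rendezvous pattern described above could be sketched roughly as
follows. This is a single-process simulation on loopback with plain TCP
sockets, not MPI or iw_cm code. The names open_listener, build_directory
and connect_to are hypothetical, and the dictionary stands in for whatever
rank 0 would actually distribute.

```python
import socket


def open_listener(host="127.0.0.1"):
    """Each rank listens on a stack-chosen port (port 0)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(4)
    return sock


def build_directory(listeners):
    """What rank 0 would collect: rank -> (host, port) as advertised."""
    return {rank: sock.getsockname() for rank, sock in enumerate(listeners)}


def connect_to(directory, rank):
    """A peer looks up the target rank's address and connects to it."""
    return socket.create_connection(directory[rank])


if __name__ == "__main__":
    listeners = [open_listener() for _ in range(3)]
    directory = build_directory(listeners)

    conn = connect_to(directory, 2)      # e.g. rank 1 dialing rank 2
    peer, _ = listeners[2].accept()
    print("rank 2 reachable at", directory[2])

    for s in (conn, peer, *listeners):
        s.close()
```

No rank needs a preassigned port here; the only address that has to be
known in advance is how to reach rank 0.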

The listen_reply from the adapter returns the port chosen and the status
of the listen request. 

It is somewhat analogous to the insert_listen_.... in the IB CM
framework.


> - Sean


