[Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

Grant Grundler iod00d at hp.com
Fri May 27 22:09:16 PDT 2005


On Fri, May 27, 2005 at 08:04:26PM -0700, Caitlin Bestler wrote:
> > You will never get a guarantee your code will go into 
> > kernel.org regardless of which path you take.
> > I've seen three of the lead developers (a) promise to review 
> > any proposed code changes and (b) indicate they are willing 
> > to merge in support for RNICs.
> 
> A commitment that after merger that the API will be transport
> neutral is essential. That is, IB HCAs do not have to emulate
> iWARP. iWARP HCAs do not have to emulate IB.

"commitment" is a heavy word. One that should only be
used to describe individual actions in an open source forum.

I think you've already got agreement (not commitment)
that this is a reasonable goal - it totally depends on
what the proposed patches look like and if the person doing
the work is trustworthy and competent.  The folks running the
openib source tree are a very accomodating and helpful crowd.

Don't get discourage by initial negative feedback. Once you
"get it" (process wise) things should move pretty quickly. 

Sounds like you've never worked on an opensource project
before. Have you read Documentation/SubmittingPatches
in your favorite source tree?
Do you want other pointers on "how to collaborate on
open source projects"?


> I fail to see why an advance agreement about the goal cannot
> be reached. The iWARP vendors are willing to concede that
> current Gen2 verb modules still have to be supported in a
> merged solution.

That's a good first step. I think you've already gotten
as much advance agreement as anyone is willing to give.
Trust me, it doesn't get much better than what you and
Venkata (IBM) have been promised already.

> What is your definiton of a merged solution?

One that starts with a code change proposal. Start with
any existing file in openib.org tree and suggest changes
that would make it fit your needs better.

The more specific, the better.
One piece at time.

> Is there a commitment
> that "RDMA Services" does not mean "InfiniBand, and the iWARP
> vendors have to pretend to be InfiniBand but we'll add one or
> two fields for problems they *can't* solve."

No. This is open source. We depend on maintainers to use good
judgement and listen to other people. Projects without such
leaders don't survive. Linus has a cute, short essay on
that in Documentation/ManagementStyle. Please read it.
It's quite amusing and insightful at the same time.

> > Look at the relevant openib.org header files (e.g. ib_verbs.h,
> > ib_user_verbs.h) and make a list of changes needed for it to 
> > be useful to an RNIC.  Just pointing at RNIC-PI isn't interesting.
> 
> Well since we are talking about what a final transport neutral
> API would look like, as a guide when writing new code that will
> be compatible with the final merger, then looking at an example
> of an API that attempts to be transport neutral strikes me as
> highly relevant.

Uhm, sort of. Yes, in that the design probably has relevance.
The implementation less so. This discussion is about starting
with openib gen2, right?

If it's not, I'm wasting my time.

> Is there anything in the general structure of RNIC-PI that you
> do not like? The IB-specific portions are different than Gen2,
> and that certainly would not be true of any final merged verbs.
> So far the only comments received are that RNIC-PI was not written
> to be inside the Linux Kernel. That's correct, it was designed to
> be an OS neutral API that could be supported on a wide variety
> of Operating systems. Changes *have* been made, such as always
> having struct/enum/union tags, precisely to guarantee that a 
> Linux version of rnicpi.h can be written that will be interoperable
> with code written to the man pages.

I barely have time to keep up with code changes going on
in openib.org tree and don't understand details of 1/2 of those.
I'm the wrong person to ask even about specifics in openib.org.
My expertise is not in IB transport (it's PCI chipsets and
platform IRQ/DMA support.)

I've worked full time on open source projects for the past 5 years and
have a clue how they do (and don't) work. Some definitely work better
than others. It always revolves around the people leading the project.
openib.org is doing fairly well right now IMHO.

> As for gen2, here's one minor example:

Good - that's a start.
ISTR Someone already pointed out some attr fields need to change.

> struct ib_qp_attr {
> 	enum ib_qp_state	qp_state;		// States
> defined are IB specific

Please don't use C++ style comments.

I talk about how this field (and others) *might* be organized below.


> 	enum ib_qp_state	cur_qp_state;
> 	enum ib_mtu		path_mtu;		// The path MTU
> is not a QP attribute in iWARP,

It's ok if *some* fields are unused I think.
(not that this needs to be here)

> 							// The LLP
> maintains that. The maximum DDP Segment
> 							// size is
> exported to the DDP layer, but not to 
> 							// the consumer.

Please try to keep lines shorter than 80 columns.

> 	enum ib_mig_state	path_mig_state;	// Path migration is
> handled at L2 or L3, it is 
> 							// not visisble
> to an iWARP QP.
> 	u32			qkey;			// There is no
> qkey
> 	u32			rq_psn;		// There is no psn.
> There is an MSN for send/recvs.
> 							// It is not
> normally exported to the consumer.

Is this a kernel or user space header file?

> 	u32			sq_psn;		// There is no psn.
> 	u32			dest_qp_num;	// The destination QP is
> not known.
> 	int			qp_access_flags;  // presumably mostly
> compatible, but its not documented
> 							// in the .h
> file	

Right - you (or someone from rdmaconsortium) needs to track down usage
and verify.

> 	struct ib_qp_cap	cap;			// different
> capacities are controllable between
> 							// iWARP and IB.
> RNIC-PI defined the union of the
> 							// two, and left
> it to the providers to simply replicate
> 							// counts that
> are redundant under their transport.
> 							// Example:
> RDMAC allows different sge capacities
> 							// depending on
> the type of work request (send vs. write).
> 	struct ib_ah_attr	ah_attr;		// entirely IB
> specific information. TCP equivalents are
> 							// not part of
> QP state.
> 	struct ib_ah_attr	alt_ah_attr;	// see ah_attr
> 	u16			pkey_index;		// VLANs are the
> equivalent, and are managed below the
> 							// RDMA layer
> 	u16			alt_pkey_index;	// ditto
> 	u8			en_sqd_async_notify;	// There is no
> sqd state
> 	u8			sq_draining;
> 	u8			max_rd_atomic;		// there are no
> atomics
> 	u8			max_dest_rd_atomic;	
> 	u8			min_rnr_timer;		// Retries and
> pauses are handled below the RDMA layer
> 	u8			port_num;			//
> Egress ports are managed at the IP layer, not as 
> 								// part
> of the RDMA layer. Egress ports can be 
> 								//
> reassigned without changing the QP's state
> 								// at
> the RDMA layer
> 	u8			timeout;			//
> managed by the LLP (TCP or SCTP)
> 	u8			retry_cnt;			//
> managed by the LLP (TCP or SCTP)
> 	u8			rnr_retry;			//
> managed by the LLP (TCP or SCTP)
> 	u8			alt_port_num;		// see port num
> 	u8			alt_timeout;		// see timeout
> };

Sounds like this struct is a good place start seperating
generic (subsystem) and transport specific parts.

There are lots of examples of this in the linux kernel already.
Shouldn't be that big of a deal since "struct ib_qp_attr" is
only referenced in 5 files under drivers/infiniband in my source tree.

I also expect some of those consumers could move out of the generic
code into some IB specific part of the source tree (if possible).

> In a merged transport neutral API virtually the entire current struct
> would become part of an ib union. Just adding comments that this field is "IB
> only"
> doesn't achieve neutrality, it makes iWARP developers wade through
> structs
> to determine which fields can be effected, and makes it all too likely
> that
> an application developer will set a field not realizing that an iWARP
> RNIC
> will not look at that field.

My preferred way to deal with this is to allow the transport to allocate
extra space behind the generic part that is only for use by the
transport specific code. Transport specific header files would have
something like this in them:

struct ib_qp_attr {
	struct rdma_qp_attr;	/* generic stuff */

	...IB specific qp attr...
};

The transport driver (maybe rdma interface driver like mthca for IB)
could allocate it's control structures using this.
Make sense?

> If a transport neutral API based on unioning everything that is not
> transport
> neutral is not acceptable as the goal of eventual merger is not
> acceptable
> then the emphasis should be on kDAPL as the transport neutral API.

I don't think I parsed this sentence correctly.
But I'm really not a fan of unions. Just my $0.02.
They should only be needed when knowledge of different
layers *must* be shared across modules.
I'm hoping they aren't necessary here because;
o it means test and a branch for most accesses in generic code
o we didn't split the generic/transport specific code right

But I'm not the maintainer and other people will know if
they are needed or not.

hth,
grant



More information about the general mailing list