[Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

Caitlin Bestler caitlinb at siliquent.com
Fri May 27 16:39:23 PDT 2005


 

> -----Original Message-----
> From: Bob Woodruff [mailto:robert.j.woodruff at intel.com] 
> Sent: Friday, May 27, 2005 3:57 PM
> To: Caitlin Bestler; Roland Dreier; 'James Lentini'; 
> Arkady.Kanevsky at netapp.com
> Cc: Venkata Jagana; rdma-developers at lists.sourceforge.net; 
> openib-general at openib.org
> Subject: RE: [Rdma-developers] Re: [openib-general] OpenIB 
> and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux
> 
> Caitlin wrote, 
> >Both uDAPL and kDAPL were designed for *application* use.
> >Even kDAPL is more intended for use by a kernel daemon that 
> is loaded 
> >separately from the kernel than for use within the kernel itself.
> 
> kDAPL is intended as a kernel-level API
> for RDMA enabled fabrics. As it was initially written, it 
> does not meet the Linux coding style and that is why it is 
> being totally reworked as we speak to meet that goal. 
> 

One of the primary features of both uDAPL and kDAPL is a unified
event reporting mechanism. This feature was adopted by IT-API as
well.

The intent of this feature, which I am mildly familiar with, was
to simplify the writing of applications -- whether those applications
were written to run in user or kernel space.

This unification is nice, and in my opinion vital, for applications.
Roland had properly pointed out that some kDAPL features go beyond
what is strictly necessary to achieve transport neutrality. I agree.
Event unification is extremely valuable *for applications*, but it
is not *necessary* to provide a transport neutral definition of
RDMA services *within* the kernel.

In an ideal world the in-kernel API would provide transport and
vendor neutral definitions of reliable RDMA services, and as
little else as possible.

That said, if I were developing a kernel daemon today I would
definitely use kDAPL. It is defined today, and its "extras" do
not cost all that much, and largely duplicate things that my
application would have to do anyway.

> >An ideal API for use within the kernel would abstract as much as 
> >possible (without requiring emulation), and then have transport 
> >specific unions or enum values. It would hide no control options, 
> >merely provide common controls for common capabilities.
> 
> So for every new RDMA device type that comes along, you need 
> to add a new enum, and unions for device class specific stuff, etc. 
> Seems rather static and not easily extended. Not to mention 
> that testing nightmare when the thing has to support 20 
> different types of RDMA enabled devices.
> I think code like that could get pretty ugly pretty fast. 
> 
> I'd rather see a registration mechanism like what we already 
> have with DAPL that does not require any code changes to add 
> a new RDMA device/provider.  We have already proven that this 
> works in DAPL as I know of at least 3 providers, IB, Myrinet, 
> and RNIC (Ammasso) that were developed separately and were 
> able to co-exist without any changes (enums and device class 
> unions) in the DAT mid-layer. 
> I assume that this can also be done with kDAPL in the kernel, 
> but I defer to the DAPL experts to answer that one. 
> 

We are discussing a low level interface to be used by the most
privileged code within a system. Controls should not be hidden
from it.

Each transport should be expected to standardize its own
transport specific controls, and should greatly minimize them
and use controls defined in RDMA terms as much as possible.
But there are InfiniBand-specific control options required
to fully control an InfiniBand HCA, and there are iWARP-specific
controls required to fully control an iWARP RNIC. If you don't
like it, go get the protocols respecified.

The DAPL code actually required *extensive* changes to support
iWARP (I did the first round of them myself). And the glue layer
required for an iWARP vendor is extensive precisely because it
must pretend to be an InfiniBand device to SourceForge "common"
code. Implementing iWARP under an API designed for InfiniBand
is a mistake I do not wish to repeat.

The SourceForge DAPL requires extensive parallel data structures in
the DAPL layer and the verbs layer, which has a measurable impact
on system performance. RNIC-PI not only avoids requiring either
IB HCAs to pretend to be iWARP RNICs, or iWARP RNICs to pretend
to be IB HCAs, it also provides features such as kernel mode
completions and 'os_data' markers that eliminate the need for
parallel DAPL/verbs data structures.
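
A minimal sketch of the 'os_data' idea, in C. Every name here
(verbs_qp, upper_ep, verbs_completion, handle_completion) is
hypothetical and is not taken from RNIC-PI or from the SourceForge
DAPL code. The assumption is only that the verbs-level object
carries an opaque pointer that the consumer sets and the provider
hands back untouched, so a completion maps straight to upper-layer
state with no shadow table to keep in sync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verbs-layer queue pair. os_data is an opaque marker
 * owned by the consumer; the provider never interprets it. */
struct verbs_qp {
    unsigned int qp_num;
    void *os_data;
};

/* Hypothetical upper-layer (DAPL-like) endpoint state. */
struct upper_ep {
    const char *name;
    unsigned long completions;
};

/* Hypothetical completion record reported by the provider. */
struct verbs_completion {
    struct verbs_qp *qp;
    int status;
};

/* Recover the upper-layer endpoint directly from the marker instead
 * of searching a parallel qp_num -> endpoint table. */
static struct upper_ep *handle_completion(const struct verbs_completion *wc)
{
    struct upper_ep *ep;

    assert(wc != NULL && wc->qp != NULL);
    ep = wc->qp->os_data;
    if (ep != NULL)
        ep->completions++;
    return ep;
}
```

The consumer sets os_data once when it creates the queue pair;
after that, each completion costs one pointer dereference.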

Once this lowest-possible-RDMA-API is defined it will make it
possible for *most* applications to work with only transport
neutral fields and enums, and virtually all applications to
do so for their non-error paths. But such an API is not a few
minor tweaks away from the Gen2 verbs. Trying to sweep the
differences under the rug in a "low level API" is what 
produces truly ugly code.

But once there is an error, one of the requirements of an
"as low as possible" API is that it not conceal data from
the kernel. That means there will be transport dependencies
hidden in unions. It's inescapable.
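
As a rough illustration of what that could look like, here is a
sketch in C. It is not a proposed definition: the type and function
names (rdma_err_event, rdma_err_is_fatal, rdma_err_raw_code) and the
choice of detail fields are assumptions made for the example. The
neutral err_class is all most consumers would read; the union
carries the raw transport detail for code that needs it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rdma_transport {
    RDMA_TRANSPORT_IB,
    RDMA_TRANSPORT_IWARP
};

/* Transport-neutral error classes (illustrative set only). */
enum rdma_err_class {
    RDMA_ERR_NONE,
    RDMA_ERR_LOCAL_PROTECTION,
    RDMA_ERR_REMOTE_ACCESS,
    RDMA_ERR_CONNECTION_LOST
};

struct rdma_err_event {
    enum rdma_err_class err_class;  /* neutral: enough for most callers */
    enum rdma_transport transport;  /* tag selecting the union member */
    union {
        struct {                    /* assumed InfiniBand detail */
            uint8_t  wc_status;
            uint32_t vendor_err;
        } ib;
        struct {                    /* assumed iWARP Terminate detail */
            uint8_t layer;
            uint8_t etype;
            uint8_t ecode;
        } iwarp;
    } detail;
};

/* Neutral path: never touches the union. */
static int rdma_err_is_fatal(const struct rdma_err_event *ev)
{
    assert(ev != NULL);
    return ev->err_class == RDMA_ERR_CONNECTION_LOST;
}

/* Diagnostic path: checks the tag, then reads the matching member.
 * Returns the raw transport code, or -1 for an unknown transport. */
static int rdma_err_raw_code(const struct rdma_err_event *ev)
{
    assert(ev != NULL);
    switch (ev->transport) {
    case RDMA_TRANSPORT_IB:
        return ev->detail.ib.wc_status;
    case RDMA_TRANSPORT_IWARP:
        return ev->detail.iwarp.ecode;
    }
    return -1;
}
```

Nothing is concealed from the kernel consumer, yet code that only
wants to know whether the connection is gone never has to name a
transport.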





