[Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

Michael Krause krause at cup.hp.com
Fri May 27 09:50:33 PDT 2005


At 08:05 AM 5/27/2005, Bernard Metzler wrote:

>Sukanta,
>
>without touching any TOE issues (this question is about RDMA, right?), after
>transforming a TCP connection into RDMA mode and using an RDMA API,
>the socket file descriptor is no longer to be used for communication.

Whether one uses a Socket FD or not is an application issue.  A QP has a 
handle identifier which may be mapped to a FD or may be used directly by an 
application.  There is no requirement that anything flow through the 
Sockets API - that is a choice for the application to make.

>In fact, on some implementations the stream socket resources will even get 
>released. That is, if the socket was the direct consumer of the TCP 
>stream, then now it is RDMAP/DDP/MPA. RDMA APIs such as IT-API define a 
>specific call to convert a socket-based connection into RDMA mode (e.g., 
>it_socket_convert()). Other consumers may start directly via an RDMA API, 
>never opening a consumer-controlled socket.

Correct.

>So, in RDMA mode, communication will happen via the RDMA API. At this 
>stage, the kernel still has to keep the synchronisation of state 
>information related to the offloaded connection(s) with the host stack 
>completely in its hands (it would have to protect the local port used by 
>the offloaded connection, for example; others are routing, ARP, 
>SNMP...), but it is not involved in the data path.

This is true only when the host and the RNIC share a common IP address.  If 
they are on separate subnets, then there is no need to coordinate, apart 
from some of the reasons I noted in an earlier e-mail.  I think there is 
great value in, and customer need for, an infrastructure that can 
coordinate information and enable customers to use the existing tool 
chains to understand what is going on in a given device or endnode.

>With respect to the kernel based TCP stack, what is needed is not a hack 
>into the stack that scatters and gathers state information of the live 
>TCP connection between kernel and RNIC, but one clean interface to 
>transfer state information out of that stack and to the RNIC.

At a minimum, one needs:

- A host network stack get state call to extract connection state and 
quiesce the port from the host's perspective.
- A host network stack set state call to populate the host structures with 
the associated state and enable the port.
- An RNIC get state call to extract all transport and RDMA context.
- An RNIC set state call to populate all transport and RDMA context.

These calls should be "standardized" so that both sides of the 
infrastructure can use them without having to hack anything.
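
As a rough illustration of the four calls listed above, here is a minimal 
sketch in C.  Every name in it (tcp_state, rdma_ctx, host_conn, rnic_conn, 
host_get_state, host_set_state, rnic_get_state, rnic_set_state, 
offload_to_rnic) is hypothetical, and the state fields are placeholders; 
nothing here is an existing kernel or RNIC interface.  It only shows the 
get-then-set handoff, with "get" quiescing one side and "set" enabling the 
other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport state; a real stack would carry far more. */
struct tcp_state {
    uint32_t local_ip, remote_ip;
    uint16_t local_port, remote_port;
    uint32_t snd_nxt, rcv_nxt;
};

/* Hypothetical RDMA context that exists only on the RNIC side. */
struct rdma_ctx {
    uint32_t qp_handle;
};

struct host_conn {
    struct tcp_state tcp;
    bool port_enabled;   /* true while the host stack services the port */
};

struct rnic_conn {
    struct tcp_state tcp;
    struct rdma_ctx rdma;
    bool active;         /* true while the RNIC owns the connection */
};

/* Host get state: extract connection state and quiesce the port. */
int host_get_state(struct host_conn *h, struct tcp_state *out)
{
    if (!h->port_enabled)
        return -1;
    *out = h->tcp;
    h->port_enabled = false;
    return 0;
}

/* Host set state: populate the host structures and enable the port. */
int host_set_state(struct host_conn *h, const struct tcp_state *in)
{
    if (h->port_enabled)
        return -1;
    h->tcp = *in;
    h->port_enabled = true;
    return 0;
}

/* RNIC get state: extract all transport and RDMA context. */
int rnic_get_state(struct rnic_conn *r, struct tcp_state *tcp,
                   struct rdma_ctx *rdma)
{
    if (!r->active)
        return -1;
    *tcp = r->tcp;
    *rdma = r->rdma;
    r->active = false;
    return 0;
}

/* RNIC set state: populate all transport and RDMA context. */
int rnic_set_state(struct rnic_conn *r, const struct tcp_state *tcp,
                   const struct rdma_ctx *rdma)
{
    if (r->active)
        return -1;
    r->tcp = *tcp;
    r->rdma = *rdma;
    r->active = true;
    return 0;
}

/* Hand a live connection from the host stack to the RNIC. */
int offload_to_rnic(struct host_conn *h, struct rnic_conn *r,
                    const struct rdma_ctx *rdma)
{
    struct tcp_state snap;

    assert(h != NULL && r != NULL && rdma != NULL);
    if (r->active)
        return -1;
    if (host_get_state(h, &snap) != 0)
        return -1;
    return rnic_set_state(r, &snap, rdma);
}
```

Moving a connection back to the host would be the mirror image: 
rnic_get_state() followed by host_set_state().  At no point in the sketch 
do both sides consider themselves the owner of the port.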

>With limited benefit, one could of course also implement native sockets 
>over RDMA, where an in-kernel midlayer on top of kernel RDMA Verbs 
>translates send() and receive() into post_send and post_receive. But usage 
>of 'true RDMA' operations like RDMA READ or WRITE might be limited, and I 
>don't see much value for the user here. One variant of this approach with 
>less limited access to RDMA benefits might be sockets with extended RDMA 
>semantics.

Sockets with RDMA extensions were proposed a couple of years back by some of 
us.  There would be quite a bit of benefit as most developers know how to 
code to Sockets and the RDMA extensions are relatively simple when you 
think about it.  It would also eliminate the need to use SDP by placing the 
RDMA knowledge requirement on the application.  SDP has benefits for 
enabling existing Sockets applications to operate transparently over RDMA 
but the explicit use of RDMA extensions I think would provide greater 
benefit in the end.  While it is important to provide legacy investment 
protection, real innovation will not occur until people start developing 
technology that enables a smarter customer.  Simplification and performance 
also come with enabling a smarter customer.
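
To make the difference concrete, here is a small sketch in C of a 
Sockets-style send translated into a posted work request, next to an 
explicit RDMA Write extension.  All of it is hypothetical: mock_qp, 
work_request, post_send, rsock_send and rsock_rdma_write are made-up names 
that stand in for a real verbs provider and for whatever form the 
extensions would take, and the queue is just an in-memory array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SQ_DEPTH 4   /* arbitrary send queue depth for the mock */

enum wr_opcode { WR_SEND, WR_RDMA_WRITE };

/* Hypothetical work request; real verbs carry scatter/gather lists. */
struct work_request {
    enum wr_opcode opcode;
    const void *local_buf;
    size_t length;
    uint64_t remote_addr;   /* meaningful only for WR_RDMA_WRITE */
    uint32_t remote_key;    /* meaningful only for WR_RDMA_WRITE */
};

/* Mock queue pair: records posted requests instead of driving hardware. */
struct mock_qp {
    struct work_request sq[SQ_DEPTH];
    unsigned int posted;
};

/* Mock post_send: fails when the send queue is full. */
int post_send(struct mock_qp *qp, const struct work_request *wr)
{
    if (qp->posted == SQ_DEPTH)
        return -1;
    qp->sq[qp->posted++] = *wr;
    return 0;
}

/* Transparent path: a plain send() becomes one Send work request. */
long rsock_send(struct mock_qp *qp, const void *buf, size_t len)
{
    struct work_request wr = {
        .opcode = WR_SEND, .local_buf = buf, .length = len
    };

    assert(qp != NULL);
    return post_send(qp, &wr) == 0 ? (long)len : -1;
}

/* Extension path: the application names the remote buffer itself. */
long rsock_rdma_write(struct mock_qp *qp, const void *buf, size_t len,
                      uint64_t remote_addr, uint32_t remote_key)
{
    struct work_request wr = {
        .opcode = WR_RDMA_WRITE, .local_buf = buf, .length = len,
        .remote_addr = remote_addr, .remote_key = remote_key
    };

    assert(qp != NULL);
    return post_send(qp, &wr) == 0 ? (long)len : -1;
}
```

The first call is all a transparent layer can offer an unmodified 
application.  The second only works if the application knows about remote 
buffers and keys, which is the RDMA knowledge requirement mentioned above.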

Mike


>Bernard.
>
>rdma-developers-admin at lists.sourceforge.net wrote on 27.05.2005 15:40:43:
>
> > Venkata,
> >    How will that work? If the RNIC offloads RDMA and
> > TCP completely from the Operating System and does not
> > share any state information then the application
> > running on the host will never be in the position to
> > utilize the socket interface to use the communication
> > logic to send and receive data between the remote node
> > and itself. Some information needs to be shared. How
> > much of it and what exactly needs to be shared is the
> > question.
> >
> > Thanks
> > SG
> >
> > --- Venkata Jagana <jagana at us.ibm.com> wrote:
> > >
> > > rdma-developers-admin at lists.sourceforge.net wrote on 05/25/2005 09:47:00 PM:
> > >
> > > > Venkata,
> > > > Interesting coincidence: I was talking with someone (at HP) today
> > > > who knows substantially more than I do about RNICs.
> > > > They indicated RNICs need to manage TCP state on the card from
> > > > userspace. I suspect that's only possible through a private
> > > > interface (e.g. ioctl() or /proc) or the nonexistent (in kernel.org)
> > > > TOE implementation. Is this correct?
> > > >
> > >
> > > Not correct.
> > >
> > > Since RNICs are offloaded adapters with RDMA protocols layered on
> > > top of a TCP stack, they do maintain the TCP state internally but
> > > do not expose it to the host. RNICs expose only the RNIC Verbs
> > > interface to the host, not a TOE interface.
> > >
> > > Thanks
> > > Venkat
> > >
> > > >
> > > > hth,
> > > > grant
> > > >
> > > >
> > > >
> > >
> > > > _______________________________________________
> > > > Rdma-developers mailing list
> > > > Rdma-developers at lists.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/rdma-developers
> >
>_______________________________________________
>openib-general mailing list
>openib-general at openib.org
>http://openib.org/mailman/listinfo/openib-general
>
>To unsubscribe, please visit 
>http://openib.org/mailman/listinfo/openib-general