[openib-general] iWARP IP Stack Dependency (Was Convergence on common RDMA APIs and ULPs for Linux)
Caitlin Bestler
caitlinb at siliquent.com
Fri May 27 09:38:21 PDT 2005
> -----Original Message-----
> From: rdma-developers-admin at lists.sourceforge.net
> [mailto:rdma-developers-admin at lists.sourceforge.net] On
> Behalf Of Sukanta ganguly
> Sent: Friday, May 27, 2005 6:41 AM
> To: Venkata Jagana; Grant Grundler
> Cc: openib-general at openib.org; rdma-developers at lists.sourceforge.net
> Subject: Re: [Rdma-developers] Re: [openib-general] OpenIB
> and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux
>
> Venkata,
> How will that work? If the RNIC offloads RDMA and TCP
> completely from the Operating System and does not share any
> state information then the application running on the host
> will never be in the position to utilize the socket interface
> to use the communication logic to send and receive data
> between the remote node and itself. Some information needs to
> be shared. How much of it and what exactly needs to be shared
> is the question.
>
> Thanks
> SG
iWARP/TCP requires that a QP be supplied an existing TCP connection
when it is modified to the RTR or RTS state.
At that time the TCP connection is switched to MPA mode, so that
its payload is delivered to and generated by the QP rather than
by its current IP stack.
The degree to which the connection is "totally" handed over varies
by implementation. But in no case can the consumer rely upon
continued use of the TCP connection's socket to send or receive
data. In fact, there is generally nothing that can be done with
the old socket after that.
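
To make the handover concrete, here is a minimal sketch in C. It is a
toy model, not any real verbs API: every name in it (qp_modify,
tcp_conn, socket_usable, the state and owner enums) is hypothetical
and chosen only for illustration. It shows a QP taking over an
existing connection on its first transition to RTR or RTS, after
which the socket path is no longer usable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: who generates and consumes the TCP payload. */
enum conn_owner { OWNER_HOST_STACK, OWNER_RDMA_QP };

struct tcp_conn {
    enum conn_owner owner;
    bool mpa_mode;            /* true once the stream is processed in MPA mode */
};

enum qp_state { QP_IDLE, QP_RTR, QP_RTS };

struct qp {
    enum qp_state state;
    struct tcp_conn *conn;    /* NULL until a connection has been supplied */
};

/*
 * Move the QP to RTR or RTS. A QP that has no connection yet must be
 * given an existing one that is still owned by the host stack; a QP
 * that already holds a connection ignores the conn argument.
 * Returns 0 on success, -1 on failure (the QP is left unchanged).
 */
int qp_modify(struct qp *qp, enum qp_state target, struct tcp_conn *conn)
{
    assert(qp != NULL);
    if (target != QP_RTR && target != QP_RTS)
        return -1;
    if (qp->conn == NULL) {
        if (conn == NULL || conn->owner != OWNER_HOST_STACK)
            return -1;
        conn->owner = OWNER_RDMA_QP;   /* payload now belongs to the QP */
        conn->mpa_mode = true;
        qp->conn = conn;
    }
    qp->state = target;
    return 0;
}

/* The socket can carry data only while the host stack owns the payload. */
bool socket_usable(const struct tcp_conn *conn)
{
    return conn != NULL && conn->owner == OWNER_HOST_STACK;
}
```

In this model the consumer would check socket_usable() before any
socket send or receive; once qp_modify() has succeeded it returns
false for good.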
One thing that is *not* changed is the routing of the TCP connection.
Therefore an RDMA device can only "convert" a TCP connection that
is already routed through it.
That does not mean that the RDMA device (conventionally an RNIC,
almost entirely equivalent to an IB HCA) has to have a companion TOE
service. It may have a companion NIC service instead, or in addition,
that acts as a plain Ethernet device for TCP connections that
have not been converted for RDMA processing.
So the existing TCP connection must be transferred from its
current IP stack to what is essentially an RDMA stack.
RNIC-PI has specified that the RDMA stack itself MUST NOT
establish connections. That must be done by the host stack,
or by a TOE stack that is already integrated with and accepted
by the host stack. The RDMA device MUST NOT circumvent the
host stack's ability to control acceptance of TCP connections.
That is one area where commonality between IB and iWARP stacks
would be valuable. The Host IP stack wouldn't want the IPoIB
stack creating unapproved connections behind its back either.
In nuts-and-bolts terms, this means that modify-qp-to-rts
accepts a socket handle. That socket handle must be for a
TCP connection that is already established and is already
routed through a physical port that the RNIC controls. It
might be a handle for a TCP connection fully under the control
of the host stack, or it may be a handle for a TCP connection
that is managed by a TOE. But the domain of the handle is the
host OS IP stack. You cannot reference a TCP connection that
is unknown to the IP stack; what is passed in is a handle,
not a complete TCP Connection Control Block.
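
As a rough illustration of the first two requirements, the sketch
below uses only standard POSIX socket calls to check that a
descriptor is a stream socket known to the host IP stack and that it
is already connected. The function name check_socket_handle is
hypothetical. The check is an approximation: it does not prove the
protocol is TCP beyond "stream socket over IPv4/IPv6", and it makes
no attempt at the routing test (whether the connection goes through
a port the RNIC controls), which a real provider would have to do by
its own means.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical pre-check for a socket handle passed to a
 * modify-qp-to-rts style call.
 * Returns 0 if fd is a connected IPv4/IPv6 stream socket, -1 otherwise.
 */
int check_socket_handle(int fd)
{
    int type = 0;
    socklen_t len = sizeof(type);
    struct sockaddr_storage peer;
    socklen_t peer_len = sizeof(peer);

    /* Fails with EBADF or ENOTSOCK if fd is not a socket the stack knows. */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;
    assert(len == sizeof(type));       /* SO_TYPE is reported as an int */
    if (type != SOCK_STREAM)
        return -1;

    /* Fails with ENOTCONN if the connection is not established. */
    if (getpeername(fd, (struct sockaddr *)&peer, &peer_len) != 0)
        return -1;
    if (peer.ss_family != AF_INET && peer.ss_family != AF_INET6)
        return -1;

    /* Not checked here: that the route uses a port the RNIC controls. */
    return 0;
}
```

Note that the caller hands over only the descriptor; all connection
state stays inside the host stack (or the TOE behind it) until the
conversion itself.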
This is one of many examples where defining a low-level API
that is truly transport-neutral is challenging. There are many
places where things are truly common (Protection Domain IDs,
the fact that a QP references send and receive CQs, setting
send queue and receive queue sizes, SRQs, CQs, etc.). But then
there is a substantial slice where QP States, QP Attributes,
RNIC Attributes, etc. do not align, and you simply have to
have a union to split out the transport-specific fields
(and label certain enum values as being transport-specific).
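
A minimal sketch of what such a split could look like in C follows.
The layout and all names are assumptions made for illustration, not
taken from any existing header: the common fields sit at the top, a
tag says which transport applies, and a union carries the fields that
do not align (here a destination LID and QP number for IB, and the
socket handle for iWARP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport tag; selects the valid union member. */
enum transport { TRANSPORT_IB, TRANSPORT_IWARP };

struct qp_attr {
    /* Fields that are common to both transports. */
    uint32_t pd_id;        /* Protection Domain ID */
    uint32_t send_cq;      /* CQ referenced for send completions */
    uint32_t recv_cq;      /* CQ referenced for receive completions */
    uint32_t sq_depth;     /* send queue size */
    uint32_t rq_depth;     /* receive queue size */

    enum transport transport;
    union {
        struct {
            uint16_t dlid;       /* destination LID (illustrative) */
            uint32_t dest_qpn;   /* destination QP number (illustrative) */
        } ib;
        struct {
            int socket_handle;   /* established TCP connection to convert */
        } iwarp;
    } u;
};

/* Return the iWARP socket handle, or -1 if the attributes are not iWARP. */
int qp_attr_socket_handle(const struct qp_attr *attr)
{
    assert(attr != NULL);
    if (attr->transport != TRANSPORT_IWARP)
        return -1;
    return attr->u.iwarp.socket_handle;
}
```

Code that touches only the common fields never needs to look at the
tag; only the transport-specific paths do.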
>
> --- Venkata Jagana <jagana at us.ibm.com> wrote:
> > rdma-developers-admin at lists.sourceforge.net wrote on
> > 05/25/2005 09:47:00 PM:
> >
> > > Venkata,
> > > Interesting coincidence: I was talking with someone (at HP) today
> > > who knows substantially more than I do about RNICs.
> > > They indicated RNICs need to manage TCP state on the card from
> > > userspace. I suspect that's only possible through a private
> > > interface (e.g. ioctl() or /proc) or the nonexistent (in kernel.org)
> > > TOE implementation. Is this correct?
> > >
> >
> > Not correct.
> >
> > Since RNICs are offloaded adapters with RDMA protocols layered on top
> > of the TCP stack, they do maintain the TCP state internally but do
> > not expose it to the host. RNICs expose only the RNIC Verbs interface
> > to the host, but not a TOE interface.
> >
> > Thanks
> > Venkat
> >
> > >
> > > hth,
> > > grant
> > >
> > > _______________________________________________
> > > Rdma-developers mailing list
> > > Rdma-developers at lists.sourceforge.net
> > > https://lists.sourceforge.net/lists/listinfo/rdma-developers