<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.2627" name=GENERATOR></HEAD>
<BODY>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
rdma-developers-admin@lists.sourceforge.net
[mailto:rdma-developers-admin@lists.sourceforge.net] <B>On Behalf Of
</B>Michael Krause<BR><B>Sent:</B> Friday, May 27, 2005 9:51 AM<BR><B>To:</B>
Bernard Metzler; Sukanta ganguly<BR><B>Cc:</B>
rdma-developers@lists.sourceforge.net; openib-general@openib.org;
rdma-developers-admin@lists.sourceforge.net<BR><B>Subject:</B> Re:
[Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on
common RDMA APIs and ULPs for Linux<BR></FONT><BR></DIV>
<DIV></DIV><FONT size=3>At 08:05 AM 5/27/2005, Bernard Metzler
wrote:<BR><BR></FONT>
<BLOCKQUOTE class=cite cite="" type="cite"><FONT size=2>Sukanta,</FONT><FONT
size=3> <BR><BR></FONT><FONT size=2>without touching any TOE issues (this
question is about RDMA, right?), after</FONT><FONT size=3> <BR></FONT><FONT
size=2>transforming a TCP connection into RDMA mode and using an RDMA
API,</FONT><FONT size=3> <BR></FONT><FONT size=2>the socket file descriptor
is no longer to be used for communication.</FONT><FONT size=3>
</FONT></BLOCKQUOTE>
<DIV><BR>Whether one uses a Socket FD or not is an application issue. A
QP has a handle identifier which may be mapped to a FD or may be used directly
by an application. There is no requirement that anything flow through
the Sockets API - that is a choice for the application to
make. <SPAN class=785322017-27052005><FONT face=Arial color=#0000ff
size=2> </FONT></SPAN></DIV>
<DIV><SPAN class=785322017-27052005></SPAN> </DIV></BLOCKQUOTE>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>True.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005></SPAN></FONT></FONT></FONT> </DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>The current RNIC-PI spec is deliberately vague
about what an "LLP Handle" is, other than that it must include a Socket
FD.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>It may be possible for it to include some sort of
"connection request handle" for a pending connection on the offload
device,</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>but you would have to ensure that this "connection
request handle" was reviewed by the Host IP stack so that it
could</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>forbid connections that contradict the host's IP
firewall policies. </SPAN><BR><BR></DIV></FONT></FONT></FONT>
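<DIV dir=ltr><FONT face=Arial size=2>As a rough illustration of the review step described above, the sketch below has the host stack check a pending connection request against its deny rules before the offload device may accept it. Every name in it (ConnectionRequest, DenyRule, host_stack_permits, accept_on_rnic) is hypothetical and is not taken from the RNIC-PI spec; a real firewall policy would of course be far richer than a list of source-network/port pairs.</FONT></DIV>
<PRE>
```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# All names here are hypothetical; none come from the RNIC-PI spec.


@dataclass(frozen=True)
class ConnectionRequest:
    """Stand-in for a pending connection held on the offload device."""
    remote_ip: str
    local_port: int


@dataclass(frozen=True)
class DenyRule:
    """One host firewall rule: refuse this source network on this port."""
    source_net: str
    local_port: int


def host_stack_permits(request, rules):
    """Return True unless a host deny rule matches the pending request."""
    for rule in rules:
        same_port = rule.local_port == request.local_port
        in_net = ip_address(request.remote_ip) in ip_network(rule.source_net)
        if same_port and in_net:
            return False
    return True


def accept_on_rnic(request, rules):
    """The device proceeds only after the host stack has reviewed the request."""
    return "accepted" if host_stack_permits(request, rules) else "rejected"
```
</PRE>
<DIV dir=ltr><FONT face=Arial size=2>The point of the sketch is only the ordering: the decision belongs to the host stack, and the device acts on the result.</FONT></DIV>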
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<BLOCKQUOTE class=cite cite="" type="cite"><FONT size=2>In fact, on some
implementations the stream socket resources will</FONT><FONT size=3>
</FONT><FONT size=2>even get released. That is, if the socket was the direct
consumer</FONT><FONT size=3> </FONT><FONT size=2>of the TCP stream, then now
it is RDMAP/DDP/MPA.</FONT><FONT size=3> </FONT><FONT size=2>RDMA APIs such
as IT-API define a specific call to convert a socket-based</FONT><FONT
size=3> </FONT><FONT size=2>connection into RDMA mode (e.g.,
it_socket_convert()).</FONT><FONT size=3> </FONT><FONT size=2>Other
consumers may directly start via an RDMA API, never</FONT><FONT size=3>
<BR></FONT><FONT size=2>opening a consumer controlled socket.</FONT><FONT
size=3> </FONT></BLOCKQUOTE>
<DIV><BR>Correct.<SPAN class=785322017-27052005><FONT face=Arial color=#0000ff
size=2> </FONT></SPAN></DIV>
<DIV><SPAN class=785322017-27052005></SPAN> </DIV></BLOCKQUOTE>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>There may never be a *consumer* controlled socket,
but there will be a *host* controlled socket
used</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>by the Connection Manager and subject to normal host
firewall controls.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005></SPAN></FONT></FONT></FONT> </DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>It is in fact highly desirable from both a security and
performance basis for the streaming mode TCP</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>connection to never be exposed to the
application. Only specialized applications, such as
iSER,</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>require this
capability. </SPAN><BR><BR></DIV></FONT></FONT></FONT>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<BLOCKQUOTE class=cite cite="" type="cite"><FONT size=2>So, in RDMA mode,
communication will happen via the RDMA API. At this</FONT><FONT size=3>
</FONT><FONT size=2>stage, the kernel still has to keep completely in its
hands the</FONT><FONT size=3> </FONT><FONT size=2>synchronisation of state
information related to that offloaded</FONT><FONT size=3> </FONT><FONT
size=2>connection(s) with the host stack (it would have to protect
the</FONT><FONT size=3> </FONT><FONT size=2>local port used by the offloaded
connection for example, others</FONT><FONT size=3> </FONT><FONT size=2>are
routing, ARP, SNMP...), but it is not involved in the data path.</FONT><FONT
size=3> </FONT></BLOCKQUOTE>
<DIV><BR>This is true only when the host and the RNIC share a common IP
address.  If they are on separate subnets, then there is no need to
coordinate, apart from some of the reasons I noted in an earlier e-mail.
I think there is great value and customer need to have an infrastructure that
can coordinate information and enable customers to use the existing tool
chains to understand what is going on in a given device or endnode.<BR><SPAN
class=785322017-27052005><FONT face=Arial color=#0000ff
size=2> </FONT></SPAN></DIV></BLOCKQUOTE>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>Good point. The key is that the RNIC needs to
co-ordinate all of these with the IP stack that owns the
IP address.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>If it is that owner, it must co-ordinate with itself.
(Actually that's a clause that needs to be made more explicit in
the</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>RNIC-PI spec, which was focused on the shared IP
scenario). But when the RNIC is "sub-leasing" the IP it
needs</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>to co-ordinate its action with the owner of the IP
address. </SPAN><BR></DIV></FONT></FONT></FONT>
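<DIV dir=ltr><FONT face=Arial size=2>To make the "sub-leasing" case concrete, here is a minimal sketch, under assumed names, of an RNIC asking the owner of the IP address for a local port before using it for an offloaded connection. IpOwner, SubLeasingRnic and their methods are hypothetical and do not correspond to any RNIC-PI call; the sketch covers local ports only, not routing, ARP or SNMP.</FONT></DIV>
<PRE>
```python
# Hypothetical names throughout; this is not an RNIC-PI interface.


class IpOwner:
    """The stack that owns the IP address and arbitrates its local ports."""

    def __init__(self, address):
        self.address = address
        self._ports = {}  # local port -> name of the current holder

    def reserve_port(self, port, holder):
        """Grant the port if it is free; refuse if anyone already holds it."""
        if port in self._ports:
            return False
        self._ports[port] = holder
        return True

    def release_port(self, port, holder):
        """Free the port, but only on behalf of the party that holds it."""
        if self._ports.get(port) == holder:
            del self._ports[port]


class SubLeasingRnic:
    """An RNIC sub-leasing the address: it asks the owner before using a port."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner

    def open_offloaded(self, port):
        """Use the port for an offloaded connection only if the owner grants it."""
        return self.owner.reserve_port(port, self.name)

    def close_offloaded(self, port):
        """Hand the port back to the owner when the connection ends."""
        self.owner.release_port(port, self.name)
```
</PRE>
<DIV dir=ltr><FONT face=Arial size=2>When the RNIC is itself the owner of the address, the same table would simply live on the RNIC.</FONT></DIV>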
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<BLOCKQUOTE class=cite cite="" type="cite"><FONT size=2>With respect to the
kernel based TCP stack, what is not needed is</FONT><FONT size=3>
</FONT><FONT size=2>a hack into the stack and scatter/gather state
information of the live</FONT><FONT size=3> </FONT><FONT size=2>TCP
connection between kernel and RNIC, but to find one clean
interface</FONT><FONT size=3> </FONT><FONT size=2>to transfer state
information out of that stack and to the RNIC.</FONT><FONT size=3>
</FONT></BLOCKQUOTE>
<DIV><BR>At a minimum, one needs:<BR><BR>- A host network stack get state call
to extract connection state and quiesce the port from the host's
perspective.<BR>- A host network stack set state call to populate the host
structures with the associated state and enable the port<BR>- An RNIC get state
call to extract all transport and RDMA context<BR>- An RNIC set state call to
populate all transport and RDMA context<BR><SPAN
class=785322017-27052005><FONT face=Arial color=#0000ff
size=2> </FONT></SPAN></DIV></BLOCKQUOTE>
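<DIV dir=ltr><FONT face=Arial size=2>The four calls listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed interface: TcpState, HostStack, OffloadRnic and the field names are all hypothetical, and a real transfer would have to carry far more transport and RDMA state than the few fields shown.</FONT></DIV>
<PRE>
```python
from dataclasses import dataclass

# Hypothetical names and fields throughout; real state would be much larger.


@dataclass
class TcpState:
    local_port: int
    remote_port: int
    snd_nxt: int  # next sequence number to send
    rcv_nxt: int  # next sequence number expected


class HostStack:
    def __init__(self):
        self.connections = {}  # local port -> TcpState
        self.quiesced = set()  # ports the host must not use itself

    def get_state(self, port):
        """Extract the connection state and quiesce the port on the host side."""
        state = self.connections.pop(port)
        self.quiesced.add(port)
        return state

    def set_state(self, state):
        """Populate host structures with the state and re-enable the port."""
        self.connections[state.local_port] = state
        self.quiesced.discard(state.local_port)


class OffloadRnic:
    def __init__(self):
        self.contexts = {}  # local port -> (TcpState, RDMA context dict)

    def set_state(self, state, rdma_context):
        """Populate the transport and RDMA context on the device."""
        self.contexts[state.local_port] = (state, rdma_context)

    def get_state(self, port):
        """Extract the transport and RDMA context from the device."""
        return self.contexts.pop(port)
```
</PRE>
<DIV dir=ltr><FONT face=Arial size=2>Moving a connection to the device is then a host get_state followed by an RNIC set_state; moving it back is the reverse pair.</FONT></DIV>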
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>An RNIC "get state" is nice, but not
essential. It enables transfer of an RDMA state to another RNIC
for</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>failover purposes. But in terms of minimal control, the
ability to kill an RDMA controlled connection
is</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>sufficient.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005></SPAN></FONT></FONT></FONT> </DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>Additionally, any transfer of RDMA Context will be
limited to transfer between like models for
some</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005>time, and even then it is not likely to be a
standard feature.</SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT face=Arial><FONT size=2><FONT color=#0000ff><SPAN
class=785322017-27052005> </SPAN><BR><BR></FONT></FONT></FONT><FONT size=3>
</FONT></DIV></BODY></HTML>