[Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

Caitlin Bestler caitlinb at siliquent.com
Fri May 27 10:30:13 PDT 2005


 


________________________________

	From: rdma-developers-admin at lists.sourceforge.net
[mailto:rdma-developers-admin at lists.sourceforge.net] On Behalf Of
Michael Krause
	Sent: Friday, May 27, 2005 9:51 AM
	To: Bernard Metzler; Sukanta ganguly
	Cc: rdma-developers at lists.sourceforge.net;
openib-general at openib.org; rdma-developers-admin at lists.sourceforge.net
	Subject: Re: [Rdma-developers] Re: [openib-general] OpenIB and
OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux
	
	
	At 08:05 AM 5/27/2005, Bernard Metzler wrote:
	
	

		Sukanta, 
		
		without touching any TOE issues (this question is about
RDMA, right?), after 
		transforming a TCP connection into RDMA mode and using
an RDMA API, 
		the socket file descriptor is no longer to be used for
communication. 


	Whether one uses a Socket FD or not is an application issue.  A
QP has a handle identifier which may be mapped to an FD or may be used
directly by an application.  There is no requirement that anything flow
through the Sockets API - that is a choice for the application to make.

	 

True.
 
The current RNIC-PI spec is deliberately vague about what an "LLP
Handle" is, other than that it must include a Socket FD.
It may be possible for it to include some sort of "connection request
handle" for a pending connection on the offload device,
but you would have to ensure that this "connection requestion handle"
was reviewed by the Host IP stack so that it could
forbid connections that contradict the host's IP firewall policies. 



		In fact, on some implementations the stream socket
resources will even get released. That is, if the socket was the direct
consumer of the TCP stream, then now it is RDMAP/DDP/MPA. RDMA APIs such
as IT-API define a specific call to convert a socket-based connection
into RDMA mode (e.g., it_socket_convert()). Other consumers may directly
start via an RDMA API, never 
		opening a consumer controlled socket. 


	Correct. 
	 

There may never be a *consumer* controlled socket, but there will be a
*host* controlled socket used
by the Connection Manager and subject to normal host firewall controls.
 
It is in fact highly desirable, from both a security and a performance
standpoint, for the streaming mode TCP
connection to never be exposed to the application. Only specialized
applications, such as iSER,
require this capability. 



		So, in RDMA mode, communication will happen via the RDMA
API. At this stage, the kernel still has to keep completely in its
hands the synchronisation of state information related to the offloaded
connection(s) with the host stack (it would have to protect the local
port used by the offloaded connection, for example; others are routing,
ARP, SNMP...), but it is not involved in the data path. 


	This is true only when the host and the RNIC share a common IP
address.  If they are on separate subnets, then there is no need to
coordinate, aside from some of the reasons I noted in an earlier e-mail.  I
think there is great value and customer need to have an infrastructure
that can coordinate information and enable customers to use the existing
tool chains to understand what is going on in a given device or endnode.
	 

Good point. The key is that the RNIC needs to co-ordinate all of
these with the IP stack that owns the IP address.
If it is that owner, it must co-ordinate with itself. (Actually, that's a
clause that needs to be made more explicit in the
RNIC-PI spec, which was focused on the shared-IP scenario.) But when the
RNIC is "sub-leasing" the IP, it needs
to co-ordinate its actions with the owner of the IP address. 


		With respect to the kernel based TCP stack, what is not
needed is a hack into the stack that scatters state information for
the live TCP connection between kernel and RNIC; what is needed is one
clean interface to transfer state information out of that stack and to
the RNIC. 


	At a minimum, one needs:
	
	- A host network stack get state call to extract connection
state and quiesce the port from the host's perspective.
	- A host network stack set state call to populate the host
structures with the associated state and enable the port
	- An RNIC get state call to extract all transport and RDMA
context
	- An RNIC set state call to populate all transport and RDMA
context
	 

An RNIC "get state" is nice, but not essential. It enables transfer of
RDMA state to another RNIC for
failover purposes. But in terms of minimal control, the ability to kill
an RDMA-controlled connection is
sufficient.
 
Additionally, any transfer of RDMA Context will be limited to transfer
between like models for some
time, and even then it is not likely to be a standard feature.
 



