[ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
Steve Wise
swise at opengridcomputing.com
Thu Aug 2 07:21:25 PDT 2007
Sean Hefty wrote:
>> Consider NFS and NFS-RDMA. The NFS gurus struggled with this very
>> issue and concluded that the RDMA service needs to be on a separate
>> port. Thus they are proposing a new netid/port number for doing RDMA
>> mounts vs TCP/UDP mounts. IMO that is the correct way to go: RDMA
>> services are different than TCP services. They use a different
>> protocol on top of TCP and thus shouldn't be handled on the same TCP
>> port. So, applications that want to service Sockets and RDMA services
>> concurrently would do so by listening on different ports...
>
> This is a good point, and a different view from what I've been taking. I
> was looking at it more like trying to provide the same service over UDP
> and TCP, where you use the same port number. I just can't come up with
> any solution that works for iWarp, and sharing the port space seems like
> the only way to fix things.
>
>> The iWARP protocols don't include a UDP based service, so it is not
>> needed. But if you're calling it a UDP port space, maybe it should be
>> the host's port space?
>
> I think it should match what's done for TCP. IMO, there should be a
> connectionless RDMA service, along with multicast, over
> UDP/IP/Ethernet. :)
>
I think the winner would really be a reliable connectionless RDMA
service with mcast.
>> Yes. The only exported interfaces into the host port allocation stuff
>> require a socket struct. I didn't want to try and tackle exporting
>> the port allocation services at a lower level. Even at the bottom
>> level, I think it still assumes a socket struct...
>
> I looked at this too at one point, and gave up as well. I don't know
> what other assumptions are made in the stack as a result of this. For
> example, if an app binds to an IP and port, and the IP address is
> removed and re-added, is the port still valid/reserved?
>
I just tried this and I believe the application is still listening/bound
even though the address is no longer valid for the host:
[root@vic10 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:29
[root@vic10 ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
set_up_server could not establish a listen endpoint for port 2222 with family AF_INET
[root@vic10 ~]# ifconfig eth1 192.168.69.135 up
[root@vic10 ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
Starting netserver at hostname 192.168.69.135 port 2222 and family AF_INET
[root@vic10 ~]# netstat -an|grep 2222
tcp        0      0 192.168.69.135:2222     0.0.0.0:*               LISTEN
[root@vic10 ~]# ifconfig eth1 0.0.0.0
[root@vic10 ~]# netstat -an|grep 2222
tcp        0      0 192.168.69.135:2222     0.0.0.0:*               LISTEN
[root@vic10 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
          inet6 addr: fe80::2e0:81ff:fe33:67d1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:176 (176.0 b)
          Interrupt:29
[root@vic10 ~]#
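
The first netserver failure in the transcript is consistent with bind() rejecting an address that is not configured on the host. A small user-space sketch of that check in Python, assuming a Linux host with net.ipv4.ip_nonlocal_bind at its default of 0. The helper name try_bind is hypothetical, and 192.0.2.1 is a documentation-only address (RFC 5737) standing in for the unconfigured 192.168.69.135:

```python
import errno
import socket


def try_bind(addr, port=0):
    """Try to bind a TCP socket to (addr, port).

    Returns None if bind() succeeds, otherwise the symbolic errno name
    (for example "EADDRNOTAVAIL"). Port 0 lets the kernel pick a port.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((addr, port))
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        s.close()


if __name__ == "__main__":
    # Loopback is normally configured, so this bind should succeed.
    print("127.0.0.1:", try_bind("127.0.0.1"))
    # Assumption: 192.0.2.1 is not assigned to any local interface.
    print("192.0.2.1:", try_bind("192.0.2.1"))
```

On a default Linux host the second call would be expected to report EADDRNOTAVAIL. The second half of the experiment (removing the address after a successful bind) needs root and a real interface, so it is not reproduced here.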
> For iWarp, is using a struct socket essentially any different than
> transitioning an existing socket to RDMA mode?
In the RFC patch I posted, the socket is _just_ to allow binding to a
port/addr. It's not used for anything else. From the native stack's
perspective, it's a TCP socket in the CLOSED state (but bound) I guess.
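
The "bound but CLOSED" state can be observed from user space as well. A minimal Python sketch, assuming a Linux host and the loopback address: a socket that has been bound but never put into listen() still reserves its port in the host TCP port space. The helper names reserve_tcp_port and port_is_taken are hypothetical and are not part of the RFC patch, which does the equivalent in the kernel with a struct socket:

```python
import errno
import socket


def reserve_tcp_port(addr="127.0.0.1", port=0):
    """Bind a TCP socket without calling listen() or connect().

    The socket stays in the CLOSED state but holds the port.
    Returns (socket, port); port 0 asks the kernel to choose one.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((addr, port))
    return s, s.getsockname()[1]


def port_is_taken(addr, port):
    """Return True if a fresh bind to (addr, port) fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((addr, port))
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return True
        raise
    finally:
        probe.close()
    return False


if __name__ == "__main__":
    holder, port = reserve_tcp_port()
    print("reserved port", port, "taken:", port_is_taken("127.0.0.1", port))
    holder.close()  # releasing the socket releases the port
    print("after close, taken:", port_is_taken("127.0.0.1", port))
```

The probe deliberately does not set SO_REUSEADDR, so it sees the same conflict that any ordinary TCP application binding to that port would see.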
> You're just requiring it
> to be in a specific state. Are there problems around doing this? How
> much harder (technically, as opposed to politically) would it be to take
> this change a step farther and offload an active connection?
By active, do you mean in the ESTABLISHED state?
>
>> I left it all in to show the minimal changes needed to implement the
>> functionality. To keep the patch simple for initial consumption. But
>> yes, the rdma-cm really doesn't need to track the port stuff for TCP
>> since the host stack does.
>
> Okay - for final patches, I think we want to remove the rdma_cm specific
> port spaces, along with changing the API to clarify that it uses the
> same port space as TCP/UDP.
What do you mean by changing the API? Adding a new port space enum?
>
>> I haven't looked in detail at the SDP code, but I would think it
>> should want the TCP port space and not its own anyway, but I'm not
>> sure. What is the point of the SDP port space anyway?
>
> The rdma_cm needs to adjust its protocol for SDP over IB. I'm not too
> concerned with SDP, since it's not upstream yet, but I don't want to
> break it beyond repair either. The rdma_cm may not need to manage the
> SDP port space at all, and instead rely on SDP to ensure that it
> provides unique port numbers by itself.
>
> - Sean