[ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.

Steve Wise swise at opengridcomputing.com
Thu Aug 2 07:21:25 PDT 2007


Sean Hefty wrote:
>> Consider NFS and NFS-RDMA.  The NFS gurus struggled with this very 
>> issue and concluded that the RDMA service needs to be on a separate 
>> port. Thus they are proposing a new netid/port number for doing RDMA 
>> mounts vs TCP/UDP mounts.  IMO that is the correct way to go:  RDMA 
>> services are different than TCP services.  They use a different 
>> protocol on top of TCP and thus shouldn't be handled on the same TCP 
>> port.  So, applications that want to service Sockets and RDMA services 
>> concurrently would do so by listening on different ports...
> 
> This is a good point, and a different view from what I've been taking. I 
> was looking at it more like trying to provide the same service over UDP 
> and TCP, where you use the same port number.  I just can't come up with 
> any solution that works for iWarp, and sharing the port space seems like 
> the only way to fix things.
> 
>> The iWARP protocols don't include a UDP based service, so it is not 
>> needed.  But if you're calling it a UDP port space, maybe it should be 
>> the host's port space?
> 
> I think it should match what's done for TCP.  IMO, there should be a 
> connectionless RDMA service, along with multicast, over 
> UDP/IP/Ethernet.  :)
> 

I think the winner would really be a reliable connectionless RDMA 
service with mcast.

>> Yes.  The only exported interfaces into the host port allocation stuff 
>> require a socket struct.  I didn't want to try and tackle exporting 
>> the port allocation services at a lower level.  Even at the bottom 
>> level, I think it still assumes a socket struct...
> 
> I looked at this too at one point, and gave up as well.  I don't know 
> what other assumptions are made in the stack as a result of this.  For 
> example, if an app binds to an IP and port, and the IP address is 
> removed and re-added, is the port still valid/reserved?
> 

I just tried this and I believe the application is still listening/bound 
even though the address is no longer valid for the host:

[root@vic10 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
           BROADCAST MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
           Interrupt:29

[root@vic10 ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
set_up_server could not establish a listen endpoint for  port 2222 with family AF_INET
[root@vic10 ~]# ifconfig eth1 192.168.69.135 up
[root@vic10 ~]# netserver -L 192.168.69.135 -p 2222 -4
Starting netserver at port 2222
Starting netserver at hostname 192.168.69.135 port 2222 and family AF_INET
[root@vic10 ~]# netstat -an|grep 2222
tcp        0      0 192.168.69.135:2222         0.0.0.0:*                   LISTEN
[root@vic10 ~]# ifconfig eth1 0.0.0.0
[root@vic10 ~]# netstat -an|grep 2222
tcp        0      0 192.168.69.135:2222         0.0.0.0:*                   LISTEN
[root@vic10 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:E0:81:33:67:D1
           inet6 addr: fe80::2e0:81ff:fe33:67d1/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
           TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:0 (0.0 b)  TX bytes:176 (176.0 b)
           Interrupt:29

[root@vic10 ~]#
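
The first failure in the transcript (binding before the address is configured on any interface) can be reproduced without netserver. This is only a minimal sketch: it assumes Linux with the default net.ipv4.ip_nonlocal_bind=0, the helper name try_bind is made up for illustration, and 192.0.2.1 is a documentation-range placeholder assumed not to be assigned to the host.

```python
import socket


def try_bind(addr, port=0):
    """Try to bind a TCP socket to addr:port.

    Returns 0 on success, or the errno from the failed bind().
    Port 0 asks the stack to pick any free port.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((addr, port))
        return 0
    except OSError as e:
        return e.errno
    finally:
        s.close()


if __name__ == "__main__":
    # Loopback is normally configured, so this bind should succeed.
    print("127.0.0.1 ->", try_bind("127.0.0.1"))
    # Placeholder address assumed NOT to be configured on this host;
    # with ip_nonlocal_bind=0 the bind is expected to be refused.
    print("192.0.2.1 ->", try_bind("192.0.2.1"))
```

The second half of the experiment (the listener surviving removal of the address) needs root and an interface to reconfigure, so it is not covered by this sketch.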


> For iWarp, is using a struct socket essentially any different than 
> transitioning an existing socket to RDMA mode?  

In the RFC patch I posted, the socket is _just_ to allow binding to a 
port/addr.  It's not used for anything else.  From the native stack's 
perspective, it's a TCP socket in the CLOSED state (but bound) I guess.
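
As a userspace analogy only (the RFC patch does this inside the kernel, not like this), a socket that is bound but never put into listen is already enough to keep anyone else off the port. A minimal sketch, assuming neither socket sets SO_REUSEADDR or SO_REUSEPORT; the function name is made up for illustration:

```python
import errno
import socket


def port_is_reserved_by_bind(addr="127.0.0.1"):
    """Bind one TCP socket (no listen), then check that a second
    bind to the same addr/port is refused with EADDRINUSE."""
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    rival = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0: let the host stack pick a free port for us.
        holder.bind((addr, 0))
        port = holder.getsockname()[1]
        try:
            # holder is bound but never listens or connects.
            rival.bind((addr, port))
            return False  # second bind succeeded: port was not reserved
        except OSError as e:
            return e.errno == errno.EADDRINUSE
    finally:
        rival.close()
        holder.close()


if __name__ == "__main__":
    print("reserved by bind alone:", port_is_reserved_by_bind())
```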

> You're just requiring it 
> to be in a specific state.  Are there problems around doing this?  How 
> much harder (technically, as opposed to politically) would it be to take 
> this change a step farther and offload an active connection?

By active, do you mean in the ESTABLISHED state?

> 
>> I left it all in to show the minimal changes needed to implement the 
>> functionality.  To keep the patch simple for initial consumption.  But 
>> yes, the rdma-cm really doesn't need to track the port stuff for TCP 
>> since the host stack does.
> 
> Okay - for final patches, I think we want to remove the rdma_cm specific 
> port spaces, along with changing the API to clarify that it uses the 
> same port space as TCP/UDP.

What do you mean by changing the API? Adding a new port space enum?

> 
>> I haven't looked in detail at the SDP code, but I would think it 
>> should want the TCP port space and not its own anyway, but I'm not 
>> sure.  What is the point of the SDP port space anyway?
> 
> The rdma_cm needs to adjust its protocol for SDP over IB.  I'm not too 
> concerned with SDP, since it's not upstream yet, but I don't want to 
> break it beyond repair either.  The rdma_cm may not need to manage the 
> SDP port space at all, and instead rely on SDP to ensure that it 
> provides unique port numbers by itself.
> 
> - Sean



