[openib-general] RDMA CM and loopback addresses

Wed Mar 29 20:53:18 PST 2006

On Wed, Mar 29, 2006 at 04:50:27PM -0800, Sean Hefty wrote:

> To obtain similar behavior with the RDMA CM, I propose the following:
> 
> 1. Binding to the loopback address will no longer result in acquiring a local
> RDMA device.  (This will be deferred to rdma_resolve_addr().)
> 2. Listening on a loopback address will result in listening across all RDMA
> devices.
> 3. Connections from a loopback address will acquire a device based on the
> destination address.  If the destination address is also a loopback address, the
> CMA will simply pick the first one in the list.

In IP land an IP address is a property of a host, not of an
interface. What is happening in your experiment is that the routing
entries are making all your packets go over the lo interface and
bypass all the packet filtering that might otherwise drop the strange
source/dest address pairs.

It can often appear that addresses are associated with an interface
in some way. This happens because Linux's rp_filter is usually on by
default and rp_filter drops packets that don't conform to the 'an IP
is part of the interface' view. That is why a remote user cannot
access a 127.0.0.1 bound socket using hacked packets.

If you disable all packet filtering and have two hosts
[10.0.0.1 and 10.0.0.2], you can run the following on .2:

ip route add 127.0.0.1 via 10.0.0.1 dev eth0
telnet -b 10.0.0.2 127.0.0.1

and it will connect to .1's server. Turn rp_filter back on and it will
stop working again.

I think any admin will expect the same kind of behavior from anything
claiming to use IP addressing, and I'd propose something like the
above as the acid test for the RDMA CM. Thus..

From an admin perspective it seems to me the proper thing for CMA to
do for outgoing would be:
- Connections with a 0 source use the destination info to consult
  the routing table to assign the correct source address
- Then consult the route table with a full tuple
  <src,dst,sport,dport,tclass,etc> to determine what device to send
  out on
Basically I'd expect all the advanced routing features in Linux,
including policy routing, to work properly for RDMA connections.
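
As an illustration of the first step only (this is ordinary socket code,
not CMA code), the kernel already does this kind of source selection for
IP sockets. A minimal sketch in Python, assuming a host with a working
IPv4 stack; the function name pick_source_address and the port number
are arbitrary choices for the example:

```python
import socket

def pick_source_address(dst: str, dport: int = 9) -> str:
    """Return the source IP the kernel would choose to reach dst.

    The socket is left unbound (the '0 source' case). connect() on a
    UDP socket sends no packets; it only makes the kernel consult the
    routing table and fix the local address for this destination.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dst, dport))
        return s.getsockname()[0]

if __name__ == "__main__":
    print(pick_source_address("127.0.0.1"))
```

The second step, choosing the outgoing device from the full tuple, has
no equally small user-space equivalent, so it is not shown here.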

For incoming I'd expect:
- Incoming connections *optimally* would include the src socket
  information so that the various policy mechanisms will work.
- Then the src/dst should run through the policy stuff to see if the
  connection request should be dropped
- Finally a full tuple route table lookup is done to ensure that an
  outgoing route exists with the proper outgoing device. (Even if it
  isn't the longest prefix or lowest metric route)
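
The last check can be sketched as a toy model. This is only an
illustration under simplifying assumptions: the route table is a plain
list of (prefix, device) pairs, only the peer address and arrival
device are considered rather than the full tuple, and the names
return_route_exists, routes and arrival_dev are made up for the example.

```python
import ipaddress

def return_route_exists(routes, peer_addr, arrival_dev):
    """True if ANY route covering peer_addr leaves via arrival_dev.

    This deliberately does not pick the best route: a less specific
    route is enough, as long as it uses the device the request
    arrived on.
    """
    peer = ipaddress.ip_address(peer_addr)
    return any(
        dev == arrival_dev and peer in ipaddress.ip_network(prefix)
        for prefix, dev in routes
    )

# Hypothetical table: the /24 via ib0 is more specific than the /8 via ib1.
routes = [("10.0.0.0/24", "ib0"), ("10.0.0.0/8", "ib1")]
```

With this table a request from 10.0.0.5 arriving on ib1 is accepted even
though ib0 holds the more specific route, while one arriving on a device
with no covering route is rejected.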

If the above isn't possible then I'd strongly suggest that two modes
be supported. One where the 'common' rp_filter-esque semantics be
applied, which is that a bind to a particular IP results in
connections only working from the device that has the matching
IP. Only the wildcard address means all devices. The other is that all
binds accept connections from any interface, which matches the
disabled rp_filter mode. This is a really bad emulation of the normal
way things work, so if possible the src address should be available
to the server when setting up the connection.
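
To make the two fallback modes concrete, here is a toy model in Python.
It is a sketch of the semantics described above, not a proposed
interface: the names accepts, dev_addrs and strict are hypothetical, and
devices are modelled as a dict mapping a device name to its IP addresses.

```python
WILDCARD = "0.0.0.0"

def accepts(bound_ip, arrival_dev, dev_addrs, strict=True):
    """Decide whether a listener bound to bound_ip takes a request
    that arrived on arrival_dev.

    strict=True  models the 'common' rp_filter-esque mode: a bind to a
                 specific IP only works on the device owning that IP.
    strict=False models the disabled-rp_filter mode: every bind
                 accepts from any device.
    """
    if bound_ip == WILDCARD:
        return True  # the wildcard address always means all devices
    if not strict:
        return True
    return bound_ip in dev_addrs.get(arrival_dev, ())

# Hypothetical two-device host.
devs = {"ib0": ["10.0.0.1"], "ib1": ["10.0.1.1"]}
```

In strict mode a listener bound to 10.0.0.1 accepts on ib0 and refuses
on ib1; with strict=False it accepts on both.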

Regards,
Jason