[openib-general] RDMA CM and loopback addresses

Thu Mar 30 13:38:04 PST 2006

openib-general-bounces at openib.org wrote:
> Jason Gunthorpe wrote:
>> Well, this is what happens in the normal IP stack. To match normal IP
>> semantics the source IP alone should never be mapped to a device, the
>> full tuple should be passed through the route table to get to a
>> device. This is what I was saying before, the IP is a property of the
>> host, not of a device.
> 
> It will be difficult, if not impossible, to fully match IP
> semantics.  Before a connection can occur, hardware resources
> need to be allocated, which requires a specific device.  So,
> for the purpose of RDMA, we may need to treat an address as a
> property of a device, rather than the host.
> 
> Currently, rdma_bind_addr(id, source IP) may associate a
> cm_id with a specific hardware device, so the user can
> allocate QPs, CQs, etc.  Listen or connection requests are
> then restricted to that specific hardware device.  I.e.,
> connection requests that come over an IB device are restricted to
> that IB device.
> 
> - Sean

Agreed. It applies to all RDMA devices for exactly the same reasons
cited: the need to pre-allocate MRs, CQs, PDs and other objects
that will be associated with the established connections.

The idea that the local IP address dictates the egress port
is not really divergent from the IP semantics enforced on
most systems.

An IP address *is* associated with a single *subnet*. And
most firewalls are configured to enforce that packets
originating on subnet X have a matching source address.
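
To make the firewall rule above concrete, here is a minimal sketch
of the check, IPv4 only. The helper name source_matches_subnet() is
my own invention for illustration; it is not part of any firewall
or of the RDMA CM.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if the IPv4 address `src` lies inside
 * the subnet `net`/`prefix_len`, 0 if it does not, and -1 if an
 * argument cannot be parsed.
 */
int source_matches_subnet(const char *src, const char *net, int prefix_len)
{
    struct in_addr s, n;
    uint32_t mask;

    if (prefix_len < 0 || prefix_len > 32)
        return -1;
    if (inet_pton(AF_INET, src, &s) != 1 || inet_pton(AF_INET, net, &n) != 1)
        return -1;

    /* A /0 prefix matches everything; this also avoids shifting by 32. */
    mask = (prefix_len == 0) ? 0 : htonl(~(uint32_t)0 << (32 - prefix_len));

    /* Both addresses are in network byte order, and so is the mask. */
    return (s.s_addr & mask) == (n.s_addr & mask);
}
```

A packet claiming source 10.1.2.3 passes a 10.1.0.0/16 check, while
one claiming 10.2.2.3 does not.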

Having two different *devices* reach the *same* subnet
is an unusual strategy for network reliability. If you
are only protecting against port or cable failures then
a multi-port NIC is far more cost and space effective.
If you really need to be paranoid enough to guard against
NIC failure (by having two independent NICs) then you
probably want them connected to two different local
networks. After all, a power failure on the first-hop
switch or router is far more likely than a NIC failure.
Two local networks with two local routers can have
independent power sources and hence are very unlikely
to both be down at the same time. That is not true
for two NICs on the same host.

I submit that a local address is adequate to uniquely
identify a single RDMA device for virtually all hosts.
Further, hosts that actually have the same IP address
(reaching the same network) through Ethernet interfaces
that have *different* RDMA devices will face issues that
cannot be resolved by the Connection Manager.
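
As a rough model of the lookup I have in mind (not the actual
RDMA CM code), assume a table of local addresses and the device
behind each one. The struct, the function name and the device
names used with it are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a local IP address and the RDMA device behind it. */
struct addr_binding {
    const char *ip;
    const char *device;
};

/*
 * Return the one device that owns `local_ip`.
 * Returns NULL if the address is unknown, or if it is claimed by more
 * than one distinct device (the case a connection manager cannot resolve).
 */
const char *resolve_device(const struct addr_binding *tbl, size_t n,
                           const char *local_ip)
{
    const char *found = NULL;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].ip, local_ip) != 0)
            continue;
        /* A second, different device for the same address is ambiguous. */
        if (found != NULL && strcmp(found, tbl[i].device) != 0)
            return NULL;
        found = tbl[i].device;
    }
    return found;
}
```

On a typical host every local address maps to exactly one entry,
so the lookup succeeds. Only the unusual configuration described
above, one address reachable through two different RDMA devices,
comes back NULL.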

Merely documenting this onerous "restriction" and leaving
the interface the way it is makes more sense to me.