[openib-general] Re: [PATCH] CMA and iWARP

Thu Jan 26 09:16:10 PST 2006

openib-general-bounces at openib.org wrote:
> Here is a comment on the specific CMA/IWARP patch:
> 
> The iwarp enhancements in this patch save each
> device's node_guid in the associated cma_device.  The
> assumption was that the iwarp device's node_guid would be the
> mac address for that device.
> Then, in cma_acquire_iw_dev(), the rdma_dev_addr pulled from
> the netdev device as a result of route lookup is used to find
> a cma_dev whose node_guid matches it.
> 
> In ethernet terms, the netdev's dev_addr is used to find an
> appropriate cma device with a matching node_guid.
> 
> This is broken, however, for multi-ported devices (and for
> devices that have multiple mac addrs per port), since there
> isn't a concept of a port guid in IB (I assume, since the
> code doesn't have port guids).  I discussed this with Tom,
> and we think the correct solution is for the device to
> promote mac addresses as gids.  Then for each port, the
> iwarp device will advertise its mac address(es) and populate
> the gid cache with these mac addresses.
> 
> Then we can change cma_acquire_iw_dev() to find the
> appropriate gid from the gid cache.  In fact,
> cma_acquire_dev() might not need to switch out to IB vs RNIC
> functions.  It can probably be mostly done with common code.
> 
> Thoughts?
> 
> I can provide a patch for this soon, but I'd rather get the
> current CMA changes into the trunk, then post a delta patch
> from the trunk...
> 
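
For concreteness, the gid-cache lookup proposed above could be
sketched roughly as follows. This is plain user-space C, not kernel
code: the sketch_* types, the table sizes, and the 1-based port
numbering are all hypothetical stand-ins, not CMA interfaces. It
only illustrates matching the netdev's mac address against mac
addresses advertised per port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN   6
#define MAX_GIDS  4   /* hypothetical per-port gid table size */
#define MAX_PORTS 2   /* hypothetical ports per device */

/* Hypothetical stand-in for one port's gid cache. */
struct sketch_port {
    uint8_t gids[MAX_GIDS][MAC_LEN];  /* mac addresses advertised as gids */
    int num_gids;
};

/* Hypothetical stand-in for a cma_device. */
struct sketch_dev {
    struct sketch_port ports[MAX_PORTS];
    int num_ports;
};

/*
 * Search every port's gid table for the mac address taken from the
 * netdev. Returns the matching device and stores the 1-based port
 * number in *port, or returns NULL if no port advertises that address.
 */
static struct sketch_dev *sketch_find_dev(struct sketch_dev *devs, int num_devs,
                                          const uint8_t *mac, int *port)
{
    assert(devs != NULL && mac != NULL && port != NULL);

    for (int d = 0; d < num_devs; d++) {
        for (int p = 0; p < devs[d].num_ports; p++) {
            const struct sketch_port *sp = &devs[d].ports[p];

            for (int g = 0; g < sp->num_gids; g++) {
                if (memcmp(sp->gids[g], mac, MAC_LEN) == 0) {
                    *port = p + 1;
                    return &devs[d];
                }
            }
        }
    }
    return NULL;
}
```

Because the match is made per port and per address, a multi-ported
device or a port with several mac addresses resolves to a specific
port, which a single node_guid comparison cannot do.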

By definition iWARP is cleanly layered over IP. Therefore an
iWARP port is not a physical port but a logical one.

Management of physical ports is something that must be done
independently of RDMA software.

For example, if two physical Ethernet ports are teamed this
is NOT visible to the RDMA layer.

This is a major example of the need to let each transport
express itself naturally and to find the common ground that
is meaningful to applications, rather than forcing one to
emulate the other.

By delegating physical port selection to the IP layer,
iWARP inherits existing Ethernet port failover solutions
and even full teaming. While not as general as InfiniBand
Path Migration, it has the benefit of working without
being exposed to the application layer.

There is no way to make the two fabrics look identical
to applications that need to be fabric aware. Fortunately
most applications just want to connect to X and don't 
care much about the fabric as long as the connection
works -- and most of the application logic is for the
phase when the connection is functional.

Applications that need to deal with fabric failures
will probably need to have transport-dependent conditionals.
I don't think you can abstract the different fabric 
configuration paradigms into something that can actually
be used to diagnose or fix a problem.

What we can do is provide abstractions that allow applications
to *use* a working post-discovery fabric in a
transport-neutral way.

The 'port' is not terribly important for that.
We can make the meaning nebulous, or we can enumerate
what it means for each transport. But it needs to be
clear that it is NOT a physical Ethernet port.
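
As a rough illustration of the "enumerate what it means for each
transport" option, a per-transport table might look like the sketch
below. All names here (sketch_transport, sketch_port_meaning, and so
on) are hypothetical and exist only for this example; the
descriptions are a simplification of the points made above, not a
proposed interface.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_transport { SKETCH_IB, SKETCH_IWARP };

/* What "port" is taken to mean for one transport (hypothetical). */
struct sketch_port_meaning {
    enum sketch_transport transport;
    int is_physical;             /* does "port" name a physical link? */
    const char *identified_by;   /* informal description only */
};

static const struct sketch_port_meaning sketch_meanings[] = {
    { SKETCH_IB,    1, "port on the channel adapter" },
    { SKETCH_IWARP, 0, "local IP address; the IP layer picks the Ethernet port" },
};

/* Look up the table entry for a transport; NULL if it has no entry. */
static const struct sketch_port_meaning *
sketch_port_meaning_for(enum sketch_transport t)
{
    size_t n = sizeof(sketch_meanings) / sizeof(sketch_meanings[0]);

    assert(t == SKETCH_IB || t == SKETCH_IWARP);
    for (size_t i = 0; i < n; i++) {
        if (sketch_meanings[i].transport == t)
            return &sketch_meanings[i];
    }
    return NULL;
}
```

The only point of the table is that the iWARP entry is marked as
logical: nothing above the IP layer should assume it maps to one
physical Ethernet port.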