[ewg] bug 1918 - openmpi broken due to rdma-cm changes

Sean Hefty sean.hefty at intel.com
Fri Feb 5 12:09:54 PST 2010


>Ammasso and Chelsio T3 rnics do not support HW loopback.

It looks like the NES driver doesn't support 127.0.0.1, but does support
loopback connections (gurgle).  Here's a patch against 2.6.33 for
consideration then.  It is untested (not even compile tested); I'll be
testing it shortly unless there's disagreement.


rdma/cm: disallow loopback address for iwarp devices

From: Sean Hefty <sean.hefty at intel.com>

The current RDMA iWarp devices cannot be used to establish
connections using the loopback address.  Prevent rdma_bind_addr
from associating the loopback address with an iWarp device.

This fixes an issue with openmpi, which identifies the IP addresses
that map to RDMA devices by calling rdma_bind_addr on each address
and checking whether the bind succeeds.  Prior to patch 6f8372b6
"RDMA/cm: fix loopback address support", this process worked.  Since
that patch, the rdma_cm allows rdma_bind_addr to bind the loopback
address to an RDMA device and attaches the rdma_cm_id to that device
as part of the bind, so the bind also succeeds on iWarp devices that
cannot establish connections using the loopback address.

Signed-off-by: Sean Hefty <sean.hefty at intel.com>
---

 drivers/infiniband/core/cma.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index cc9b594..5850411 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1739,6 +1739,9 @@ err:
 }
 EXPORT_SYMBOL(rdma_resolve_route);
 
+/*
+ * Only IB devices support loopback connections.
+ */
 static int cma_bind_loopback(struct rdma_id_private *id_priv)
 {
 	struct cma_device *cma_dev;
@@ -1753,11 +1756,16 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
 		ret = -ENODEV;
 		goto out;
 	}
-	list_for_each_entry(cma_dev, &dev_list, list)
+	list_for_each_entry(cma_dev, &dev_list, list) {
+		if (rdma_node_get_transport(cma_dev->device->node_type) !=
+		    RDMA_TRANSPORT_IB)
+			continue;
+
 		for (p = 1; p <= cma_dev->device->phys_port_cnt; ++p)
 			if (!ib_query_port(cma_dev->device, p, &port_attr) &&
 			    port_attr.state == IB_PORT_ACTIVE)
 				goto port_found;
+	}
 
 	p = 1;
 	cma_dev = list_entry(dev_list.next, struct cma_device, list);
@@ -1771,9 +1779,7 @@ port_found:
 	if (ret)
 		goto out;
 
-	id_priv->id.route.addr.dev_addr.dev_type =
-		(rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB) ?
-		ARPHRD_INFINIBAND : ARPHRD_ETHER;
+	id_priv->id.route.addr.dev_addr.dev_type = ARPHRD_INFINIBAND;
 
 	rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
 	ib_addr_set_pkey(&id_priv->id.route.addr.dev_addr, pkey);
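
For readers following the selection logic outside the kernel tree, here is a
minimal user-space sketch of the device filter that the second hunk adds to
cma_bind_loopback.  Everything in it is a hypothetical stand-in, not kernel
code: struct fake_device, the two enums, pick_loopback_device and sample_devs
are invented for illustration.  One deliberate simplification: when nothing
matches, the sketch returns -1, whereas the function in the patch falls
through to the first entry in dev_list with port 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum transport { TRANSPORT_IB, TRANSPORT_IWARP };
enum port_state { PORT_DOWN, PORT_ACTIVE };

#define MAX_PORTS 4

struct fake_device {
	const char *name;
	enum transport transport;
	int phys_port_cnt;
	enum port_state ports[MAX_PORTS];	/* ports[0] holds port 1 */
};

/* Sample device table: one iWarp device and two IB devices. */
static const struct fake_device sample_devs[] = {
	{ "iwarp0", TRANSPORT_IWARP, 1, { PORT_ACTIVE } },
	{ "ib0",    TRANSPORT_IB,    2, { PORT_DOWN, PORT_DOWN } },
	{ "ib1",    TRANSPORT_IB,    2, { PORT_DOWN, PORT_ACTIVE } },
};

/*
 * Return the index of the first IB device that has an active port and
 * store the 1-based port number in *port.  Return -1, leaving *port
 * untouched, if no IB device has an active port.
 */
static int pick_loopback_device(const struct fake_device *devs, size_t n,
				int *port)
{
	size_t i;
	int p;

	assert(port != NULL);

	for (i = 0; i < n; i++) {
		/* Skip non-IB devices, mirroring the check the patch adds. */
		if (devs[i].transport != TRANSPORT_IB)
			continue;

		for (p = 1; p <= devs[i].phys_port_cnt; ++p) {
			if (devs[i].ports[p - 1] == PORT_ACTIVE) {
				*port = p;
				return (int)i;
			}
		}
	}
	return -1;
}
```

Calling pick_loopback_device(sample_devs, 3, &port) selects index 2 ("ib1")
with port 2: the iWarp entry is skipped even though its port is active, and
"ib0" is passed over because neither of its ports is active.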





