[ewg] OFED-1.5.1 failure over iWarp

Eli Cohen eli at dev.mellanox.co.il
Wed Feb 3 04:47:33 PST 2010


On Tue, Jan 19, 2010 at 04:42:16PM -0800, Woodruff, Robert J wrote:
> I am getting the following error when trying to run Intel MPI 
> over nes iwarp cards on today's daily build of OFED-1.5.1.
> OFED-1.5 does not show this problem. 
> 
> mpdtrace
> det-17-eth2
> det-16-eth2
> [0] dapl fabric is not available and fallback fabric is not enabled
> det-17:cd2:  open_hca: rdma_bind ERR No such file or directory. Is eth2 configured?

All,

Since I do not have iWARP cards, I can't test the following patch myself.
Please try it and let me know if it solves your problem. If it does,
I'll push it to tomorrow's build.


commit 7490e1cce1a295219e23e90d09f78bcdba0977dd
Author: Eli Cohen <eli at mellanox.co.il>
Date:   Wed Feb 3 13:10:14 2010 +0200

    CMA: Fix iWarp failures to bind to a device
    
    rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
    based on the hardware address. However, when called from cma_acquire_dev(), the
    transport field is not yet valid. The solution is to avoid calling
    rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
    GID: for ethernet, assume first it is rocee and search the GID table, if not
    found generate the GID by copying it from the hardware address.
    
    Signed-off-by: Eli Cohen <eli at mellanox.co.il>

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..76dce2b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
 	union ib_gid gid;
 	int ret = -ENODEV;
 
-	rdma_addr_get_sgid(dev_addr, &gid);
-	list_for_each_entry(cma_dev, &dev_list, list) {
-		ret = ib_find_cached_gid(cma_dev->device, &gid,
-					 &id_priv->id.port_num, NULL);
-		if (!ret) {
-			cma_attach_to_dev(id_priv, cma_dev);
-			break;
+	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+		rocee_addr_get_sgid(dev_addr, &gid);
+		list_for_each_entry(cma_dev, &dev_list, list) {
+			ret = ib_find_cached_gid(cma_dev->device, &gid,
+						 &id_priv->id.port_num, NULL);
+			if (!ret)
+				break;
+		}
+	} else {
+		memcpy(&gid, dev_addr->src_dev_addr +
+		       rdma_addr_gid_offset(dev_addr), sizeof gid);
+		list_for_each_entry(cma_dev, &dev_list, list) {
+			ret = ib_find_cached_gid(cma_dev->device, &gid,
+						 &id_priv->id.port_num, NULL);
+			if (!ret)
+				break;
 		}
 	}
+
+	if (!ret)
+		cma_attach_to_dev(id_priv, cma_dev);
+
 	return ret;
 }
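
For readers without the kernel tree at hand, the lookup order described in the commit message (try the RoCEE-style GID first, then fall back to a GID copied from the hardware address) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: struct model_dev, mac_to_ll_gid(), mac_to_raw_gid(), find_gid() and acquire_dev() are hypothetical names, each device is given a single GID instead of a per-port table, and the link-local EUI-64 mapping used for the RoCEE case is an assumption about what rocee_addr_get_sgid() produces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define MAC_LEN 6

/* Hypothetical stand-in for a cma_device with a one-entry GID table. */
struct model_dev {
	const char *name;
	uint8_t gid[GID_LEN];
};

/* Assumed RoCEE mapping: link-local GID built EUI-64 style from the MAC. */
static void mac_to_ll_gid(const uint8_t *mac, uint8_t *gid)
{
	memset(gid, 0, GID_LEN);
	gid[0] = 0xfe;
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 2;	/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}

/* Fallback: hardware address copied to the start of the GID, zero padded. */
static void mac_to_raw_gid(const uint8_t *mac, uint8_t *gid)
{
	memset(gid, 0, GID_LEN);
	memcpy(gid, mac, MAC_LEN);
}

/* Linear search over all devices, like the list_for_each_entry() loops. */
static const struct model_dev *find_gid(const struct model_dev *devs,
					size_t n, const uint8_t *gid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!memcmp(devs[i].gid, gid, GID_LEN))
			return &devs[i];
	return NULL;
}

/* Try the RoCEE-style GID first; if no device owns it, try the raw one. */
static const struct model_dev *acquire_dev(const struct model_dev *devs,
					   size_t n, const uint8_t *mac)
{
	uint8_t gid[GID_LEN];
	const struct model_dev *dev;

	assert(devs && mac);

	mac_to_ll_gid(mac, gid);
	dev = find_gid(devs, n, gid);
	if (dev)
		return dev;

	mac_to_raw_gid(mac, gid);
	return find_gid(devs, n, gid);
}
```

In this model, acquire_dev() returns NULL when neither GID form matches any device, which corresponds to the -ENODEV result in cma_acquire_dev().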
 


