[ewg] OFED-1.5.1 failure over iWarp

Steve Wise swise at opengridcomputing.com
Wed Feb 3 07:17:57 PST 2010


This patch didn't work.  I still get an address resolution error event 
with status -2 (-ENOENT).

Steve.



Eli Cohen wrote:
> On Tue, Jan 19, 2010 at 04:42:16PM -0800, Woodruff, Robert J wrote:
>   
>> I am getting the following error when trying to run Intel MPI 
>> over nes iwarp cards on today's daily build of OFED-1.5.1.
>> OFED-1.5 does not show this problem. 
>>
>> mpdtrace
>> det-17-eth2
>> det-16-eth2
>> [0] dapl fabric is not available and fallback fabric is not enabled
>> det-17:cd2:  open_hca: rdma_bind ERR No such file or directory. Is eth2 configured?
>>     
>
> All,
>
> Since I do not have iwarp cards, I can't test the following patch.
> Please try it and let me know if it solves your problem. If it does,
> I'll push it to tomorrow's build.
>
>
> commit 7490e1cce1a295219e23e90d09f78bcdba0977dd
> Author: Eli Cohen <eli at mellanox.co.il>
> Date:   Wed Feb 3 13:10:14 2010 +0200
>
>     CMA: Fix iWarp failures to bind to a device
>     
>     rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
>     based on the hardware address. However, when called from cma_acquire_dev(), the
>     transport field is not yet valid. The solution is to avoid calling
>     rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
>     GID: for ethernet, first assume it is rocee and search the GID table; if not
>     found, generate the GID by copying it from the hardware address.
>     
>     Signed-off-by: Eli Cohen <eli at mellanox.co.il>
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index a2d5aad..76dce2b 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>  	union ib_gid gid;
>  	int ret = -ENODEV;
>  
> -	rdma_addr_get_sgid(dev_addr, &gid);
> -	list_for_each_entry(cma_dev, &dev_list, list) {
> -		ret = ib_find_cached_gid(cma_dev->device, &gid,
> -					 &id_priv->id.port_num, NULL);
> -		if (!ret) {
> -			cma_attach_to_dev(id_priv, cma_dev);
> -			break;
> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
> +		rocee_addr_get_sgid(dev_addr, &gid);
> +		list_for_each_entry(cma_dev, &dev_list, list) {
> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
> +						 &id_priv->id.port_num, NULL);
> +			if (!ret)
> +				break;
> +		}
> +	} else {
> +		memcpy(&gid, dev_addr->src_dev_addr +
> +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
> +		list_for_each_entry(cma_dev, &dev_list, list) {
> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
> +						 &id_priv->id.port_num, NULL);
> +			if (!ret)
> +				break;
>  		}
>  	}
> +
> +	if (!ret)
> +		cma_attach_to_dev(id_priv, cma_dev);
> +
>  	return ret;
>  }
>  
>   
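
For reference, the lookup order described in the commit message (try the
rocee GID first, then fall back to a GID copied from the hardware address)
can be modeled in userspace roughly as below. This is only a sketch and not
kernel code: struct model_dev, mac_to_ll_gid(), mac_to_raw_gid(),
model_find_gid() and model_acquire_dev() are hypothetical names, and both
GID layouts (link-local EUI-64 for rocee, MAC in the first six bytes for
iWarp) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN  16
#define MAC_LEN  6
#define MAX_GIDS 4

/* Hypothetical stand-in for one RDMA device and its cached GID table. */
struct model_dev {
	const char *name;
	int ngids;
	uint8_t gids[MAX_GIDS][GID_LEN];
};

/* Assumed rocee-style GID: fe80::/64 prefix plus an EUI-64 built from the MAC. */
static void mac_to_ll_gid(const uint8_t *mac, uint8_t *gid)
{
	memset(gid, 0, GID_LEN);
	gid[0] = 0xfe;
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 0x02;		/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}

/* Assumed iWarp-style GID: hardware address copied in, remainder zero. */
static void mac_to_raw_gid(const uint8_t *mac, uint8_t *gid)
{
	memset(gid, 0, GID_LEN);
	memcpy(gid, mac, MAC_LEN);
}

/* Return the index of the first device whose table holds gid, or -1. */
static int model_find_gid(const struct model_dev *devs, int ndevs,
			  const uint8_t *gid)
{
	for (int i = 0; i < ndevs; i++)
		for (int j = 0; j < devs[i].ngids; j++)
			if (!memcmp(devs[i].gids[j], gid, GID_LEN))
				return i;
	return -1;
}

/* Two-step lookup: rocee-style GID first, then the raw hardware-address GID. */
static int model_acquire_dev(const struct model_dev *devs, int ndevs,
			     const uint8_t *mac)
{
	uint8_t gid[GID_LEN];
	int idx;

	assert(devs != NULL && mac != NULL);

	mac_to_ll_gid(mac, gid);
	idx = model_find_gid(devs, ndevs, gid);
	if (idx >= 0)
		return idx;

	mac_to_raw_gid(mac, gid);
	return model_find_gid(devs, ndevs, gid);
}
```

With one table entry of each kind, an address that only matches the raw
form is found on the second pass, and an address in neither table yields
-1, the model's counterpart of a failed bind.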



