[ewg] OFED-1.5.1 failure over iWarp

Steve Wise swise at opengridcomputing.com
Thu Feb 4 07:32:53 PST 2010


Hey Eli,

This patch doesn't apply.

If you give me one that applies and builds against RH5.3, I'll test it.

Thanks,

Steve.


Eli Cohen wrote:
> Oops, you're right.
>
> Please try this one:
>
> commit 483fe703b03b1db99fa4a968fc3a918aa43f856f
> Author: Eli Cohen <eli at mellanox.co.il>
> Date:   Wed Feb 3 13:10:14 2010 +0200
>
>     CMA: Fix iWarp failures to bind to a device
>     
>     rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
>     based on the hardware address. However, when called from cma_acquire_dev(), the
>     transport field is not yet valid. The solution is to avoid calling
>     rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
>     GID: for ethernet, assume first it is rocee and search the GID table, if not
>     found generate the GID by copying it from the hardware address.
>     
>     Signed-off-by: Eli Cohen <eli at mellanox.co.il>
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index a2d5aad..3c5c59f 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -348,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>  	union ib_gid gid;
>  	int ret = -ENODEV;
>  
> -	rdma_addr_get_sgid(dev_addr, &gid);
> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
> +		rocee_addr_get_sgid(dev_addr, &gid);
> +		list_for_each_entry(cma_dev, &dev_list, list) {
> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
> +						 &id_priv->id.port_num, NULL);
> +			if (!ret)
> +				goto out;
> +		}
> +	}
> +
> +	memcpy(&gid, dev_addr->src_dev_addr +
> +	       rdma_addr_gid_offset(dev_addr), sizeof gid);
>  	list_for_each_entry(cma_dev, &dev_list, list) {
>  		ret = ib_find_cached_gid(cma_dev->device, &gid,
>  					 &id_priv->id.port_num, NULL);
> -		if (!ret) {
> -			cma_attach_to_dev(id_priv, cma_dev);
> +		if (!ret)
>  			break;
> -		}
>  	}
> +
> +out:
> +	if (!ret)
> +		cma_attach_to_dev(id_priv, cma_dev);
> +
>  	return ret;
>  }
>  
>
>   
>>>>               memcpy(&gid, dev_addr->src_dev_addr +
>>>>                      rdma_addr_gid_offset(dev_addr), sizeof gid);
>>>>               list_for_each_entry(cma_dev, &dev_list, list) {
>>>>                       ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>                                                &id_priv->id.port_num, NULL);
>>>>                       if (!ret)
>>>>                               break;
>>>>               }
>>>>       }
>>>>
>>>>       if (!ret)
>>>>               cma_attach_to_dev(id_priv, cma_dev);
>>>>
>>>>       return ret;
>>>> }
>>>> ----------------
>>>>
>>>>
>>>>
>>>> Eli Cohen wrote:
>>>>         
>>>>> On Wed, Feb 03, 2010 at 09:20:05AM -0600, Steve Wise wrote:
>>>>>           
>>>>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>>>>> index a2d5aad..76dce2b 100644
>>>>>>> --- a/drivers/infiniband/core/cma.c
>>>>>>> +++ b/drivers/infiniband/core/cma.c
>>>>>>> @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>>>>>>> 	union ib_gid gid;
>>>>>>> 	int ret = -ENODEV;
>>>>>>> -	rdma_addr_get_sgid(dev_addr, &gid);
>>>>>>> -	list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>> -		ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>> -					 &id_priv->id.port_num, NULL);
>>>>>>> -		if (!ret) {
>>>>>>> -			cma_attach_to_dev(id_priv, cma_dev);
>>>>>>> -			break;
>>>>>>> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
>>>>>>> +		rocee_addr_get_sgid(dev_addr, &gid);
>>>>>>> +		list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>> +						 &id_priv->id.port_num, NULL);
>>>>>>> +			if (!ret)
>>>>>>> +				break;
>>>>>>> +		}
>>>>>>>               
>>>>>> The above if statement is true for iwarp devices, so this patch is
>>>>>> just wrong. rocee_addr_get_sgid() should only be used for ROCEE
>>>>>> interfaces, correct?
>>>>>>             
>>>>> No, the idea is this: for non-ARPHRD_INFINIBAND devices (e.g. rocee or
>>>>> iwarp) I first assume it is rocee, get the rocee gid, and check if this
>>>>> gid appears in any device's gid table. If the MAC address belongs to a
>>>>> rocee device then it will be found; if it belongs to an iwarp device
>>>>> then it won't be found. In the latter case I build the gid in the
>>>>> pre-rocee-patches fashion and search again.
>>>>>           
>>>>>>> +	} else {
>>>>>>> +		memcpy(&gid, dev_addr->src_dev_addr +
>>>>>>> +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
>>>>>>> +		list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>> +						 &id_priv->id.port_num, NULL);
>>>>>>> +			if (!ret)
>>>>>>> +				break;
>>>>>>> 		}
>>>>>>> 	}
>>>>>>> +
>>>>>>> +	if (!ret)
>>>>>>> +		cma_attach_to_dev(id_priv, cma_dev);
>>>>>>> +
>>>>>>> 	return ret;
>>>>>>> }
>>>>>>>               
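
For readers following the thread, the two-stage lookup Eli describes can be sketched outside the kernel roughly as below. This is only an illustrative model, not the OFED code: struct mock_device, mock_rocee_gid(), mock_find_gid() and mock_acquire_dev() are hypothetical names, the GID layout produced by mock_rocee_gid() is an assumption (the thread does not show what rocee_addr_get_sgid() actually builds), and the offset of 4 for InfiniBand hardware addresses is likewise assumed rather than taken from the patch.

```c
#include <stddef.h>
#include <string.h>

#define MOCK_GID_LEN  16
#define MOCK_HW_LEN   32   /* room for the longest hardware address */
#define MOCK_MAX_GIDS 4

/* Hypothetical stand-in for a cma device and its cached GID table. */
struct mock_device {
	unsigned char gids[MOCK_MAX_GIDS][MOCK_GID_LEN];
	size_t num_gids;
};

/*
 * Hypothetical stand-in for rocee_addr_get_sgid(). ASSUMPTION: a
 * link-local prefix followed by an EUI-64 style interface ID derived
 * from the MAC. The real layout is not shown in the thread.
 */
static void mock_rocee_gid(const unsigned char *mac, unsigned char *gid)
{
	memset(gid, 0, MOCK_GID_LEN);
	gid[0] = 0xfe;
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 2;
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}

/* Models the list_for_each_entry() + ib_find_cached_gid() loop:
 * returns the index of the first device holding gid, or -1. */
static int mock_find_gid(const struct mock_device *devs, size_t ndevs,
			 const unsigned char *gid)
{
	for (size_t d = 0; d < ndevs; d++)
		for (size_t g = 0; g < devs[d].num_gids; g++)
			if (memcmp(devs[d].gids[g], gid, MOCK_GID_LEN) == 0)
				return (int)d;
	return -1;
}

/*
 * Two-stage lookup. hw_addr must point at MOCK_HW_LEN bytes.
 * Stage 1 (non-InfiniBand only): assume rocee and search for that GID.
 * Stage 2: fall back to the GID copied straight from the hardware
 * address, as was done before the rocee patches.
 */
static int mock_acquire_dev(const struct mock_device *devs, size_t ndevs,
			    const unsigned char *hw_addr, int is_infiniband)
{
	unsigned char gid[MOCK_GID_LEN];
	size_t offset = is_infiniband ? 4 : 0;  /* assumed GID offset */
	int idx;

	if (!is_infiniband) {
		mock_rocee_gid(hw_addr, gid);
		idx = mock_find_gid(devs, ndevs, gid);
		if (idx >= 0)
			return idx;
	}

	memcpy(gid, hw_addr + offset, MOCK_GID_LEN);
	return mock_find_gid(devs, ndevs, gid);
}
```

In this model an iwarp-style device whose table holds the raw MAC-based GID is missed by the first stage and matched by the fallback, while a rocee-style device is matched in the first stage. Steve's objection was that the earlier version of the patch put the fallback in an else branch, so it never ran for iwarp.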



