[ewg] OFED-1.5.1 failure over iWarp

Steve Wise swise at opengridcomputing.com
Thu Feb 4 07:46:58 PST 2010


Never mind.  I see you already committed the change.  I just pulled the 
latest and rping works over iwarp.

Thanks,

Steve.


Steve Wise wrote:
> Hey Eli,
>
> This patch doesn't apply.
>
> If you give me one that applies and builds against RH5.3, I'll test it.
>
> Thanks,
>
> Steve.
>
>
> Eli Cohen wrote:
>   
>> Oops, you're right.
>>
>> Please try this one:
>>
>> commit 483fe703b03b1db99fa4a968fc3a918aa43f856f
>> Author: Eli Cohen <eli at mellanox.co.il>
>> Date:   Wed Feb 3 13:10:14 2010 +0200
>>
>>     CMA: Fix iWarp failures to bind to a device
>>     
>>     rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
>>     based on the hardware address. However, when called from cma_acquire_dev(), the
>>     transport field is not yet valid. The solution is to avoid calling
>>     rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
>>     GID: for Ethernet, first assume it is rocee and search the GID table; if not
>>     found, generate the GID by copying it from the hardware address.
>>     
>>     Signed-off-by: Eli Cohen <eli at mellanox.co.il>
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index a2d5aad..3c5c59f 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -348,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>>  	union ib_gid gid;
>>  	int ret = -ENODEV;
>>  
>> -	rdma_addr_get_sgid(dev_addr, &gid);
>> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
>> +		rocee_addr_get_sgid(dev_addr, &gid);
>> +		list_for_each_entry(cma_dev, &dev_list, list) {
>> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
>> +						 &id_priv->id.port_num, NULL);
>> +			if (!ret)
>> +				goto out;
>> +		}
>> +	}
>> +
>> +	memcpy(&gid, dev_addr->src_dev_addr +
>> +	       rdma_addr_gid_offset(dev_addr), sizeof gid);
>>  	list_for_each_entry(cma_dev, &dev_list, list) {
>>  		ret = ib_find_cached_gid(cma_dev->device, &gid,
>>  					 &id_priv->id.port_num, NULL);
>> -		if (!ret) {
>> -			cma_attach_to_dev(id_priv, cma_dev);
>> +		if (!ret)
>>  			break;
>> -		}
>>  	}
>> +
>> +out:
>> +	if (!ret)
>> +		cma_attach_to_dev(id_priv, cma_dev);
>> +
>>  	return ret;
>>  }
>>  
>>
>>   
>>     
>>>>>               memcpy(&gid, dev_addr->src_dev_addr +
>>>>>                      rdma_addr_gid_offset(dev_addr), sizeof gid);
>>>>>               list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>                       ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>                                                &id_priv->id.port_num, NULL);
>>>>>                       if (!ret)
>>>>>                               break;
>>>>>               }
>>>>>       }
>>>>>
>>>>>       if (!ret)
>>>>>               cma_attach_to_dev(id_priv, cma_dev);
>>>>>
>>>>>       return ret;
>>>>> }
>>>>> ----------------
>>>>>
>>>>>
>>>>>
>>>>> Eli Cohen wrote:
>>>>>         
>>>>>           
>>>>>> On Wed, Feb 03, 2010 at 09:20:05AM -0600, Steve Wise wrote:
>>>>>>           
>>>>>>             
>>>>>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>>>>>> index a2d5aad..76dce2b 100644
>>>>>>>> --- a/drivers/infiniband/core/cma.c
>>>>>>>> +++ b/drivers/infiniband/core/cma.c
>>>>>>>> @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>>>>>>>> 	union ib_gid gid;
>>>>>>>> 	int ret = -ENODEV;
>>>>>>>> -	rdma_addr_get_sgid(dev_addr, &gid);
>>>>>>>> -	list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>>> -		ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>>> -					 &id_priv->id.port_num, NULL);
>>>>>>>> -		if (!ret) {
>>>>>>>> -			cma_attach_to_dev(id_priv, cma_dev);
>>>>>>>> -			break;
>>>>>>>> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
>>>>>>>> +		rocee_addr_get_sgid(dev_addr, &gid);
>>>>>>>> +		list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>>> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>>> +						 &id_priv->id.port_num, NULL);
>>>>>>>> +			if (!ret)
>>>>>>>> +				break;
>>>>>>>> +		}
>>>>>>>>               
>>>>>>>>                 
>>>>>>> The above if statement is true for iwarp devices, so this patch is
>>>>>>> just wrong.  rocee_addr_get_sgid() should only be used for ROCEE
>>>>>>> interfaces, correct?
>>>>>>>             
>>>>>>>               
>>>>>> No, the idea is this: for non-ARPHRD_INFINIBAND devices (e.g. rocee or
>>>>>> iwarp) I first assume this is rocee, get the rocee gid, and check whether
>>>>>> this gid appears in any device's gid table. If the MAC address belongs to
>>>>>> a rocee device then it will be found; if it belongs to an iwarp device
>>>>>> then it won't be found. In the latter case I build the gid the way it was
>>>>>> done before the rocee patches and search again.
>>>>>>           
>>>>>>             
>>>>>>>> +	} else {
>>>>>>>> +		memcpy(&gid, dev_addr->src_dev_addr +
>>>>>>>> +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
>>>>>>>> +		list_for_each_entry(cma_dev, &dev_list, list) {
>>>>>>>> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
>>>>>>>> +						 &id_priv->id.port_num, NULL);
>>>>>>>> +			if (!ret)
>>>>>>>> +				break;
>>>>>>>> 		}
>>>>>>>> 	}
>>>>>>>> +
>>>>>>>> +	if (!ret)
>>>>>>>> +		cma_attach_to_dev(id_priv, cma_dev);
>>>>>>>> +
>>>>>>>> 	return ret;
>>>>>>>> }
>>>>>>>>               
>>>>>>>>                 
>
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
>   
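
For readers following the thread, the two-pass lookup Eli describes can be
sketched in user space as shown here. This is only an illustration, not the
kernel code: struct sketch_dev, rocee_style_gid(), hwaddr_style_gid(),
find_gid() and acquire_dev_sketch() are hypothetical names, and both GID
layouts are assumptions (a link-local prefix plus a MAC-derived EUI-64 for
the rocee guess, and the hardware address copied to the start of a
zero-padded GID for the fallback). The ARPHRD_INFINIBAND case is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define MAC_LEN 6

/* Hypothetical stand-in for union ib_gid. */
struct gid { uint8_t raw[GID_LEN]; };

/* Hypothetical stand-in for a device and its cached GID table. */
struct sketch_dev {
    const char *name;
    const struct gid *gids;
    size_t ngids;
};

/* Assumed rocee-style GID: fe80::/64 prefix plus modified EUI-64 of the MAC. */
static void rocee_style_gid(const uint8_t mac[MAC_LEN], struct gid *out)
{
    memset(out, 0, sizeof *out);
    out->raw[0] = 0xfe;
    out->raw[1] = 0x80;
    out->raw[8] = mac[0] ^ 0x02;   /* flip the universal/local bit */
    out->raw[9] = mac[1];
    out->raw[10] = mac[2];
    out->raw[11] = 0xff;
    out->raw[12] = 0xfe;
    out->raw[13] = mac[3];
    out->raw[14] = mac[4];
    out->raw[15] = mac[5];
}

/* Assumed pre-rocee GID: hardware address at offset 0, zero padded. */
static void hwaddr_style_gid(const uint8_t mac[MAC_LEN], struct gid *out)
{
    memset(out, 0, sizeof *out);
    memcpy(out->raw, mac, MAC_LEN);
}

/* Return the index of the first device whose table holds gid, or -1. */
static int find_gid(const struct sketch_dev *devs, size_t ndev,
                    const struct gid *gid)
{
    for (size_t d = 0; d < ndev; d++)
        for (size_t i = 0; i < devs[d].ngids; i++)
            if (memcmp(devs[d].gids[i].raw, gid->raw, GID_LEN) == 0)
                return (int)d;
    return -1;
}

/*
 * Two-pass lookup for an Ethernet-type address: try the rocee-style GID
 * first, and only if no device has it, retry with the hardware-address GID.
 * Returns the matching device index, or -1 if neither pass finds one.
 */
static int acquire_dev_sketch(const struct sketch_dev *devs, size_t ndev,
                              const uint8_t mac[MAC_LEN])
{
    struct gid gid;
    int idx;

    assert(devs != NULL || ndev == 0);

    rocee_style_gid(mac, &gid);
    idx = find_gid(devs, ndev, &gid);
    if (idx >= 0)
        return idx;

    hwaddr_style_gid(mac, &gid);
    return find_gid(devs, ndev, &gid);
}
```

The point of the fallthrough is the one raised in the review: with an
if/else, an iwarp address would take the rocee branch and never reach the
hardware-address search. Here the second pass always runs when the first
one misses.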



