[ewg] OFED-1.5.1 failure over iWarp
Steve Wise
swise at opengridcomputing.com
Wed Feb 3 07:17:57 PST 2010
This patch didn't work. I still get an address resolution error event
with status -2.
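
For anyone reading the status value: on Linux, -2 is -ENOENT, the same errno behind the "No such file or directory" text in the original rdma_bind report below. A minimal sketch of decoding such a status, where cm_status_str() is a hypothetical helper and not part of librdmacm or the kernel:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a librdmacm or kernel function): turn a
 * negative-errno style status, such as the one carried by an address
 * resolution error event, into the usual errno text.
 */
static const char *cm_status_str(int status)
{
	/* Event statuses are negative errno values; strerror() wants positive. */
	return strerror(status < 0 ? -status : status);
}

/* Print the status the way a quick debugging session might. */
static void print_cm_status(int status)
{
	printf("status %d: %s\n", status, cm_status_str(status));
}
```

Calling print_cm_status(-2) prints the ENOENT message, which suggests the bind and resolve failures share the same root cause: no device was matched.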
Steve.
Eli Cohen wrote:
> On Tue, Jan 19, 2010 at 04:42:16PM -0800, Woodruff, Robert J wrote:
>
>> I am getting the following error when trying to run Intel MPI
>> over nes iwarp cards on today's daily build of OFED-1.5.1.
>> OFED-1.5 does not show this problem.
>>
>> mpdtrace
>> det-17-eth2
>> det-16-eth2
>> [0] dapl fabric is not available and fallback fabric is not enabled
>> det-17:cd2: open_hca: rdma_bind ERR No such file or directory. Is eth2 configured?
>>
>
> All,
>
> Since I do not have iwarp cards, I can't check the following patch.
> Please try it and let me know if it solved your problem. If it does,
> I'll push it to tomorrow's build.
>
>
> commit 7490e1cce1a295219e23e90d09f78bcdba0977dd
> Author: Eli Cohen <eli at mellanox.co.il>
> Date: Wed Feb 3 13:10:14 2010 +0200
>
> CMA: Fix iWarp failures to bind to a device
>
> rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
> based on the hardware address. However, when called from cma_acquire_dev(), the
> transport field is not yet valid. The solution is to avoid calling
> rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
> GID: for ethernet, assume first it is rocee and search the GID table, if not
> found generate the GID by copying it from the hardware address.
>
> Signed-off-by: Eli Cohen <eli at mellanox.co.il>
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index a2d5aad..76dce2b 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
>  	union ib_gid gid;
>  	int ret = -ENODEV;
>  
> -	rdma_addr_get_sgid(dev_addr, &gid);
> -	list_for_each_entry(cma_dev, &dev_list, list) {
> -		ret = ib_find_cached_gid(cma_dev->device, &gid,
> -					 &id_priv->id.port_num, NULL);
> -		if (!ret) {
> -			cma_attach_to_dev(id_priv, cma_dev);
> -			break;
> +	if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
> +		rocee_addr_get_sgid(dev_addr, &gid);
> +		list_for_each_entry(cma_dev, &dev_list, list) {
> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
> +						 &id_priv->id.port_num, NULL);
> +			if (!ret)
> +				break;
> +		}
> +	} else {
> +		memcpy(&gid, dev_addr->src_dev_addr +
> +		       rdma_addr_gid_offset(dev_addr), sizeof gid);
> +		list_for_each_entry(cma_dev, &dev_list, list) {
> +			ret = ib_find_cached_gid(cma_dev->device, &gid,
> +						 &id_priv->id.port_num, NULL);
> +			if (!ret)
> +				break;
>  		}
>  	}
> +
> +	if (!ret)
> +		cma_attach_to_dev(id_priv, cma_dev);
> +
>  	return ret;
>  }
>
>
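
The commit message describes a two-step lookup for ethernet devices (try the RoCEE GID first, then fall back to a GID copied from the hardware address), while the diff as quoted has no second pass in the non-InfiniBand branch. A rough user-space sketch of the described order follows. Everything in it is an assumption for illustration: the mock_* names are hypothetical, the GID table is a plain array rather than the ib_find_cached_gid() cache, and the RoCEE GID layout (link-local prefix plus EUI-64 from the MAC) is assumed rather than taken from rocee_addr_get_sgid().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16
#define MAC_LEN 6

/* Hypothetical stand-in for union ib_gid. */
struct mock_gid {
	uint8_t raw[GID_LEN];
};

/* ASSUMPTION: RoCEE-style GID = fe80::/64 prefix + modified EUI-64 of the MAC. */
static void mock_rocee_gid(const uint8_t mac[MAC_LEN], struct mock_gid *gid)
{
	memset(gid, 0, sizeof(*gid));
	gid->raw[0] = 0xfe;
	gid->raw[1] = 0x80;
	gid->raw[8] = mac[0] ^ 0x02;	/* flip the universal/local bit */
	gid->raw[9] = mac[1];
	gid->raw[10] = mac[2];
	gid->raw[11] = 0xff;
	gid->raw[12] = 0xfe;
	gid->raw[13] = mac[3];
	gid->raw[14] = mac[4];
	gid->raw[15] = mac[5];
}

/* ASSUMPTION: iWarp-style GID = MAC copied to the start of a zeroed GID. */
static void mock_hwaddr_gid(const uint8_t mac[MAC_LEN], struct mock_gid *gid)
{
	memset(gid, 0, sizeof(*gid));
	memcpy(gid->raw, mac, MAC_LEN);
}

/* Linear search of a mock GID table; returns the index or -1. */
static int mock_find_gid(const struct mock_gid *table, int n,
			 const struct mock_gid *gid)
{
	assert(table != NULL && n >= 0);
	for (int i = 0; i < n; i++)
		if (memcmp(&table[i], gid, sizeof(*gid)) == 0)
			return i;
	return -1;
}

/*
 * Lookup order described in the commit message for an ethernet device:
 * first assume RoCEE, then fall back to the hardware-address GID.
 * Returns the matching table index, or -1 if neither GID is present.
 */
static int mock_acquire_dev(const struct mock_gid *table, int n,
			    const uint8_t mac[MAC_LEN])
{
	struct mock_gid gid;
	int idx;

	mock_rocee_gid(mac, &gid);
	idx = mock_find_gid(table, n, &gid);
	if (idx >= 0)
		return idx;

	mock_hwaddr_gid(mac, &gid);
	return mock_find_gid(table, n, &gid);
}
```

In this sketch an iWarp-style entry is only found on the second pass, which is the step that appears to be missing for ethernet devices in the quoted diff.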