[ewg] bug 1918 - openmpi broken due to rdma-cm changes
Steve Wise
swise at opengridcomputing.com
Sat Feb 6 08:31:04 PST 2010
> rdma/cm: disallow loopback address for iwarp devices
>
> From: Sean Hefty <sean.hefty at intel.com>
>
> The current RDMA iWarp devices cannot be used to establish
> connections using the loopback address. Prevent rdma_bind_addr
> from associating the loopback address with an iWarp device.
>
> This fixes an issue with openmpi, where it tries to identify which
> IP addresses map to RDMA devices by calling rdma_bind_addr on
> each address and seeing if the bind succeeds. Prior to patch
> 6f8372b6 "RDMA/cm: fix loopback address support", this process
> worked. But the rdma_cm now allows rdma_bind_addr to bind to an
> RDMA device using the loopback address, and attaches the rdma_cm_id
> to the RDMA device as part of the bind.
>
> Signed-off-by: Sean Hefty <sean.hefty at intel.com>
> ---
>
> drivers/infiniband/core/cma.c | 14 ++++++++++----
> 1 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index cc9b594..5850411 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1739,6 +1739,9 @@ err:
> }
> EXPORT_SYMBOL(rdma_resolve_route);
>
> +/*
> + * Only IB devices support loopback connections.
> + */
> static int cma_bind_loopback(struct rdma_id_private *id_priv)
> {
> struct cma_device *cma_dev;
> @@ -1753,11 +1756,16 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
> ret = -ENODEV;
> goto out;
> }
> - list_for_each_entry(cma_dev, &dev_list, list)
> + list_for_each_entry(cma_dev, &dev_list, list) {
> + if (rdma_node_get_transport(cma_dev->device->node_type) !=
> + RDMA_TRANSPORT_IB)
> + continue;
> +
> for (p = 1; p <= cma_dev->device->phys_port_cnt; ++p)
> if (!ib_query_port(cma_dev->device, p, &port_attr) &&
> port_attr.state == IB_PORT_ACTIVE)
> goto port_found;
> + }
>
Here you need:

	ret = -ENODEV;
	goto out;

instead of:
>
> p = 1;
> cma_dev = list_entry(dev_list.next, struct cma_device, list);
>
Otherwise, when no IB device with an active port is found, it will still
fall through and bind to the first device in the list, even if it's an
iWARP device. With this change, it works.
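
To make the intended control flow concrete, here is a minimal userspace sketch of the selection logic with the change above applied. It is only a model under stated assumptions, not the kernel code: struct mock_dev, enum mock_transport and mock_bind_loopback() are hypothetical stand-ins for the cma.c structures, and a plain array replaces dev_list. Note that with the fallback removed, an IB device with no active port also yields -ENODEV.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real cma.c structures. */
enum mock_transport { MOCK_TRANSPORT_IB, MOCK_TRANSPORT_IWARP };

struct mock_dev {
	enum mock_transport transport;
	int port_cnt;            /* number of physical ports */
	const int *port_active;  /* port_active[p - 1] != 0: port p is active */
};

/*
 * Model of the loopback bind selection with the suggested change:
 * skip every non-IB device, take the first active port on an IB device,
 * and fail with -ENODEV instead of falling back to the first device.
 * On success, returns 0 and stores the device index and 1-based port.
 */
static int mock_bind_loopback(const struct mock_dev *devs, size_t ndevs,
			      size_t *dev_idx, int *port)
{
	size_t i;
	int p;

	assert(dev_idx != NULL && port != NULL);

	for (i = 0; i < ndevs; i++) {
		/* Only IB devices are candidates for loopback. */
		if (devs[i].transport != MOCK_TRANSPORT_IB)
			continue;

		for (p = 1; p <= devs[i].port_cnt; p++) {
			if (devs[i].port_active[p - 1]) {
				*dev_idx = i;
				*port = p;
				return 0;
			}
		}
	}

	/* No fallback: an iWARP-only system must not get a loopback bind. */
	return -ENODEV;
}
```

In this model, a system with only an iWARP device gets -ENODEV back, which is the result the openmpi address probe relies on, while a mixed system binds to the first IB device that has an active port.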
> @@ -1771,9 +1779,7 @@ port_found:
> if (ret)
> goto out;
>
> - id_priv->id.route.addr.dev_addr.dev_type =
> - (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB) ?
> - ARPHRD_INFINIBAND : ARPHRD_ETHER;
> + id_priv->id.route.addr.dev_addr.dev_type = ARPHRD_INFINIBAND;
>
> rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
> ib_addr_set_pkey(&id_priv->id.route.addr.dev_addr, pkey);
>
>
>