[ewg] OFED-1.5.1 failure over iWarp
Eli Cohen
eli at dev.mellanox.co.il
Wed Feb 3 13:31:27 PST 2010
On Wed, Feb 03, 2010 at 03:10:40PM -0600, Steve Wise wrote:
> Eli Cohen wrote:
> >On Wed, Feb 03, 2010 at 02:28:05PM -0600, Steve Wise wrote:
> >>Here is the patched cma_acquire_dev() function. Where does it
> >>"build the gid in the pre rocee patches fashion and search again"
> >>for the iwarp case? Maybe I'm missing it?
> >>
> >>---------------
> >>static int cma_acquire_dev(struct rdma_id_private *id_priv)
> >>{
> >> struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
> >> struct cma_device *cma_dev;
> >> union ib_gid gid;
> >> int ret = -ENODEV;
> >>
> >> if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
> >> rocee_addr_get_sgid(dev_addr, &gid);
> >> list_for_each_entry(cma_dev, &dev_list, list) {
> >> ret = ib_find_cached_gid(cma_dev->device, &gid,
> >> &id_priv->id.port_num, NULL);
> >> if (!ret)
> >> break;
> >> }
> >> } else {
> >
> >here it is - it's the memcpy below:
> >
> How does it get here if it was already in the above block? I.e., it
> won't fall into this block, right?
Oops, you're right.
Please try this one:
commit 483fe703b03b1db99fa4a968fc3a918aa43f856f
Author: Eli Cohen <eli at mellanox.co.il>
Date: Wed Feb 3 13:10:14 2010 +0200
CMA: Fix iWarp failures to bind to a device
rdma_addr_get_sgid() relies on dev_addr->transport to retrieve the correct GID
based on the hardware address. However, when called from cma_acquire_dev(), the
transport field is not yet valid. The solution is to avoid calling
rdma_addr_get_sgid() from cma_acquire_dev() and find the device based on its
GID: for ethernet, first assume it is rocee and search the GID table; if the GID
is not found, generate it by copying it from the hardware address.
Signed-off-by: Eli Cohen <eli at mellanox.co.il>
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a2d5aad..3c5c59f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -348,15 +348,29 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
union ib_gid gid;
int ret = -ENODEV;
- rdma_addr_get_sgid(dev_addr, &gid);
+ if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
+ rocee_addr_get_sgid(dev_addr, &gid);
+ list_for_each_entry(cma_dev, &dev_list, list) {
+ ret = ib_find_cached_gid(cma_dev->device, &gid,
+ &id_priv->id.port_num, NULL);
+ if (!ret)
+ goto out;
+ }
+ }
+
+ memcpy(&gid, dev_addr->src_dev_addr +
+ rdma_addr_gid_offset(dev_addr), sizeof gid);
list_for_each_entry(cma_dev, &dev_list, list) {
ret = ib_find_cached_gid(cma_dev->device, &gid,
&id_priv->id.port_num, NULL);
- if (!ret) {
- cma_attach_to_dev(id_priv, cma_dev);
+ if (!ret)
break;
- }
}
+
+out:
+ if (!ret)
+ cma_attach_to_dev(id_priv, cma_dev);
+
return ret;
}
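
For anyone who wants to check the lookup order outside the kernel, here is a minimal userspace sketch of the two-pass search the patch performs: try the rocee-style GID first, then fall back to the GID copied from the hardware address. Everything named toy_* is hypothetical and not part of cma.c. The rocee GID layout used here (link-local prefix plus a MAC-derived EUI-64) is an assumption for illustration, and each device holds a single GID instead of a cached GID table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN      16
#define HW_ADDR_LEN  32	/* stand-in for the kernel's MAX_ADDR_LEN */

/* Hypothetical stand-in for a device with a one-entry GID table. */
struct toy_device {
	const char *name;
	uint8_t gid[GID_LEN];
};

/* Assumed rocee-style GID: fe80::/64 prefix plus EUI-64 built from the MAC. */
static void toy_rocee_gid(const uint8_t *mac, uint8_t gid[GID_LEN])
{
	memset(gid, 0, GID_LEN);
	gid[0] = 0xfe;
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 2;
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}

/* Pre-rocee fashion: copy the GID straight out of the hardware address. */
static void toy_legacy_gid(const uint8_t hw[HW_ADDR_LEN], size_t offset,
			   uint8_t gid[GID_LEN])
{
	assert(offset + GID_LEN <= HW_ADDR_LEN);
	memcpy(gid, hw + offset, GID_LEN);
}

/* Return the index of the device owning this GID, or -1 if none does. */
static int toy_find_gid(const struct toy_device *devs, size_t n,
			const uint8_t gid[GID_LEN])
{
	for (size_t i = 0; i < n; i++)
		if (memcmp(devs[i].gid, gid, GID_LEN) == 0)
			return (int)i;
	return -1;
}

/* Two-pass lookup mirroring the control flow of the patched function. */
static int toy_acquire_dev(const struct toy_device *devs, size_t n,
			   const uint8_t hw[HW_ADDR_LEN], int is_infiniband)
{
	uint8_t gid[GID_LEN];
	int idx;

	if (!is_infiniband) {
		/* Pass 1: assume rocee and search with the derived GID. */
		toy_rocee_gid(hw, gid);
		idx = toy_find_gid(devs, n, gid);
		if (idx >= 0)
			return idx;
	}

	/* Pass 2: no rocee match (e.g. iwarp), or an IB device. */
	toy_legacy_gid(hw, is_infiniband ? 4 : 0, gid);
	return toy_find_gid(devs, n, gid);
}
```

The point of the sketch is that the second search is no longer in an else branch, so an ethernet address that misses in the first pass still reaches the memcpy path.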
>
> >> memcpy(&gid, dev_addr->src_dev_addr +
> >> rdma_addr_gid_offset(dev_addr), sizeof gid);
> >> list_for_each_entry(cma_dev, &dev_list, list) {
> >> ret = ib_find_cached_gid(cma_dev->device, &gid,
> >> &id_priv->id.port_num, NULL);
> >> if (!ret)
> >> break;
> >> }
> >> }
> >>
> >> if (!ret)
> >> cma_attach_to_dev(id_priv, cma_dev);
> >>
> >> return ret;
> >>}
> >>----------------
> >>
> >>
> >>
> >>Eli Cohen wrote:
> >>>On Wed, Feb 03, 2010 at 09:20:05AM -0600, Steve Wise wrote:
> >>>>>diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> >>>>>index a2d5aad..76dce2b 100644
> >>>>>--- a/drivers/infiniband/core/cma.c
> >>>>>+++ b/drivers/infiniband/core/cma.c
> >>>>>@@ -348,15 +348,28 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv)
> >>>>> union ib_gid gid;
> >>>>> int ret = -ENODEV;
> >>>>>- rdma_addr_get_sgid(dev_addr, &gid);
> >>>>>- list_for_each_entry(cma_dev, &dev_list, list) {
> >>>>>- ret = ib_find_cached_gid(cma_dev->device, &gid,
> >>>>>- &id_priv->id.port_num, NULL);
> >>>>>- if (!ret) {
> >>>>>- cma_attach_to_dev(id_priv, cma_dev);
> >>>>>- break;
> >>>>>+ if (dev_addr->dev_type != ARPHRD_INFINIBAND) {
> >>>>>+ rocee_addr_get_sgid(dev_addr, &gid);
> >>>>>+ list_for_each_entry(cma_dev, &dev_list, list) {
> >>>>>+ ret = ib_find_cached_gid(cma_dev->device, &gid,
> >>>>>+ &id_priv->id.port_num, NULL);
> >>>>>+ if (!ret)
> >>>>>+ break;
> >>>>>+ }
> >>>>The above if statement is true for iwarp devices, so this patch is
> >>>>just wrong. rocee_addr_get_sgid() should only be used for ROCEE
> >>>>interfaces, correct?
> >>>No, the idea is this: for non ARPHRD_INFINIBAND devices (e.g. rocee or
> >>>iwarp) I first assume this is rocee, get the rocee gid, and check if this
> >>>gid appears in any device's gid table. If the mac address belongs to a
> >>>rocee device then it will be found; if it belongs to an iwarp device
> >>>then it won't be found. In the latter case I build the gid in the pre
> >>>rocee patches fashion and search again.
> >>>>>+ } else {
> >>>>>+ memcpy(&gid, dev_addr->src_dev_addr +
> >>>>>+ rdma_addr_gid_offset(dev_addr), sizeof gid);
> >>>>>+ list_for_each_entry(cma_dev, &dev_list, list) {
> >>>>>+ ret = ib_find_cached_gid(cma_dev->device, &gid,
> >>>>>+ &id_priv->id.port_num, NULL);
> >>>>>+ if (!ret)
> >>>>>+ break;
> >>>>> }
> >>>>> }
> >>>>>+
> >>>>>+ if (!ret)
> >>>>>+ cma_attach_to_dev(id_priv, cma_dev);
> >>>>>+
> >>>>> return ret;
> >>>>>}