[ewg] Re: [PATCH] IPOIB/CM fixes for issues seen in OFED-1.3
Eli Cohen
eli at dev.mellanox.co.il
Tue Feb 12 01:08:54 PST 2008
Pradeep Satyanarayana wrote:
> The following patch incorporates fixes for several issues:
> 1. Fail to destroy ipoib rx QP (https://bugs.openfabrics.org/show_bug.cgi?id=906)
> This fixes the usecnt issue and allows the qp to be destroyed.
> 2. Change retry counts to small values. This helps interoperability
> between ehca and mthca.
> 3. While looking through the code, I found an error introduced by the split cq
> patch in the ipoib_poll(). This undoes the change.
>
> Please include for the OFED-1.3 rc5 build. This patch was tested on today's build
> on ehca and mthca on ppc64 machines. I have done some tests with network traffic
> and also loads and unloads of modules and seen no issues.
>
>
> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
> ---
>
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-11 14:28:47.000000000 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_cm.c 2008-02-11 15:05:48.000000000 -0500
> @@ -881,11 +881,11 @@ void ipoib_cm_dev_stop(struct net_device
> ipoib_warn(priv, "RX drain timing out\n");
>
> /*
> - * assume the HW is wedged and just free up everything.
> + * assume errors and move to rx_reap list.
> */
> - list_splice_init(&priv->cm.rx_flush_list, &list);
> - list_splice_init(&priv->cm.rx_error_list, &list);
> - list_splice_init(&priv->cm.rx_drain_list, &list);
> + list_splice_init(&priv->cm.rx_flush_list, &priv->cm.rx_reap_list);
> + list_splice_init(&priv->cm.rx_error_list, &priv->cm.rx_reap_list);
> + list_splice_init(&priv->cm.rx_drain_list, &priv->cm.rx_reap_list);
> break;
> }
> spin_unlock_irq(&priv->lock);
> @@ -1016,8 +1016,8 @@ static int ipoib_cm_send_req(struct net_
> req.responder_resources = 4;
> req.remote_cm_response_timeout = 20;
> req.local_cm_response_timeout = 20;
> - req.retry_count = 0; /* RFC draft warns against retries */
> - req.rnr_retry_count = 0; /* RFC draft warns against retries */
> + req.retry_count = 3;
> + req.rnr_retry_count = 3;
> req.max_cm_retries = 15;
> req.srq = ipoib_cm_has_srq(dev);
> return ib_send_cm_req(id, &req);
> --- ofa_kernel-1.3_a/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2008-02-11 14:28:47.000000000 -0500
> +++ ofa_kernel-1.3_b/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2008-02-11 14:49:24.000000000 -0500
> @@ -405,8 +405,12 @@ poll_more:
> ipoib_cm_handle_rx_wc(dev, wc);
> else
> ipoib_ib_handle_rx_wc(dev, wc);
> - } else
> - ipoib_cm_handle_tx_wc(priv->dev, wc);
> + } else {
> + if (wc->wr_id & IPOIB_OP_CM)
> + ipoib_cm_handle_tx_wc(priv->dev, wc);
> + else
> + ipoib_ib_handle_tx_wc(dev, wc);
> + }
The call to ipoib_ib_handle_tx_wc was omitted deliberately: since the split cq
patch was introduced, ud tx completions are no longer reported to this cq.
> }
>
> if (n != t)
>
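To make the point about the split cq concrete, here is a minimal user-space sketch of the dispatch in ipoib_poll() as it stands without the last hunk. It is only an illustration: the flag values, the enum, and the function name rx_cq_dispatch are assumptions made up for this sketch, not the driver's code.

```c
#include <stdint.h>

/* Flag bits carried in wr_id; the exact values are assumptions for this sketch. */
#define IPOIB_OP_RECV (UINT64_C(1) << 31)
#define IPOIB_OP_CM   (UINT64_C(1) << 30)

/* Which handler a completion would be routed to. */
enum handler { HANDLE_CM_RX, HANDLE_UD_RX, HANDLE_CM_TX };

/*
 * Hypothetical stand-in for the branch in ipoib_poll(): pick a handler
 * for a completion polled from the receive cq.
 */
static enum handler rx_cq_dispatch(uint64_t wr_id)
{
	if (wr_id & IPOIB_OP_RECV)
		return (wr_id & IPOIB_OP_CM) ? HANDLE_CM_RX : HANDLE_UD_RX;

	/*
	 * Not a receive. With the split cq, ud tx completions are reported
	 * on a different cq, so the only tx completion left here is cm tx.
	 */
	return HANDLE_CM_TX;
}
```

Under that assumption a non-receive completion on this cq can only be a cm tx completion, so a separate ipoib_ib_handle_tx_wc branch would never be taken.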