[ofa-general] Re: [PATCH RFC/untested v0.5] IPoIB/CM: fix SRQ WR leak

Roland Dreier rdreier at cisco.com
Wed May 16 13:39:16 PDT 2007


 > + * - Put the QP in the Error State
 > + * - Wait for the Affiliated Asynchronous Last WQE Reached Event;
 > + * - either:
 > + *       drain the CQ by invoking the Poll CQ verb and either wait for CQ
 > + *       to be empty or the number of Poll CQ operations has exceeded
 > + *       CQ capacity size;
 > + * - or
 > + *       post another WR that completes on the same CQ and wait for this
 > + *       WR to return as a WC; (NB: this is the option that we use)
 > + * and then invoke a Destroy QP or Reset QP.

I guess this last line would look better as

 * - invoke a Destroy QP or Reset QP.
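
For readers following along, the "post another WR and wait for it to return as a WC" option can be sketched as a userspace toy model. This is not the patch's code: `toy_cq`, `toy_cq_push`, `toy_cq_poll` and `toy_drain` are hypothetical names, the CQ is modelled as a plain FIFO, and the sketch assumes the drain WR's completion is queued behind the completions being waited for. Only the wr_id value is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as IPOIB_CM_RX_DRAIN_WRID in the quoted patch. */
#define DRAIN_WRID 0x7fffffffu
#define CQ_DEPTH   16

/* Toy completion queue: a FIFO of wr_ids (hypothetical, not struct ib_cq). */
struct toy_cq {
	uint64_t wr_id[CQ_DEPTH];
	size_t head, tail;
};

/* Queue one completion; returns -1 if the toy CQ is full. */
static int toy_cq_push(struct toy_cq *cq, uint64_t wr_id)
{
	assert(cq != NULL);
	if (cq->tail - cq->head == CQ_DEPTH)
		return -1;
	cq->wr_id[cq->tail++ % CQ_DEPTH] = wr_id;
	return 0;
}

/* Poll one completion; returns 1 if one was found, 0 if the CQ is empty. */
static int toy_cq_poll(struct toy_cq *cq, uint64_t *wr_id)
{
	assert(cq != NULL && wr_id != NULL);
	if (cq->head == cq->tail)
		return 0;
	*wr_id = cq->wr_id[cq->head++ % CQ_DEPTH];
	return 1;
}

/*
 * Reap completions until the drain marker shows up.
 * Returns the number of ordinary completions reaped before the marker,
 * or -1 if the CQ ran empty without the marker being seen.
 */
static int toy_drain(struct toy_cq *cq)
{
	uint64_t wr_id;
	int reaped = 0;

	while (toy_cq_poll(cq, &wr_id)) {
		if (wr_id == DRAIN_WRID)
			return reaped;	/* safe point for Destroy/Reset QP */
		reaped++;
	}
	return -1;
}
```

In this model, once `toy_drain()` returns a non-negative count, everything queued ahead of the marker has been consumed, which is the point at which the comment's final step (Destroy QP or Reset QP) would apply.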

 > +static struct ib_qp_attr ipoib_cm_err_attr __read_mostly = {
 > +	.qp_state = IB_QPS_ERR
 > +};
 > +
 > +#define IPOIB_CM_RX_DRAIN_WRID 0x7fffffff
 > +
 > +static struct ib_send_wr ipoib_cm_rx_drain_wr __read_mostly = {
 > +	.wr_id = IPOIB_CM_RX_DRAIN_WRID,
 > +	.opcode = IB_WR_SEND
 > +};

I don't think these are hot enough to be worth marking as __read_mostly.
(Better to leave them in normal .data, so that data that does get
written ends up spaced out more.)

 > +	qp_attr.qp_state = IB_QPS_INIT;
 > +	qp_attr.port_num = priv->port;
 > +	qp_attr.qkey = 0;
 > +	qp_attr.qp_access_flags = 0;
 > +	ret = ib_modify_qp(priv->cm.rx_drain_qp, &qp_attr,
 > +			   IB_QP_STATE | IB_QP_ACCESS_FLAGS | IB_QP_PORT | IB_QP_QKEY);
 > +	if (ret) {
 > +		ipoib_warn(priv, "failed to modify drain QP to INIT: %d\n", ret);
 > +		goto err_qp;
 > +	}
 > +
 > +	/* We put the QP in error state directly: this way, hardware
 > +	 * will immediately generate WC for each WR we post, without
 > +	 * sending anything on the wire. */
 > +	qp_attr.qp_state = IB_QPS_ERR;
 > +	ret = ib_modify_qp(priv->cm.rx_drain_qp, &qp_attr, IB_QP_STATE);
 > +	if (ret) {
 > +		ipoib_warn(priv, "failed to modify drain QP to error: %d\n", ret);
 > +		goto err_qp;
 > +	}

This actually seems like a good motivation for the mthca RESET ->
ERROR fix.  We could avoid the transition to INIT if we fixed mthca
and mlx4, right?  (By the way, any interest in making an mlx4 patch to
fix that too?)
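
To make the question concrete, here is a toy model of the setup path, assuming only the three states mentioned in this thread. The `toy_*` names are hypothetical and the table is deliberately not the full verbs state machine. The `reset_to_err` flag stands in for "the driver accepts RESET -> ERROR".

```c
#include <assert.h>
#include <stdbool.h>

/* Only the states discussed here; hypothetical names, not enum ib_qp_state. */
enum toy_qp_state { TOY_QPS_RESET, TOY_QPS_INIT, TOY_QPS_ERR };

/* Transitions relevant to the drain QP setup, nothing more. */
static bool toy_transition_ok(enum toy_qp_state from, enum toy_qp_state to,
			      bool reset_to_err)
{
	if (from == TOY_QPS_RESET && to == TOY_QPS_INIT)
		return true;
	if (from == TOY_QPS_INIT && to == TOY_QPS_ERR)
		return true;
	if (from == TOY_QPS_RESET && to == TOY_QPS_ERR)
		return reset_to_err;	/* depends on the driver fix */
	return false;
}

/* Count the modify calls needed to take a fresh QP from RESET to ERR. */
static int toy_modifies_to_err(bool reset_to_err)
{
	enum toy_qp_state s = TOY_QPS_RESET;
	int calls = 0;

	if (!toy_transition_ok(s, TOY_QPS_ERR, reset_to_err)) {
		/* Detour through INIT, as the quoted patch does. */
		assert(toy_transition_ok(s, TOY_QPS_INIT, reset_to_err));
		s = TOY_QPS_INIT;
		calls++;
	}
	assert(toy_transition_ok(s, TOY_QPS_ERR, reset_to_err));
	calls++;
	return calls;
}
```

Under these assumptions, `toy_modifies_to_err(false)` takes two calls and `toy_modifies_to_err(true)` takes one, which is the saving being asked about.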

 - R.


