[ofa-general] IPOB CM (NOSRQ) [PATCH V7] patch resubmit
Pradeep Satyanarayana
pradeeps at linux.vnet.ibm.com
Wed Jul 18 17:55:48 PDT 2007
Roland Dreier wrote:
> There's still some rather obvious problems with this patch. It would
> really help if you would read over your patch again I think... anyway:
>
> > +#define CM_PACKET_SIZE (1ul << 16)
>
> This duplicates IPOIB_CM_MTU I think... certainly it needs to be kept
> in sync with it somehow.
They are not quite the same. How about:
#define CM_PACKET_SIZE (ALIGN(IPOIB_CM_MTU, PAGE_SIZE))
This should keep the two in sync.
>
> > @@ -564,10 +574,9 @@ static inline void ipoib_cm_skb_too_long
> > dev_kfree_skb_any(skb);
> > }
> >
> > -static inline void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
> > +void ipoib_cm_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>
> Why is this change here? (This is in the CONFIG_INFINIBAND_IPOIB_CM=n
> part of ipoib.h)
>
> > }
> > -
> > #endif
Will do
>
> > - ipoib_warn(priv, "post srq failed for buf %d (%d)\n", id, ret);
> > + ipoib_warn(priv, "post srq failed for buf %ld (%d)\n", id, ret);
>
> extra noise here (and still wrong -- id might be long long on some
> architectures).
Correct, it should have been %lld
>
> > - .event_handler = ipoib_cm_rx_event_handler,
>
> why? seems harmless to just leave this alone for all QPs even if an
> SRQ isn't attached.
>
If memory serves me right, I tried that and ran into some inexplicable problems.
Maybe it was hang or no traffic went through -don't exactly recollect what it was.
After this change the problem went away.
>
> > + spin_unlock_irq(&priv->lock);
> > + ipoib_warn(priv, "NOSRQ has reached the configurable limit "
> > + "of either %d RC QPs or, max recv buf size of "
> > + "0x%x MB\n", max_rc_qp, max_recv_buf);
>
> > + ib_send_cm_rej(cm_id, IB_CM_REJ_NO_QP, NULL, 0, NULL, 0);
> > + ret = -EINVAL;
> > + goto err_alloc_and_post;
>
> there's a bug here... you never undo the atomic_inc() of the number of
> RC QPs even though you exit without creating a new connection.
The atomic_dec() does happen, but that is in ipoib_cm_req_handler(). There are
several places where allocate_and_post_rbuf_nosrq() could return an error after
the atomic_inc(). So, there is an atomic_dec() in the calling routine. On the
other hand I could move that to allocate_and_post_rbuf_nosrq() itself.
>
> > - if (!p)
> > + if (!p) {
> > + printk(KERN_WARNING "Failed to allocate RX control block when "
> > + "REQ arrived\n");
> > return -ENOMEM;
> > + }
>
> more unrelated changes... (feel free to send these as separate
> patches)
>
OK
> > kfree(p);
> > }
> >
> > +
> > cancel_delayed_work(&priv->cm.stale_task);
> > }
>
> extra noise in the patch
>
> > + if (!priv->cm.srq) {
> > + atomic_dec(¤t_rc_qp);
> > + }
>
> no need for { } here
OK
>
> > + /* We increase the size of the CQ in the NOSRQ case to prevent CQ
> > + * overflow. Every new REQ creates a new RX QP and each QP has an
> > + * RX ring associated with it. Therefore we could have
> > + * NOSRQ_INDEX_TABLE_SIZE*ipoib_recvq_size + ipoib_sendq_size CQEs
> > + * in a CQ.
> > + */
> > + if (!priv->cm.srq)
> > + size += (NOSRQ_INDEX_TABLE_SIZE -1) * ipoib_recvq_size;
>
> only need to do this if CM is enabled
>
> space after - here please too.
>
> that's just from a quick skim of the patch...
>
OK
Pradeep
More information about the general
mailing list