[ofa-general] Re: IPOIB CM (NOSRQ)[PATCH V3] patch for review
Pradeep Satyanarayana
pradeep at us.ibm.com
Thu May 3 11:14:56 PDT 2007
"Michael S. Tsirkin" <mst at dev.mellanox.co.il> wrote on 05/02/2007 08:55:47
PM:
> > > > + if (ipoib_cm_post_receive(dev, i << 32 | index)) {
> > >
> > > 1. It seems there are multiple QPs mapped to a single CQ -
> > > and each gets ipoib_recvq_size recv WRs above.
> > > Is that right? How do you prevent CQ overrun then?
> >
> > Good point! Looking at the IB spec, it appears that a CQ overflow
> > results in a Local Work Queue Catastrophic Error and puts the
> > (receiver-side) QP in the error state.
>
> Look further in spec - you get CQ error, too.
>
> > Hence, I am speculating that the sending side will see an error.
> > This will result in the sending side destroying the QP and sending
> > a DREQ message, which will remove the receive-side QP.
> >
> > A new set of QPs will be created on the send side (this is RC) and
> > the connection setup starts over again. It will continue, but at a
> > degraded rate.
> > Is this correct? What other alternative do you suggest
> > - create a CQ per QP? If we adopt that approach, is the maximum
> > number of CQs an issue to consider?
>
> We were switching to NAPI though, and NAPI kind of forces you to use
> a common CQ, I think.
What if, for the NOSRQ case only, the size passed to ib_create_cq() in
ipoib_transport_dev_init() is changed to something like:

size = ipoib_sendq_size + NOSRQ_INDEX_RING_SIZE * ipoib_recvq_size + 1;

Yes, we will end up consuming a lot more memory - do you see any (other)
problems with that?
Pradeep
pradeep at us.ibm.com