[ofa-general] post_recv question
glebn at voltaire.com
Thu Feb 21 11:31:11 PST 2008
On Thu, Feb 21, 2008 at 11:10:24AM -0800, Ralph Campbell wrote:
> > > To further complicate things, this race condition is never seen _if_ the
> > > application uses the same QP to advertise (send a credit allowing the
> > > peer to SEND) the RECV buffer availability. So if the app posts a SEND
> > > after the RECV is posted and that SEND allows the peer access to the
> > > RECV buffer, then everything is ok. This is due to the fact that the
> > > FW/HW will process the SEND only after processing the RECV. If the app
> > > uses a different QP to post the SEND advertising the RECV, then the race
> > > condition exists allowing the peer to SEND into that RECV buffer before
> > > the HW makes it ready.
> Well, there is no guarantee that the HCA processes the post_recv()
> before the post_send() even on the same QP. Send and receive are
> unordered with respect to each other. The fact that it works is
> an HCA specific implementation artifact.
So there is no way to implement SW flow control over InfiniBand? How is
it that the IB spec includes a SW flow control specification for SDP, then?
> > > This all assumes a specific design of rdma hw. Maybe nobody else has
> > > this issue?
> > >
> > > Maybe I'm not making sense. :)
> > I think your descriptions here match the RNR behavior Ralph found in IPoIB-CM.
> > Ralph,
> > Does this make sense?
> > Thanks
> > Shirley
> I think you are making sense. There is an indeterminate race
> between post_recv() returning to the application and when
> a packet being received by the HCA might be able to use
> that buffer. There are no ordering guarantees
> between messages sent on one QP and another so the application
> can't easily use a different QP to advertise posted buffers (credits).
If, after post_recv() returns, it is guaranteed that the receive buffers
are available to the HW, we don't need ordering guarantees between QPs to
successfully implement SW flow control.
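
To make the argument concrete, here is a minimal toy model in Python. It is
only a sketch and does not use the verbs API: the Receiver and Sender classes,
the ready_on_return flag, and hw_catch_up() are hypothetical names invented
for illustration. The model assumes the worst case, where the peer reacts to
a credit before the HW has processed the posted receive.

```python
class Receiver:
    """Toy receiver: tracks posted buffers (not a real verbs object)."""

    def __init__(self, ready_on_return):
        # ready_on_return=True models the guarantee discussed above:
        # the buffer is usable by the HW as soon as post_recv() returns.
        self.ready_on_return = ready_on_return
        self.pending = 0    # posted, but not yet processed by the modelled HW
        self.ready = 0      # posted and usable by an incoming SEND
        self.rnr_count = 0  # arrivals that found no ready buffer

    def post_recv(self):
        """Post one buffer and return the single credit it is worth."""
        if self.ready_on_return:
            self.ready += 1
        else:
            self.pending += 1
        return 1

    def hw_catch_up(self):
        """The modelled HW finishes processing all pending receives."""
        self.ready += self.pending
        self.pending = 0

    def incoming_send(self):
        """A SEND from the peer arrives and needs one ready buffer."""
        if self.ready == 0:
            self.rnr_count += 1  # receiver not ready
            return False
        self.ready -= 1
        return True


class Sender:
    """Toy peer that only sends while it holds credits."""

    def __init__(self):
        self.credits = 0

    def grant(self, n):
        self.credits += n

    def try_send(self, receiver):
        if self.credits == 0:
            return False  # no credit: hold the message
        self.credits -= 1
        return receiver.incoming_send()


def run(ready_on_return, messages=4):
    """Return how many arrivals hit a not-ready receiver."""
    rx, tx = Receiver(ready_on_return), Sender()
    for _ in range(messages):
        tx.grant(rx.post_recv())  # the credit may travel over any QP
        tx.try_send(rx)           # peer reacts before the HW catches up
        rx.hw_catch_up()
    return rx.rnr_count
```

In this model run(True) never finds the receiver unprepared, because every
credit is backed by a ready buffer no matter which path delivers it, while
run(False) hits the not-ready case at least once.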
> That is why the IB RC protocol does this for you in band if the RC QP
> is using a dedicated receive queue but not a shared receive queue.
What do you mean by that? RNR works for both RC and SRQ QPs.