[ofa-general] post_recv question

Thu Feb 21 09:39:58 PST 2008

On Thu, 21 Feb 2008, Steve Wise wrote:

> Sean Hefty wrote:
> > > > I have a question regarding exactly _when_ a posted recv buffer is
> > > > available for the HW to use:
> > > > 
> > > > Consider that the post_recv methods usually just program a hw-specific
> > > > WR in the RQ, then ring a doorbell, then return.  There is a delta
> > > > period between when the app returns from the post_recv call and when the
> > > > HW actually DMA's the WR and programs up the HW to enable that buffer.
> > > > (I'm assuming a specific HW design here, but I _think_ most HW behaves
> > > > this way?).
> > > > 
> > > > If this is all true, then from the app's point of view, the buffer isn't
> > > > really available when it returns from post_recv.  This can lead to
> > > > conditions where the app advertises that recv buffer to the peer via
> > > > some out of band channel, and the peer posts a SEND which arrives
> > > > _before_ the HW has actually set up the RECV buffer.
> > 
> > I'm really not following the question here.  When you say that the app
> > advertises the buffer, are you saying that it sends some sort of credit that
> > a receive is posted?
> 
> Yes.
> 
> > I would fully expect the receive buffer to be available to
> > receive data before post_recv returns, but I'm not sure what race you're
> > referring to.  Are you suggesting that this isn't the case?
> > 
> 
> That is what I'm suggesting.
> 
> Here is the timing sequence:
> 
> t0: app calls post_recv
> t1: post_recv code builds a hw-specific WR in the hw work queue
> t2: post_recv code rings a doorbell (write to adapter mem or register)
> t3: post_recv returns
> t4: <app assumes the buffer is ready>
> t5: device HW dma engine moves the WR to adapter memory
> t6: device FW prepares the HW RQ entry making the buffer available.
> 
> Note at time t4, the application thinks it's ready, but it's really not ready
> until t6.
> 
> This clearly is an implementation-specific issue.  But I was under the
> assumption that all the RDMA HW behaves this way.  Maybe not?
> 
> To further complicate things, this race condition is never seen _if_ the
> application uses the same QP to advertise (send a credit allowing the peer to
> SEND) the RECV buffer availability.  So if the app posts a SEND after the RECV
> is posted and that SEND allows the peer access to the RECV buffer, then
> everything is ok.  This is due to the fact that the FW/HW will process the
> SEND only after processing the RECV.  If the app uses a different QP to post
> the SEND advertising the RECV, then the race condition exists allowing the
> peer to SEND into that RECV buffer before the HW makes it ready.
> 
> This all assumes a specific design of rdma hw.  Maybe nobody else has this
> issue?
> 
> Maybe I'm not making sense. :)
> 

I'm following you.

Applications assume that when post_recv() returns the RECV WR is on 
the queue. There is no API for the RDMA application writer to query 
the availability/eligibility of the RECV, so this is a reasonable and 
necessary assumption.
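
To make the t0..t6 sequence concrete, here is a minimal sketch of the race
as a toy model in C. It is not verbs code and does not describe any real
adapter: struct model_rq, the model_* functions and RQ_DEPTH are all
hypothetical names invented for illustration, and the "device" is just a
counter that catches up when model_device_process() is called. The
same-QP case assumes, as described above, that the FW/HW processes the
SEND carrying the credit only after it has processed the RECV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RQ_DEPTH 8 /* hypothetical queue depth for the model */

/* Toy receive queue: counters only, no real buffers or DMA. */
struct model_rq {
    size_t posted;   /* WRs written by the host, doorbell rung (t1, t2) */
    size_t fetched;  /* WRs the device has DMA'd and armed (t5, t6)     */
    size_t consumed; /* armed WRs used up by incoming SENDs             */
};

/* t0..t3: the host builds the WR, rings the doorbell and returns.
 * Nothing here waits for the device. */
static bool model_post_recv(struct model_rq *rq)
{
    if (rq->posted - rq->consumed >= RQ_DEPTH)
        return false; /* queue full */
    rq->posted++;
    return true;
}

/* t5..t6: the device catches up on every doorbell rung so far. */
static void model_device_process(struct model_rq *rq)
{
    rq->fetched = rq->posted;
}

/* An incoming SEND lands only if an armed buffer is available. */
static bool model_incoming_send(struct model_rq *rq)
{
    assert(rq->consumed <= rq->fetched && rq->fetched <= rq->posted);
    if (rq->consumed == rq->fetched)
        return false; /* buffer posted but not yet armed: the race */
    rq->consumed++;
    return true;
}

/* Credit advertised out of band or on a different QP: nothing orders
 * the peer's SEND behind the device's processing of the RECV, so in
 * the worst case it arrives at t4. */
static bool model_credit_other_qp(struct model_rq *rq)
{
    if (!model_post_recv(rq))
        return false;
    return model_incoming_send(rq);
}

/* Credit sent on the same QP: the device handles the RECV before the
 * SEND that carries the credit, so the buffer is armed before the
 * peer can possibly respond. */
static bool model_credit_same_qp(struct model_rq *rq)
{
    if (!model_post_recv(rq))
        return false;
    model_device_process(rq);
    return model_incoming_send(rq);
}
```

In this model model_credit_other_qp() fails on a fresh queue and
model_credit_same_qp() succeeds. The only difference between them is
whether the device is guaranteed to have processed the RECV before the
peer learns about it.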