[ofa-general] post_recv question

Thu Feb 21 08:34:47 PST 2008

Sean Hefty wrote:
>>> I have a question regarding exactly _when_ a posted recv buffer is
>>> available for the HW to use:
>>>
>>> Consider that the post_recv methods usually just program a hw-specific
>>> WR in the RQ, then ring a doorbell, then return.  There is a delta
>>> period between when the app returns from the post_recv call and when the
>>> HW actually DMAs the WR and programs up the HW to enable that buffer.
>>> (I'm assuming a specific HW design here, but I _think_ most HW behaves
>>> this way?).
>>>
>>> If this is all true, then from the app's point of view, the buffer isn't
>>> really available when it returns from post_recv.  This can lead to
>>> conditions where the app advertises that recv buffer to the peer via
>>> some out of band channel, and the peer posts a SEND which arrives
>>> _before_ the HW has actually set up the RECV buffer.
> 
> I'm really not following the question here.  When you say that the app
> advertises the buffer, are you saying that it sends some sort of credit that a
> receive is posted?  

Yes.

> I would fully expect the receive buffer to be available to
> receive data before post_recv returns, but I'm not sure what race you're referring
> to.  Are you suggesting that this isn't the case?
> 

That is what I'm suggesting.

Here is the timing sequence:

t0: app calls post_recv
t1: post_recv code builds a hw-specific WR in the hw work queue
t2: post_recv code rings a doorbell (write to adapter mem or register)
t3: post_recv returns
t4: <app assumes the buffer is ready>
t5: device HW DMA engine moves the WR to adapter memory
t6: device FW prepares the HW RQ entry making the buffer available.

Note that at time t4 the application thinks the buffer is ready, but it 
is not actually ready until t6.
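
The window between t3 and t6 can be sketched with a toy model. This is only an illustration under the assumptions stated in this mail: ToyAdapter and its methods are hypothetical names, not part of any verbs API, and what a real device does when a SEND finds no ready buffer is device-specific.

```python
# Toy model of the t0..t6 timeline; not a real verbs API or a real adapter.
from collections import deque


class ToyAdapter:
    """Hypothetical adapter that fetches posted WRs asynchronously."""

    def __init__(self):
        self.host_rq = deque()   # WRs built in the host work queue (t1)
        self.doorbells = 0       # doorbell rings not yet serviced (t2)
        self.ready_rq = deque()  # buffers the HW can place data into (t6)

    def post_recv(self, buf_id):
        self.host_rq.append(buf_id)  # t1: build the WR
        self.doorbells += 1          # t2: ring the doorbell
        # t3: return to the caller; nothing has been fetched yet

    def service_doorbells(self):
        # t5/t6: move the WRs to the adapter and make the buffers available
        while self.doorbells:
            self.ready_rq.append(self.host_rq.popleft())
            self.doorbells -= 1

    def incoming_send(self):
        # Return the buffer that takes the message, or None if no buffer
        # is ready yet (the real outcome depends on the device).
        return self.ready_rq.popleft() if self.ready_rq else None


adapter = ToyAdapter()
adapter.post_recv("buf0")        # t0..t3
early = adapter.incoming_send()  # a SEND arriving at t4
adapter.service_doorbells()      # t5..t6
late = adapter.incoming_send()   # a SEND arriving after t6
print(early, late)
```

In this model the SEND that arrives at t4 finds no buffer, while the one that arrives after t6 lands in buf0.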

This is clearly an implementation-specific issue.  But I was under the 
assumption that all RDMA HW behaves this way.  Maybe not?

To further complicate things, this race condition is never seen _if_ the 
application uses the same QP to advertise (send a credit allowing the 
peer to SEND) the RECV buffer availability.  So if the app posts a SEND 
after the RECV is posted and that SEND allows the peer access to the 
RECV buffer, then everything is ok.  This is due to the fact that the 
FW/HW will process the SEND only after processing the RECV.  If the app 
uses a different QP to post the SEND advertising the RECV, then the race 
condition exists allowing the peer to SEND into that RECV buffer before 
the HW makes it ready.
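
The same-QP versus different-QP cases can be sketched the same way. This assumes, as described above, that the FW/HW handles the WRs of one QP in posting order and gives no ordering guarantee between QPs. ToyQP, run and scenario are hypothetical names used only for this illustration.

```python
from collections import deque


class ToyQP:
    """Hypothetical QP whose WRs the adapter processes in posting order."""

    def __init__(self):
        self.pending = deque()   # posted but not yet processed by FW/HW
        self.recv_ready = False  # True once the RECV buffer is usable

    def post(self, kind):
        self.pending.append(kind)


def run(schedule, qps, recv_qp):
    """Process one pending WR per schedule entry.

    Returns True if the credit SEND left the adapter before the RECV
    on recv_qp was ready, i.e. the race described above.
    """
    for name in schedule:
        qp = qps[name]
        if not qp.pending:
            continue
        wr = qp.pending.popleft()
        if wr == "RECV":
            qp.recv_ready = True
        elif wr == "CREDIT_SEND" and not qps[recv_qp].recv_ready:
            return True
    return False


def scenario(credit_qp, schedule):
    # The RECV is always posted on QP "A"; the credit goes out on credit_qp.
    qps = {"A": ToyQP(), "B": ToyQP()}
    qps["A"].post("RECV")
    qps[credit_qp].post("CREDIT_SEND")
    return run(schedule, qps, recv_qp="A")


# Credit on the same QP: no processing order exposes the race.
same_qp = [scenario("A", s) for s in (["A", "B", "A"], ["B", "A", "A"])]
# Credit on another QP: the race appears if that QP is serviced first.
other_qp_race = scenario("B", ["B", "A"])
other_qp_ok = scenario("B", ["A", "B"])
print(same_qp, other_qp_race, other_qp_ok)
```

With the credit on the same QP the model never reports the race; with the credit on a second QP the outcome depends on which QP the adapter happens to service first.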

This all assumes a specific design of RDMA HW.  Maybe nobody else has 
this issue?

Maybe I'm not making sense. :)

Steve.