[openib-general] posting send requests in RTR

Fri Jul 28 11:40:55 PDT 2006

Rimmer, Todd wrote:
>> From: Caitlin Bestler [mailto:caitlinb at broadcom.com]
>> 
>> That assumes that there is any valid reason for an application to
>> post send requests before the connection is established. While there
>> is clearly a need to post receive work requests before the
>> connection is established, I cannot think of any reason why an
>> application needs to pre-prime the send queue. 
>> 
>> Putting unneeded complexity in the definition of a hardware service
>> just invites more hardware dependencies and eventual hardware
>> specific bugs that will complicate life for application developers.
>> "Don't post until the connection is established" is very simple for
>> *both* the application and the verbs provider.
> 
> Here is a real-world example, and how we uncovered this issue:
> native IB storage SRP Targets.
> 
> SRP Targets implement the passive side: after processing the
> REQ, they send a REP.  At that point the target QP is still in RTR.
> 
> The SRP client gets the REP, sends the RTU and announces to
> the OS that a device is available.  The OS immediately begins issuing
> SCSI commands. 
> 
> The target receives the SCSI commands (such as Test Unit Ready or
> Inquiry) and wants to act on them immediately.  However, if
> the command has passed the RTU or the RTU is lost, the target
> is still in RTR.  If the command was very simple, the target
> may want to answer the query immediately by posting a send
> with the response.  If the RTU is totally lost and async
> event processing is delayed, the target may even be able to
> do some processing and still not have the QP in RTS when it
> has its response ready.
> 
> While it's possible to build an additional queuing point above
> the send Q, such queuing points tend to impact performance
> and latency for high performance protocols.
> 
> While this example was for SRP, it's not unique to SRP.  Most
> protocols (such as SRP and SDP) include a set of
> initialization or capability query messages which the client
> may issue to the target immediately after the client believes
> it has a connection.  Many of those initialization messages
> are the types of things which the target can answer
> immediately and may even choose to do so in its completion handler.
> 
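
For reference, the "additional queuing point above the send Q" mentioned
above could look roughly like the sketch below. It is a toy model, not
driver code: struct toy_conn, toy_hw_post(), toy_send_response() and
toy_on_established() are hypothetical names, and the hw_posted array is
only a stand-in for the real send queue. Responses produced while the QP
is still in RTR are held in software and flushed in order once the
connection is established.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_MAX 8   /* assumed limit on held-back responses */

/* Hypothetical per-connection state for a target. */
struct toy_conn {
    int    rts;                        /* nonzero once the QP is in RTS */
    int    pending[PENDING_MAX];       /* response ids held while in RTR */
    size_t npending;
    int    hw_posted[PENDING_MAX * 2]; /* stand-in for the hardware send queue */
    size_t nposted;
};

/* Stand-in for posting one send work request to the hardware. */
static int toy_hw_post(struct toy_conn *c, int id)
{
    if (c->nposted == sizeof(c->hw_posted) / sizeof(c->hw_posted[0]))
        return -1;                     /* send queue full */
    c->hw_posted[c->nposted++] = id;
    return 0;
}

/* Post directly when in RTS; otherwise hold the response in software. */
static int toy_send_response(struct toy_conn *c, int id)
{
    assert(c->npending <= PENDING_MAX);
    if (c->rts)
        return toy_hw_post(c, id);
    if (c->npending == PENDING_MAX)
        return -1;                     /* software queue full */
    c->pending[c->npending++] = id;
    return 0;
}

/* Called when the RTU arrives (or the established event is handled). */
static int toy_on_established(struct toy_conn *c)
{
    size_t i;

    c->rts = 1;
    for (i = 0; i < c->npending; i++)  /* flush in arrival order */
        if (toy_hw_post(c, c->pending[i]) != 0)
            return -1;
    c->npending = 0;
    return 0;
}
```

The extra branch and copy on every response are the kind of overhead the
quoted text objects to; the sketch only shows that the queuing can live
above the verbs layer.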

When the QP is ready to accept postings to the Send Queue it should
report Connection Established to the Consumer. Before it has reported
that, any attempt by the Consumer to post should return an error.
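
As a minimal sketch of that rule (a toy model only; struct toy_qp,
toy_qp_establish() and toy_post_send() are hypothetical names, and the
choice of -EINVAL as the error is an assumption, not taken from any
verbs specification):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical QP states, loosely following the IB state names. */
enum toy_qp_state { TOY_QPS_INIT, TOY_QPS_RTR, TOY_QPS_RTS };

struct toy_qp {
    enum toy_qp_state state;
    int established_reported;   /* set once the Consumer has been told */
    int sends_posted;
};

/*
 * Make the QP ready to send first, and only then report
 * Connection Established to the Consumer.
 */
static void toy_qp_establish(struct toy_qp *qp)
{
    assert(qp->state == TOY_QPS_RTR);  /* only valid from RTR in this model */
    qp->state = TOY_QPS_RTS;
    qp->established_reported = 1;
}

/* Returns 0 on success, -EINVAL if the QP cannot yet accept sends. */
static int toy_post_send(struct toy_qp *qp)
{
    if (qp->state != TOY_QPS_RTS)
        return -EINVAL;                /* "early" post is rejected, not queued */
    qp->sends_posted++;
    return 0;
}
```

In this model a post while the QP is in RTR fails cleanly, and the same
post succeeds once toy_qp_establish() has run.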

The problem here is that you are informing the Consumer that the
connection is established before you are informing the QP itself.

Well, effectively you are, because the application naturally assumes
that the RDMA device cannot successfully complete a receive work
request until after the connection is established. And in fact
that assumption is correct.

I'm not concerned with how IB drivers resolve this issue. What I
am concerned about is that the fix not create any sort of expectation
that the RDMA Device MUST accept and queue "early" send requests.