[openib-general] posting send requests in RTR

Rimmer, Todd trimmer at silverstorm.com
Fri Jul 28 06:50:07 PDT 2006


> From: Rimmer, Todd
> 
> > From: Sean Hefty [mailto:sean.hefty at intel.com]
> >
> > >That assumes that there is any valid reason for an application to
> > >post send requests before the connection is established. While there
> > >is clearly a need to post receive work requests before the connection
> > >is established I cannot think of any reason why an application needs
> > >to pre-prime the send queue.
> >
> > It's not pre-priming the send queue.  An application could pull a
> > receive work completion off of a CQ.  The receive may very well be a
> > request that requires a response.  At this point, the connection is
> > obviously established from the consumer's viewpoint, but not
> > necessarily from the viewpoint of the RDMA CM or IB CM.  The response
> > must now be queued.
> >
> > I believe that the problem can be limited under the following
> > application conditions:
> >
> > 1. The application uses the CQ with different QPs.
> > 2. The application is on the passive (server) side of the connection.
> > 3. The active (client) side sends a request to the server.
> >
> > Even combined, these conditions could easily occur.
> >
> > IMO, the question is do we pass this problem to the applications to
> > deal with, or try to handle it transparently under verbs.  If we try
> > to handle it under verbs, can it be done in one place?  How much
> > would such checks affect the performance of post send operations?
> > And how would immediate or other errors be handled when posting
> > queued sends?
> >
> > My personal take at the moment is to let the ULPs handle the problem.
> >
> I feel this is best solved in the verbs driver and will add no more
> than one QP state test to the data path.  The verbs driver will need
> to test the QP state; if it's RTR, it should process as much of the
> WQE as it can without notifying the HCA.  For Mellanox silicon, this
> would mean building the WQE but not ringing the doorbell.
> Theoretically other HCAs may require other algorithms (such as a
> sidebar TX queue).  Since the PathScale/QLogic HCA does QP management
> in software, its solution should be in software as well.  Not sure
> about the IBM eHCA.
> 
> Immediate errors would be tested for as they are presently.  Any
> immediate error tests would precede this test.
> 
> On transition from RTR to RTS, the verbs driver would appropriately
> notify the HCA of the queued send WQEs.  For a Mellanox HCA this would
> involve ringing the appropriate doorbell.
> 
> It is important that we keep the writing of applications simple.
> Requiring applications and ULPs to solve subtle races like this almost
> guarantees they won't be solved.  As Open Fabrics use expands we will
> find more developers implementing to the APIs.  The easier we make it,
> the more likely Open Fabrics use will expand.
> 
> Todd Rimmer
> 

One addition: poll_cq would also need to test for an error state, in
which case it would simulate the Flushed events for those queued SendQ
WQEs.
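
To make the proposal concrete, here is a minimal sketch of the three
pieces (post send in RTR, the RTR to RTS transition, and poll_cq in an
error state) as a toy model in plain C.  It is not taken from any verbs
driver: every name in it (model_qp, model_post_send, model_modify_qp,
model_poll_cq, ring_doorbell) is hypothetical, the "doorbell" is just a
counter, and real completions from the HCA are not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names are hypothetical, not a real verbs API. */
enum qp_state { QP_INIT, QP_RTR, QP_RTS, QP_ERR };
enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR };

#define SQ_DEPTH 8

struct model_qp {
    enum qp_state state;
    unsigned long wr_id[SQ_DEPTH]; /* built send WQEs, oldest first */
    size_t built;       /* WQEs written into the send queue */
    size_t announced;   /* WQEs the HCA has been told about */
    size_t flushed;     /* deferred WQEs already reported as flushed */
    unsigned doorbells; /* how many times the "doorbell" was rung */
};

struct model_wc {
    unsigned long wr_id;
    enum wc_status status;
};

/* Stand-in for notifying the HCA of every WQE built so far. */
static void ring_doorbell(struct model_qp *qp)
{
    assert(qp->state == QP_RTS);
    qp->announced = qp->built;
    qp->doorbells++;
}

int model_post_send(struct model_qp *qp, unsigned long wr_id)
{
    /* Immediate error checks come first, as they do today. */
    if (qp->state != QP_RTR && qp->state != QP_RTS)
        return -1;
    if (qp->built == SQ_DEPTH)
        return -1;

    qp->wr_id[qp->built++] = wr_id; /* build the WQE */

    /* The one added state test: in RTR, do not notify the HCA. */
    if (qp->state == QP_RTS)
        ring_doorbell(qp);
    return 0;
}

void model_modify_qp(struct model_qp *qp, enum qp_state next)
{
    bool to_rts = (qp->state == QP_RTR && next == QP_RTS);

    qp->state = next;
    /* On RTR -> RTS, tell the HCA about sends queued while in RTR. */
    if (to_rts && qp->announced < qp->built)
        ring_doorbell(qp);
}

/* Returns 1 and fills *wc if a simulated flush completion is available. */
int model_poll_cq(struct model_qp *qp, struct model_wc *wc)
{
    size_t next = qp->announced + qp->flushed;

    if (qp->state != QP_ERR || next >= qp->built)
        return 0;

    /* The HCA never saw this WQE, so the driver reports the flush. */
    wc->wr_id = qp->wr_id[next];
    wc->status = WC_WR_FLUSH_ERR;
    qp->flushed++;
    return 1;
}
```

In this model a send posted in RTR is built but leaves the doorbell
count at zero; the move to RTS rings it once for everything queued.  If
the QP goes to an error state instead, model_poll_cq hands back one
flush completion per send the HCA was never told about.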

Todd Rimmer



