[ofa-general] missed cq event

Pete Wyckoff pw at osc.edu
Tue Jun 10 11:26:55 PDT 2008


rdreier at cisco.com wrote on Tue, 10 Jun 2008 07:56 -0700:
>  > Is it possible that there is a bug in OFED 1.3 with regard to
>  > non-signaled send work requests?  I noticed that when I post send
>  > work requests onto my send queue, it eventually fills up until I
>  > cannot post sends anymore.  This happens with the Chelsio T3 RNIC
>  > and OFED 1.3 whenever I post send WRs that have their flags set
>  > to 0.  It does not happen, though, when I post sends with
>  > IBV_SEND_SIGNALED.  The CQ is empty in the case of non-signaled
>  > WRs (as expected), but they somehow seem to be stuck on the send
>  > queue.
> 
> This is not a bug.  The unsignaled work requests are not considered
> completed until a later signaled work request completes.  So you need to
> periodically post a signaled work request, so that you know when the
> unsignaled sends have completed.

You know, I've never quite liked this fact.  I've seen multiple
people, including myself, get tripped up by this issue.  And now we
have lots of applications with code to determine the queue depth,
increment a counter on every send, periodically post a SIGNALED
one, and handle the uninteresting CQ event.
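
For anyone who has not written that boilerplate yet, here is a minimal
sketch of the counting part only.  It deliberately does not call into
libibverbs: signal_policy, signal_policy_init and signal_policy_next
are hypothetical names made up for this example, and signaling once
per half queue depth is an assumption, not something verbs requires.

```c
#include <assert.h>

/* Hypothetical helper, not part of libibverbs. */
struct signal_policy {
    unsigned interval;    /* request a completion on every interval-th send */
    unsigned unsignaled;  /* sends counted since the last signaled one */
};

/*
 * sq_depth is the send queue depth the application asked for.
 * Assumption: signaling once per half queue is frequent enough;
 * the right interval depends on the application.
 */
static void signal_policy_init(struct signal_policy *p, unsigned sq_depth)
{
    assert(sq_depth > 0);
    p->interval = sq_depth / 2 ? sq_depth / 2 : 1;
    p->unsignaled = 0;
}

/*
 * Call once per send about to be posted.  Returns 1 if that send
 * should carry IBV_SEND_SIGNALED in its send_flags, 0 if it can be
 * posted unsignaled.
 */
static int signal_policy_next(struct signal_policy *p)
{
    if (++p->unsignaled >= p->interval) {
        p->unsignaled = 0;
        return 1;
    }
    return 0;
}
```

With a queue depth of 8 this asks for a completion on every fourth
send.  The application still has to poll the CQ for those completions
and hold off posting while the send queue is full; the sketch only
decides which sends get the flag.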

Trying not to think about how the hardware works for a moment, this
issue can be seen as yet another little detail that apps people must
learn to use RDMA devices.  Any ideas on how verbs could handle this
case itself?

Sorry for the rant.

		-- Pete