[openib-general] Re: usermode hang in mthca_cq_clean
Roland Dreier
rolandd at cisco.com
Wed Nov 9 12:12:03 PST 2005
Sean> I'm seeing an issue trying to recover from an error in
Sean> userspace. Basically, I allocate a PD, a CQ, and a QP, then
Sean> destroy the QP because of an unrelated error. The destroy
Sean> call takes several seconds to complete, and appears to be
Sean> hung in mthca_cq_clean: line 551. Stepping through the
Sean> while loop there, I'm not falling into the if or else if
Sean> cases. The call does eventually complete.
I think I see the problem. Does this patch fix it for you?
(basically you're doing a benchmark seeing how fast your CPU can go
through the loop 4 billion times ;)
- R.
--- libmthca/src/cq.c (revision 3989)
+++ libmthca/src/cq.c (working copy)
@@ -524,7 +524,7 @@ void mthca_arbel_cq_event(struct ibv_cq
 void mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq)
 {
 	struct mthca_cqe *cqe;
-	int prod_index;
+	uint32_t prod_index;
 	int nfreed = 0;
 
 	pthread_spin_lock(&cq->lock);
@@ -546,7 +546,7 @@ void mthca_cq_clean(struct mthca_cq *cq,
 	 * Now sweep backwards through the CQ, removing CQ entries
 	 * that match our QP by copying older entries on top of them.
 	 */
-	while (--prod_index > cq->cons_index) {
+	while ((int) --prod_index - (int) cq->cons_index >= 0) {
 		cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe);
 		if (cqe->my_qpn == htonl(qpn)) {
 			if (srq)
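
For anyone reading this in the archive, here is a rough standalone sketch of why the old loop condition can spin for so long and why the signed-difference form stops. It is not libmthca code: count_old() and count_new() are hypothetical helpers, the cap argument exists only so the demo terminates, and it assumes cons_index is a uint32_t and that the CQ is empty with both indices at 0 (one plausible reading of Sean's case, not something the thread confirms).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: counts iterations of the pre-patch condition,
 * giving up after `cap` iterations so the demo finishes quickly.
 * In the original code prod_index is an int compared against an
 * unsigned cons_index, so the int is converted to unsigned first.
 * The cast below only makes that implicit conversion visible.
 */
static unsigned long count_old(uint32_t prod, uint32_t cons, unsigned long cap)
{
	int prod_index = (int) prod;
	unsigned long n = 0;

	assert(cap > 0);
	while ((uint32_t) --prod_index > cons) {
		if (++n == cap)
			break;	/* would keep going for roughly 2^32 steps */
	}
	return n;
}

/*
 * Hypothetical helper: same count using the condition from the patch.
 * Converting a uint32_t above INT_MAX to int is implementation-defined
 * in C; this sketch assumes the usual two's-complement result, which
 * the patched condition also relies on.
 */
static unsigned long count_new(uint32_t prod, uint32_t cons, unsigned long cap)
{
	uint32_t prod_index = prod;
	unsigned long n = 0;

	assert(cap > 0);
	while ((int) --prod_index - (int) cons >= 0) {
		if (++n == cap)
			break;
	}
	return n;
}
```

With both indices at 0, count_old(0, 0, 1000) runs into the cap because -1 converts to 0xffffffff, which is greater than 0, while count_new(0, 0, 1000) does no iterations at all. The signed difference also behaves sensibly when the producer index has wrapped past 2^32 and the consumer index has not yet done so.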