[ofa-general] Application blocked in mthca_poll_cq

Bharath Ramesh bramesh at vt.edu
Mon Nov 5 13:59:53 PST 2007


* Roland Dreier (rdreier at cisco.com) wrote:
>  > I am not sure about the version of OFED being used, but its most likely
>  > OFED-1.2. Is there any way to find the version of OFED used. libmthca.so
>  > points to libmthca-rdmav2.so. I am not sure if this helps. My application is
>  > multithreaded, every time this happens when I try to attach the process to
>  > gdb I find that mthca_poll_cq is the one blocking and sometimes the call is
>  > blocking on pthread_spin_unlock. Which is surprising as I wouldnt expect
>  > pthread_spin_unlock to be blocking. I am sure that I am not doing any
>  > use-after-free. I dont destroy the CQ till the application is terminating.
>  > This situation occurs well before the application terminates.
> 
> Yes, it's not really possible for pthread_spin_unlock() to block.  In
> general (on any common architecture -- 32/64-bit x86, powerpc, ia64 --
> at least) pthread_spin_unlock() is just a single store to the spinlock
> memory location.  What that says to me is that either gdb is giving
> you bogus information (quite possible) or perhaps your application is
> not really stuck -- it is just in a tight loop polling a CQ maybe?

I am calling ibv_poll_cq in a tight loop. I think it's quite possible
that I am just looping, but I do quit the loop once I don't have any
more CQ events. I will check that part of the code again.
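
A minimal sketch of the loop shape described above: poll until the queue
reports no completions, and also leave on an error return so the loop
cannot spin forever. `struct fake_cq`, `fake_poll_cq` and `drain_cq` are
hypothetical stand-ins, not part of libibverbs and not the application's
actual code. The only thing borrowed from ibv_poll_cq is its return
convention: the number of completions retrieved, 0 when the CQ is empty,
and a negative value on failure.

```c
/* Hypothetical stand-in for a completion queue: a count of pending entries. */
struct fake_cq {
    int pending;
};

/* Mimics the return convention of ibv_poll_cq(): number of completions
 * fetched (0 if the queue is empty), negative on error. */
static int fake_poll_cq(struct fake_cq *cq, int max_entries)
{
    if (cq == 0 || max_entries <= 0)
        return -1;
    int n = cq->pending < max_entries ? cq->pending : max_entries;
    cq->pending -= n;
    return n;
}

/* Drain the queue. Returns the total number of completions handled,
 * or -1 if the poll call reported an error. */
static int drain_cq(struct fake_cq *cq)
{
    int total = 0;

    for (;;) {
        int n = fake_poll_cq(cq, 16);
        if (n < 0)
            return -1;   /* poll error: stop instead of spinning */
        if (n == 0)
            break;       /* queue empty: this is the exit condition */
        total += n;      /* real code would process each completion here */
    }
    return total;
}
```

If the exit test only checks for a positive count and the empty or error
case falls through, a loop like this never terminates, and a debugger
attached at a random moment will show the thread somewhere inside the
poll call, which would look the same as a blocked thread.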

> 
> (BTW I think you should be able to determine the libmthca version by
> doing "rpm -qi libmthca")

The version of libmthca I am using is 1.0.4.

Thanks,

Bharath

> 
>  - R.
> 

---
Bharath Ramesh       <bramesh at vt.edu>       http://people.cs.vt.edu/~bramesh



