[ofa-general] Application blocked in mthca_poll_cq
Bharath Ramesh
bramesh at vt.edu
Mon Nov 5 13:59:53 PST 2007
* Roland Dreier (rdreier at cisco.com) wrote:
> > I am not sure which version of OFED is being used, but it's most likely
> > OFED-1.2. Is there any way to find the version of OFED used? libmthca.so
> > points to libmthca-rdmav2.so; I am not sure if this helps. My application is
> > multithreaded. Every time this happens and I attach gdb to the process,
> > I find that mthca_poll_cq is the one blocking, and sometimes the call is
> > blocking in pthread_spin_unlock, which is surprising, as I wouldn't expect
> > pthread_spin_unlock to block. I am sure that I am not doing any
> > use-after-free. I don't destroy the CQ until the application is terminating.
> > This situation occurs well before the application terminates.
>
> Yes, it's not really possible for pthread_spin_unlock() to block. In
> general (on any common architecture -- 32/64-bit x86, powerpc, ia64 --
> at least) pthread_spin_unlock() is just a single store to the spinlock
> memory location. What that says to me is that either gdb is giving
> you bogus information (quite possible) or perhaps your application is
> not really stuck -- it is just in a tight loop polling a CQ maybe?
I am calling ibv_poll_cq in a tight loop. I think it's quite
possible that I am looping, but I do quit the loop once I don't have any
more CQ events. I will check that part of the code again.
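
For reference, here is a minimal sketch of the kind of drain loop described
above. It is not the application's actual code. So that it compiles without
libibverbs or an HCA, it polls through a function pointer that only mirrors
the calling convention of ibv_poll_cq(cq, num_entries, wc): the return value
is the number of completions, 0 when the CQ is empty, and negative on error.
The names wc_stub, poll_fn, drain_cq, fake_poll and BATCH are all
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ibv_wc; the real structure comes
 * from <infiniband/verbs.h> and has more fields. */
struct wc_stub {
    unsigned long wr_id;
    int status;                 /* 0 means success in this sketch */
};

/* Mirrors the calling convention of ibv_poll_cq(): returns the number
 * of completions written to wc, 0 if the CQ is empty, < 0 on error. */
typedef int (*poll_fn)(void *cq, int num_entries, struct wc_stub *wc);

enum { BATCH = 16 };            /* assumed batch size, not from the thread */

/* Poll until the CQ reports empty or the poll call fails.
 * Returns the number of completions consumed, or the negative
 * value returned by the poll function. */
static long drain_cq(poll_fn poll, void *cq)
{
    struct wc_stub wc[BATCH];
    long total = 0;

    assert(poll != NULL);

    for (;;) {
        int n = poll(cq, BATCH, wc);
        if (n < 0)
            return n;           /* polling failed: report it */
        if (n == 0)
            break;              /* CQ empty: this is the loop exit */
        /* wc[0] .. wc[n-1] would be processed here */
        total += n;
    }
    return total;
}

/* Fake poller for trying the loop without hardware: cq points to an
 * int holding the number of pending completions; a negative value
 * simulates a poll error. */
static int fake_poll(void *cq, int num_entries, struct wc_stub *wc)
{
    int *pending = cq;
    int n, i;

    if (*pending < 0)
        return -1;
    n = *pending < num_entries ? *pending : num_entries;
    for (i = 0; i < n; i++) {
        wc[i].wr_id = (unsigned long)i;
        wc[i].status = 0;
    }
    *pending -= n;
    return n;
}
```

The point to check in the real code is the exit condition. If completions
keep arriving, or the zero return is never tested, a loop like this never
leaves the poll call's caller, and gdb will always show the thread somewhere
inside mthca_poll_cq.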
>
> (BTW I think you should be able to determine the libmthca version by
> doing "rpm -qi libmthca")
The version of libmthca I am using is 1.0.4.
Thanks,
Bharath
>
> - R.
>
---
Bharath Ramesh <bramesh at vt.edu> http://people.cs.vt.edu/~bramesh