[openib-general] [libmthca] deadlock while trying to destroy QP

guyg guyg at Voltaire.COM
Mon Feb 5 08:43:14 PST 2007


Hi Roland,

I am running a proprietary userspace test over OFED 1.1.

I have one context where I poll my CQ and another (a signal handler
context) where I try to destroy my QP.

It looks like mthca_destroy_qp is trying to take a lock that 
mthca_poll_cq is holding.

The deadlock occurs at the end of the test run, when there are no
more completions; the process hangs and the test never exits.

Here is the backtrace from a core dump:

#0  0x0000003a6ce09172 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#1  0x0000002a959cf449 in mthca_cq_clean (cq=0x607240, qpn=3277830, srq=0x0) at src/cq.c:554
#2  0x0000002a959d28b9 in mthca_destroy_qp (qp=0x607400) at src/mthca.h:246
#3  0x000000000040117b in client_sig_handler ()
#4  <signal handler called>
#5  0x0000003a6ce09165 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#6  0x0000002a959cec91 in mthca_poll_cq (ibcq=0x607240, ne=1, wc=0x7fbffff590) at src/cq.c:467
#7  0x0000002a9557bf73 in ibv_poll_cq (cq=0x607240, num_entries=1, wc=0x7fbffff590) at /usr/local/ofed/include/infiniband/verbs.h:824
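
The pattern in the backtrace above can be reduced to the rough sketch
below, which does not use libmthca or libibverbs at all. It assumes the
handler runs on the same thread that was interrupted while holding the
lock, as frames #3 to #6 suggest. The names cq_lock, sig_handler and
handler_sees_lock_held are hypothetical stand-ins, and the handler uses
pthread_spin_trylock instead of pthread_spin_lock so that the sketch
reports the conflict rather than hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>

/* Hypothetical stand-in for the per-CQ spinlock seen in the backtrace. */
static pthread_spinlock_t cq_lock;

/* Result of the trylock attempted inside the handler (-1 = handler not run). */
static volatile sig_atomic_t trylock_result = -1;

/* Stand-in for the signal handler that ends up in mthca_cq_clean(). */
static void sig_handler(int signo)
{
    (void)signo;
    /* A blocking pthread_spin_lock() here would spin forever, because the
     * interrupted code below cannot resume to release the lock. */
    int rc = pthread_spin_trylock(&cq_lock);
    if (rc == 0)
        pthread_spin_unlock(&cq_lock);
    trylock_result = rc;
}

/* Returns what the handler observed: EBUSY if the lock was already held
 * by the code it interrupted, 0 if the lock was free. */
int handler_sees_lock_held(void)
{
    struct sigaction sa;

    sa.sa_handler = sig_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGUSR1, &sa, NULL);

    pthread_spin_init(&cq_lock, PTHREAD_PROCESS_PRIVATE);

    /* Stand-in for the poll path: take the CQ lock ... */
    pthread_spin_lock(&cq_lock);
    /* ... and get interrupted by the signal while still holding it.
     * raise() does not return until the handler has finished. */
    raise(SIGUSR1);
    pthread_spin_unlock(&cq_lock);

    pthread_spin_destroy(&cq_lock);
    return trylock_result;
}
```

Calling handler_sees_lock_held() from a small main() (built with
-pthread) should return EBUSY under these assumptions, which matches
what frame #0 is spinning on.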


Does destroy_qp need to depend on the CQ?

Do you have any suggestions?

Thanks,
Guy



