[openib-general] [libmthca] deadlock while trying to destroy QP
guyg
guyg at Voltaire.COM
Mon Feb 5 08:43:14 PST 2007
Hi Roland,
I am running a proprietary userspace test over OFED 1.1.
I have one context where I poll my CQ and another (a signal handler
context) where I try to destroy my QP.
It looks like mthca_destroy_qp is trying to take a lock that
mthca_poll_cq is holding.
The deadlock occurs at the end of the test run, when there are
no more completions, so the test never exits.
Here is the backtrace from the core dump:
#0 0x0000003a6ce09172 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#1 0x0000002a959cf449 in mthca_cq_clean (cq=0x607240, qpn=3277830, srq=0x0) at src/cq.c:554
#2 0x0000002a959d28b9 in mthca_destroy_qp (qp=0x607400) at src/mthca.h:246
#3 0x000000000040117b in client_sig_handler ()
#4 <signal handler called>
#5 0x0000003a6ce09165 in pthread_spin_lock () from /lib64/tls/libpthread.so.0
#6 0x0000002a959cec91 in mthca_poll_cq (ibcq=0x607240, ne=1, wc=0x7fbffff590) at src/cq.c:467
#7 0x0000002a9557bf73 in ibv_poll_cq (cq=0x607240, num_entries=1, wc=0x7fbffff590) at /usr/local/ofed/include/infiniband/verbs.h:824
Does destroy_qp need to depend on the CQ?
Do you have any suggestions?
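
One possible workaround, given only as a minimal sketch: the backtrace shows the signal handler interrupting mthca_poll_cq while it holds the CQ spinlock and then spinning on that same lock, so the handler could just set a flag and leave the teardown to the polling context. In the sketch, fake_poll_cq and fake_destroy_qp are hypothetical stand-ins for ibv_poll_cq and ibv_destroy_qp so that it compiles without libibverbs; install_handler, run_poll_loop and the choice of SIGINT are also assumptions, not part of the real test.

```c
#include <assert.h>
#include <signal.h>

/* Written by the signal handler, read by the polling loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Records whether the (stand-in) QP teardown has run. */
static int qp_destroyed = 0;

/* The handler only records the request. It takes no locks, so it
 * cannot spin on a lock held by the code it interrupted. */
static void client_sig_handler(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Hypothetical stand-in for ibv_poll_cq(): reports no completions. */
static int fake_poll_cq(void)
{
    return 0;
}

/* Hypothetical stand-in for ibv_destroy_qp(). */
static void fake_destroy_qp(void)
{
    qp_destroyed = 1;
}

/* Installs the handler for SIGINT (assumed signal); 0 on success. */
static int install_handler(void)
{
    return signal(SIGINT, client_sig_handler) == SIG_ERR ? -1 : 0;
}

/* Polls until a stop is requested, then destroys the QP from the
 * polling context, where the CQ lock is not held.
 * Returns the number of poll calls made. */
static int run_poll_loop(void)
{
    int polls = 0;

    while (!stop_requested) {
        fake_poll_cq();
        polls++;
    }

    assert(!qp_destroyed);   /* guard against a double destroy */
    fake_destroy_qp();
    return polls;
}
```

With this arrangement the destroy call can never run while the polling code holds the CQ lock in the same thread, at the cost of the teardown being delayed until the next pass through the loop.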
Thanks,
Guy