[ofa-general] hang on close in umem_release

Thu Jun 21 08:25:44 PDT 2007

With 2.6.22-rc5, I get a repeatable D state hang of a user space
process upon termination (ctrl-C).  x86_64 SMP, no preempt.

Here's the sysrq-T trace:

app           D ffff81003ec17220     0  2841   2780 (NOTLB)
 ffff81003cec7d78 0000000000000082 ffffffff80227aa0 ffff81003cec7d78
 ffff81003ec17220 ffffffff804d8380 000000000002a161 ffff81003ec173d0
 0000000000000001 0000000100085088 0000000000000001 ffff81003ff2bb40
Call Trace:
 [<ffffffff80227aa0>] default_wake_function+0x0/0x10
 [<ffffffff8025735d>] unlock_page+0x2d/0x40
 [<ffffffff803f1da5>] __down_write_nested+0x85/0xc0
 [<ffffffff803f1deb>] __down_write+0xb/0x10
 [<ffffffff80245039>] down_write+0x9/0x10
 [<ffffffff880919d5>] :ib_core:ib_umem_release+0x75/0x110
 [<ffffffff880f6f6e>] :ib_mthca:mthca_free_mr+0x6e/0xe0
 [<ffffffff880fdb15>] :ib_mthca:mthca_dereg_mr+0x25/0x40
 [<ffffffff8808defd>] :ib_core:ib_dereg_mr+0x2d/0x40
 [<ffffffff8810e78c>] :ib_uverbs:ib_uverbs_close+0x2ac/0x380
 [<ffffffff80282df3>] __fput+0xb3/0x1a0
 [<ffffffff80282f66>] fput+0x16/0x20
 [<ffffffff8028001b>] filp_close+0x4b/0x80
 [<ffffffff802815ec>] sys_close+0x9c/0x100
 [<ffffffff80209b4e>] system_call+0x7e/0x83

It should have an open fd for the rdmacm event channel and an fd
for the CQ event channel, but it does not have any connected QPs at
this point (although it did in the past) and no registered memory
regions, although maybe the app forgot to free one?

Apparently it is here:

        /*
         * We may be called with the mm's mmap_sem already held.  This
         * can happen when a userspace munmap() is the call that drops
         * the last reference to our file and calls our release
         * method.  If there are memory regions to destroy, we'll end
         * up here and not be able to take the mmap_sem.  In that case
         * we defer the vm_locked accounting to the system workqueue.
         */
        if (context->closing && !down_write_trylock(&mm->mmap_sem)) {
                INIT_WORK(&umem->work, ib_umem_account);
                umem->mm   = mm;
                umem->diff = diff;

                schedule_work(&umem->work);
                return;
        } else 
                down_write(&mm->mmap_sem);

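stuck in the down_write on mmap_sem.  Thus either context->closing
is not true, or it is true and the trylock succeeded, in which case
the else branch takes mmap_sem a second time.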
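
A small userspace model of just that branch, to enumerate which
inputs reach the blocking down_write.  This is a sketch, not kernel
code: quoted_branch() and the enum names are made up for
illustration, and the trylock result is passed in as a plain flag
instead of touching a real rw_semaphore.

```c
#include <stdbool.h>

/* What the quoted if/else ends up doing for one combination of inputs. */
enum outcome {
        DEFER_TO_WORKQUEUE,     /* closing and trylock failed: schedule_work() path */
        BLOCKING_DOWN_WRITE,    /* else branch, semaphore not yet held by us */
        SECOND_DOWN_WRITE       /* else branch, trylock already took the semaphore */
};

/*
 * Hypothetical helper mirroring only the control flow of
 *   if (context->closing && !down_write_trylock(&mm->mmap_sem)) {...} else ...
 */
enum outcome quoted_branch(bool closing, bool trylock_succeeds)
{
        bool held_by_us = false;

        /* && short-circuits: the trylock is only attempted when closing is set. */
        if (closing && !(held_by_us = trylock_succeeds))
                return DEFER_TO_WORKQUEUE;

        /* Everything else falls into the else branch and calls down_write(). */
        return held_by_us ? SECOND_DOWN_WRITE : BLOCKING_DOWN_WRITE;
}
```

The model says nothing about who else might hold mmap_sem; it only
shows which of the three paths each input combination selects.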

Is this a known problem?  Is there some more information I can
give you?

		-- Pete