[openib-general] [PATCH] process locked in D state.
Gleb Natapov
glebn at voltaire.com
Mon Jun 27 01:27:45 PDT 2005
Hello,
Summary:
If I call ibv_close_device() on a context that still has registered
memory (memory that was never deregistered), the process hangs in D state.
This is what happens:
ibv_close_device() closes cmd_fd and then calls free_context().
free_context() calls munmap to unmap the doorbell registers.
In the kernel, sys_munmap takes the mm->mmap_sem semaphore and calls
do_munmap. The mapping is the last user of the file, so do_munmap calls
the release method of the file (ib_uverbs_close() in our case).
ib_uverbs_close() calls ib_dealloc_ucontext(). ib_dealloc_ucontext()
notices that there is still registered memory on the file and calls
ib_umem_release(). There we try to acquire mm->mmap_sem one more time,
from the same task that already holds it, so the process blocks forever.
One way to solve this problem is to call munmap before close in
ibv_close_device(), but this will not stop a malicious user, so it is
not enough.
In the attached patch I use down_write_trylock() instead of down_write()
in ib_umem_release(). If the semaphore is already locked, we do not update
the locked_vm statistics. This way a malicious user can only cause harm to
itself.
The solution is not ideal: if some other process holds mmap_sem
(for instance, someone is doing 'cat /proc/pid/maps'), we will not be
able to update the locked_vm counter, but the chance of this happening
is close to zero.
Index: trunk/src/userspace/libibverbs/src/device.c
===================================================================
--- trunk/src/userspace/libibverbs/src/device.c (revision 2715)
+++ trunk/src/userspace/libibverbs/src/device.c (working copy)
@@ -121,10 +121,9 @@
 	close(context->async_fd);
 	for (i = 0; i < context->num_comp; ++i)
 		close(context->cq_fd[i]);
+	context->device->ops.free_context(context);
 	close(context->cmd_fd);
-	context->device->ops.free_context(context);
-
 	return 0;
 }
Index: trunk/src/linux-kernel/infiniband/core/uverbs_mem.c
===================================================================
--- trunk/src/linux-kernel/infiniband/core/uverbs_mem.c (revision 2715)
+++ trunk/src/linux-kernel/infiniband/core/uverbs_mem.c (working copy)
@@ -163,18 +163,19 @@
 void ib_umem_release(struct ib_device *dev, struct ib_umem *umem)
 {
 	struct mm_struct *mm;
+	int semlocked = 0;
 	mm = get_task_mm(current);
-	if (mm) {
-		down_write(&mm->mmap_sem);
+	if (mm && (semlocked = down_write_trylock (&mm->mmap_sem))) {
 		mm->locked_vm -= PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
 	}
 	__ib_umem_release(dev, umem, 1);
 	if (mm) {
-		up_write(&mm->mmap_sem);
+		if (semlocked)
+			up_write(&mm->mmap_sem);
 		mmput(mm);
 	}
 }
--
Gleb.