[openib-general] Re: [PATCH] process locked in D state.
Michael S. Tsirkin
mst at mellanox.co.il
Tue Jun 28 12:17:10 PDT 2005
Quoting r. Roland Dreier <rolandd at cisco.com>:
> Subject: Re: [PATCH] process locked in D state.
>
> I would prefer to stick
> with the current code, which has at worst a short window where
> vm_locked is too high.
Right, but it seems this introduces the race for apps doing deregister_mr
properly, too, which used to work fine.
> I don't like making it possible for a buggy
> userspace process to leak vm_locked (even though it is technically the
> process's own fault).
>
> - R.
>
Right, but at least lets not introduce the hole for the good case.
When I call deregister_mr I have the right to expect that I
can lock the memory immediately after that.
So deregister mr should always wait till it has the semaphore.
In the case of close, its the buggy app that suffers, and I'm ok
with handling it in a work queue since you think its worth it -
but lets not try to optimise that case, it should be ok to always
use the workqueue.
---
Split the code handling the well-behaved and the misbehaved app,
to avoid races for the good case, and make behaviour more consistent
for the bad one.
Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
Index: core/uverbs_mem.c
===================================================================
--- core/uverbs_mem.c (revision 2732)
+++ core/uverbs_mem.c (working copy)
@@ -185,35 +185,42 @@ void ib_umem_release(struct ib_device *d
__ib_umem_release(dev, umem, 1);
mm = get_task_mm(current);
- if (mm) {
- /*
- * We may be called with the mm's mmap_sem already
- * held. This can happen when a userspace munmap() is
- * the call that drops the last reference to our file
- * and calls our release method. If there are memory
- * regions to destroy, we'll end up here and not be
- * able to take the mmap_sem.
- *
- * To handle this, we try to grab the mmap_sem, and if
- * we can't get it immediately, we defer the
- * accounting to the system workqueue.
- */
- if (down_write_trylock(&mm->mmap_sem)) {
- mm->locked_vm -= PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
- up_write(&mm->mmap_sem);
- mmput(mm);
- } else {
- struct ib_umem_account_work *work;
-
- work = kmalloc(sizeof *work, GFP_KERNEL);
- if (!work)
- return;
-
- INIT_WORK(&work->work, ib_umem_account, work);
- work->mm = mm;
- work->diff = PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
-
- schedule_work(&work->work);
- }
- }
+ if (!mm)
+ return;
+
+ down_write(&mm->mmap_sem);
+ mm->locked_vm -= PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
+ up_write(&mm->mmap_sem);
+ mmput(mm);
+}
+
+void ib_umem_put(struct ib_device *dev, struct ib_umem *umem)
+{
+ struct ib_umem_account_work *work;
+ struct mm_struct *mm;
+
+ __ib_umem_release(dev, umem, 1);
+
+ mm = get_task_mm(current);
+ if (!mm)
+ return;
+
+ /*
+ * We may be called with the mm's mmap_sem already
+ * held. This can happen when a userspace munmap() is
+ * the call that drops the last reference to our file
+ * and calls our release method. If there are memory
+ * regions to destroy, we'll end up here and not be
+ * able to take the mmap_sem.
+ */
+
+ work = kmalloc(sizeof *work, GFP_KERNEL);
+ if (!work)
+ return;
+
+ INIT_WORK(&work->work, ib_umem_account, work);
+ work->mm = mm;
+ work->diff = PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
+
+ schedule_work(&work->work);
}
Index: core/uverbs_main.c
===================================================================
--- core/uverbs_main.c (revision 2732)
+++ core/uverbs_main.c (working copy)
@@ -133,7 +133,7 @@ static int ib_dealloc_ucontext(struct ib
struct ib_umem_object *memobj;
memobj = container_of(uobj, struct ib_umem_object, uobject);
- ib_umem_release(mr->device, &memobj->umem);
+ ib_umem_put(mr->device, &memobj->umem);
idr_remove(&ib_uverbs_mr_idr, uobj->id);
ib_dereg_mr(mr);
Index: core/uverbs.h
===================================================================
--- core/uverbs.h (revision 2732)
+++ core/uverbs.h (working copy)
@@ -103,6 +103,7 @@ void ib_uverbs_qp_event_handler(struct i
int ib_umem_get(struct ib_device *dev, struct ib_umem *mem,
void *addr, size_t size, int write);
+void ib_umem_put(struct ib_device *dev, struct ib_umem *umem);
void ib_umem_release(struct ib_device *dev, struct ib_umem *umem);
#define IB_UVERBS_DECLARE_CMD(name) \
--
MST
More information about the general
mailing list