[openib-general] Re: [PATCH] process locked in D state.
Gleb Natapov
glebn at voltaire.com
Tue Jun 28 00:50:57 PDT 2005
On Tue, Jun 28, 2005 at 10:22:22AM +0300, Michael S. Tsirkin wrote:
> Quoting r. Roland Dreier <rolandd at cisco.com>:
> > Subject: Re: [PATCH] process locked in D state.
> >
> > Something like this should work...
>
> Note that there's a window when close() has returned but
> vm_locked isnt updated yet.
This is the case only if close() can return before workqueues are ran.
Is this possible? If yes perhaps it is better to use tascklets.
> Actually, a process that doesnt close memory regions before
> closing the device will leak (virtual) memory anyway.
This is not the kernel problem.
>
> The only interesting case is of the process exiting, in which case mm is
> NULL.
>
> So I propose to kill all this code.
By your patch you completely removed locked_vm accounting on deregister
path. I don't think this was your intention.
I think kernel should cleanup everything it can after close().
If user call ibv_close_device() he expects that the process is in
the state such as ibv_open_device() was never called in the first place.
>
> Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
>
> Index: core/uverbs_mem.c
> ===================================================================
> --- core/uverbs_mem.c (revision 2724)
> +++ core/uverbs_mem.c (working copy)
> @@ -180,40 +180,7 @@ static void ib_umem_account(void *work_p
>
> void ib_umem_release(struct ib_device *dev, struct ib_umem *umem)
> {
> - struct mm_struct *mm;
> -
> + /* A well-behaved application will close regions before
> + * closing the file. Thus we dont have to update mm->locked_vm here. */
> __ib_umem_release(dev, umem, 1);
> -
> - mm = get_task_mm(current);
> - if (mm) {
> - /*
> - * We may be called with the mm's mmap_sem already
> - * held. This can happen when a userspace munmap() is
> - * the call that drops the last reference to our file
> - * and calls our release method. If there are memory
> - * regions to destroy, we'll end up here and not be
> - * able to take the mmap_sem.
> - *
> - * To handle this, we try to grab the mmap_sem, and if
> - * we can't get it immediately, we defer the
> - * accounting to the system workqueue.
> - */
> - if (down_write_trylock(&mm->mmap_sem)) {
> - mm->locked_vm -= PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
> - up_write(&mm->mmap_sem);
> - mmput(mm);
> - } else {
> - struct ib_umem_account_work *work;
> -
> - work = kmalloc(sizeof *work, GFP_KERNEL);
> - if (!work)
> - return;
> -
> - INIT_WORK(&work->work, ib_umem_account, work);
> - work->mm = mm;
> - work->diff = PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
> -
> - schedule_work(&work->work);
> - }
> - }
> }
>
> --
> MST
--
Gleb.
More information about the general
mailing list