[openib-general] Re: madvise MADV_DONTFORK/MADV_DOFORK
Hugh Dickins
hugh at veritas.com
Mon Feb 13 10:23:36 PST 2006
On Mon, 13 Feb 2006, Michael S. Tsirkin wrote:
> OK, I guess it's time to start the push for merging this patch.
> Probably not 2.6.16 material, but it would be nice to get this say into -mm to
> make it easier to test this. Tested on x86_64 only.
>
> Please Cc me directly with comments, I'm not on the list.
>
> ---
>
> Add madvise options to control whether memory range is inherited across fork.
> Useful e.g. for when hardware is doing DMA from/into these pages.
>
> Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
Looks good to me, Michael (but Gleb's eye has always proved better than
mine). Just a couple of adjustments I'd ask before you send to Andrew:

1. Please just drop your mm/mmap.c vm_stat_account() mod:
> 	if (flags & VM_HUGETLB) {
> -		if (!(flags & VM_DONTCOPY))
> +		if (!(flags & (VM_DONTCOPY|VM_DONTFORK)))
> 			mm->shared_vm += pages;
Conscientious of you to include that, but (a) if it's right, then you'd
need to be fiddling shared_vm up and down whenever madvise changes
VM_DONTFORK, and none of us much want to get into that; and (b) I cannot
for the life of me work out what that VM_HUGETLB VM_DONTCOPY block is
about in the first place - I can't even find any instance of VM_HUGETLB
with VM_DONTCOPY to judge it by. Luckily wli, CC'ed, is the expert on
both hugetlb and vm_stat_account - I hope he'll just tell us that block
is wrong and should be deleted (which he or I could do as an unrelated
patch). Perhaps it was an inappropriate hack to prevent some count
going negative, from the days when we forgot to correct total_vm in
the VM_DONTCOPY case in dup_mmap.

2. Your two-line changeset comment should be expanded: mention Infiniband,
mention get_user_pages, explain how frustrating it is for the carefully
pinned page to be orphaned from its user address space by a stray Copy-
On-Write, if the process happens to fork meanwhile; and how VM_DONTFORK
can be used to secure areas against that possibility. Mention how it
could also be useful to an application, wanting to speed up its forks
by cutting large areas out of consideration. Some of that information
will be useful to Michael Kerrisk when he updates the madvise man page.
Explain that MADV_DONTFORK should be reversible, hence MADV_DOFORK;
but should not be reversible on areas a driver has so marked, hence
VM_DONTFORK distinct from VM_DONTCOPY.
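
For illustration only, here is a minimal userspace sketch of the behaviour described in point 2. It assumes the interface is exposed to applications as madvise(2) with MADV_DONTFORK and MADV_DOFORK from <sys/mman.h>; the helper name child_can_read() and its two flags are hypothetical, made up for this example, and are not part of the patch. It maps one anonymous page, optionally marks it MADV_DONTFORK (and optionally reverses that with MADV_DOFORK), forks, and reports whether the child can still read the page.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS and MADV_DONTFORK/MADV_DOFORK */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if a forked child can read the page, 0 if the child was
 * killed by SIGSEGV when touching it, and -1 on any setup error.
 */
int child_can_read(int mark_dontfork, int undo_with_dofork)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *buf;
	pid_t pid;
	int status;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	buf[0] = 42;

	/* Ask the kernel not to give the child a copy of this range. */
	if (mark_dontfork && madvise(buf, len, MADV_DONTFORK) != 0)
		goto fail;
	/* Reverse the request, so the range is inherited again. */
	if (undo_with_dofork && madvise(buf, len, MADV_DOFORK) != 0)
		goto fail;

	pid = fork();
	if (pid < 0)
		goto fail;
	if (pid == 0) {
		/* Child: this read faults if the range was not inherited. */
		volatile char *p = buf;
		_exit(p[0] == 42 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid)
		goto fail;
	munmap(buf, len);

	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
		return 0;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 1;
	return -1;

fail:
	munmap(buf, len);
	return -1;
}
```

Calling child_can_read(0, 0) exercises the ordinary fork path, child_can_read(1, 0) the MADV_DONTFORK case, and child_can_read(1, 1) the MADV_DOFORK reversal. The sketch does not pin pages with get_user_pages or involve a driver; it only shows the inheritance side of the story.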

Thanks,
Hugh