[openib-general] get_user_pages() vs. sys_mlock() and 2.6 kernel
Tom Duffy
tduffy at sun.com
Tue Sep 28 17:27:43 PDT 2004
On Tue, 2004-09-28 at 17:20 -0700, Libor Michalek wrote:
> > B) post to the linux-kernel mailing list rather than openib-general
>
> However, I'd be interested if this list was CC'd, since it is very
> applicable to all the zero-copy userspace code.
Too late. He cross-posted to lkml, linux-mm, and kernelnewbies:
> I was hoping that this bug would be fixed in the 2.6 kernels, but
> apparently it hasn't been.
>
> Function get_user_pages() is supposed to lock user memory. However,
> under extreme memory constraints, the kernel will swap out the
> "locked" memory.
>
> I have a test app which does this:
>
> 1) Calls our driver, which issues a get_user_pages() call for one page.
> 2) Calls our driver again to get the physical address of that page
>    (the driver uses pgd/pmd/pte_offset).
> 3) Tries to allocate 1GB of memory (this system has 1GB of physical RAM).
> 4) Tries to get the physical address again.
>
> In step 4, the physical address is usually zero, which means either
> pgd_offset or pmd_offset failed. This indicates the page was swapped
> out.
>
> I don't understand how this bug can continue to exist after all this
> time. get_user_pages() is supposed to lock the memory, because drivers
> use it for DMA'ing directly into user memory.
So far, Christoph Hellwig said:
> get_user_pages locks the page in memory. It doesn't do anything about
> ptes.
And Dave Hansen responded:
> You probably want mlock(2) to keep the kernel from messing with the
> ptes at all. But, you should probably really be thinking about why
> you're accessing the page tables at all. I count *ONE* instance in
> drivers/ where page tables are accessed directly.
Not very helpful...
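
For reference, the mlock(2) suggestion would look roughly like this on the
application side. This is only a sketch: the helper names
alloc_locked_page() and free_locked_page() are made up for illustration and
are not part of any driver or library. It also assumes the process is
allowed to lock one page (mlock() can fail with EPERM or ENOMEM depending
on privileges and limits).

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate one page-aligned page and mlock() it so
 * the kernel keeps it resident. Returns the buffer, or NULL with errno set.
 */
static void *alloc_locked_page(size_t *len_out)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    int err;

    if (page_size <= 0) {
        errno = EINVAL;
        return NULL;
    }

    /* posix_memalign() returns the error code instead of setting errno. */
    err = posix_memalign(&buf, (size_t)page_size, (size_t)page_size);
    if (err != 0) {
        errno = err;
        return NULL;
    }

    /* Lock the page before handing its address to the driver. */
    if (mlock(buf, (size_t)page_size) != 0) {
        int saved = errno;
        free(buf);
        errno = saved;
        return NULL;
    }

    /* Touch the page so it is populated and zeroed. */
    memset(buf, 0, (size_t)page_size);
    *len_out = (size_t)page_size;
    return buf;
}

/* Hypothetical helper: undo alloc_locked_page(). */
static void free_locked_page(void *buf, size_t len)
{
    munlock(buf, len);
    free(buf);
}
```

The test app would call alloc_locked_page() before step 1 and
free_locked_page() after step 4. Note that mlock() only keeps the page
resident; the driver still needs get_user_pages() to hold a reference for
DMA, and it should probably take the address from the struct page it gets
back rather than from a page table walk.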
-tduffy