[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

Roland Dreier roland at topspin.com
Tue Apr 26 14:23:28 PDT 2005


    Andrew> Well I was vaguely proposing that the userspace library
    Andrew> keep track of the byteranges and the underlying page
    Andrew> states.  So in the above scenario userspace would leave
    Andrew> the page at 0x1000 registered until all registrations
    Andrew> against that page have been undone.

OK, I already have code in userspace that keeps reference counts for
overlapping regions, etc.  However I'm not sure how to tie this in
with reliable accounting of pinned memory -- we don't want malicious
userspace code to be able fool the accounting, right?

So I'm still trying to puzzle out what to do.  I don't want to keep a
complicated data structure in the kernel keeping track of what memory
has been registered.  Right now, I just keep a list of structs, one
for each region, and when a process dies, I just go through region by
region and do a put_page() to balance off the get_user_pages().

However I don't see how to make it work if I put the reference
counting for overlapping regions in userspace but when I want mlock()
accounting in the kernel.  If a buggy/malicious app does:

    a) register from 0x0000 to 0x2fff
    b) register from 0x1000 to 0x1fff
    c) unregister from 0x0000 to 0x2fff

then it seems the kernel is screwed unless it counts how many times a
vma has been pinned.  And adding a pin_count member to vm_struct seems
like a pretty damn major step.

We definitely have to make sure that userspace is never able to either
unpin a page that is still registered with RDMA hardware, because that
can lead to DMA to into memory that someone else owns.  On the other
hand, we don't want userspace to be able to defeat resource accounting
by tricking the kernel into keeping page_count elevated after it
credits the memory back to a process's limit on locked pages.

The limit on the number of locked pages seems like a natural thing to
check against, but perhaps we need a different limit for the number of
pages pinned for use by RDMA hardware.  Sort of the same way that
there's a separate limit on the number of in-flight aios.

 - R.




More information about the general mailing list