[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

Roland Dreier roland at topspin.com
Mon Apr 25 17:02:36 PDT 2005


    Andrew> Whoa, hang on.

    Andrew> The way we expect get_user_pages() to be used is that the
    Andrew> kernel will use get_user_pages() once per application I/O
    Andrew> request.

    Andrew> Are you saying that RDMA clients will semi-permanently own
    Andrew> pages which were pinned by get_user_pages()?  That those
    Andrew> pages will be used for multiple separate I/O operations?

    Andrew> If so, then that's a significant design departure and it
    Andrew> would be good to hear why it is necessary.

The idea is that applications manage the lifetime of pinned memory
regions.  They can do things like post multiple I/O operations without
any page-walking overhead, or pass a buffer descriptor to a remote
host who will send data at some indeterminate time in the future.  In
addition, InfiniBand has the notion of atomic operations, so a cluster
application may be using some memory region to implement a global lock.

This might not be the most kernel-friendly design but it is pretty
deeply ingrained in the design of RDMA transports like InfiniBand and
iWARP (RDMA over IP).

I'm also not opposed to implementing some other mechanism to make this
work, but the combiniation of get_user_pages() in the kernel and
extending mprotect() to allow setting VM_DONTCOPY seems to work fine.

 - R.



More information about the general mailing list