[ofa-general] Re: Demand paging for memory regions

Robin Holt holt at sgi.com
Fri Feb 15 01:55:48 PST 2008


On Fri, Feb 15, 2008 at 07:47:06AM +1100, David Singleton wrote:
> Caitlin Bestler wrote:
>> But the broader question is what the goal is here. Allowing memory to
>> be shuffled is valuable, and perhaps even ultimately a requirement for
>> high availability systems. RDMA and other direct-access APIs should
>> be evolving their interfaces to accommodate these needs.
>> Oversubscribing memory is a totally different matter. If an application
>> is working with memory that is oversubscribed by a factor of 2 or more
>> can it really benefit from zero-copy direct placement? At first glance I
>> can't see what RDMA could be bringing of value when the overhead of
>> swapping is going to be that large.
>
> A related use case from HPC.  Some of us have batch scheduling
> systems based on suspend/resume of jobs (which is really just
> SIGSTOP and SIGCONT of all job processes).  The value of this
> system is enhanced greatly by being able to page out the suspended
> job (just normal Linux demand paging caused by the incoming job is
> OK).  Apart from this (relatively) brief period of paging, both
> jobs benefit from RDMA.
>
> SGI kindly implemented a /proc mechanism for unpinning XPMEM
> pages so that suspended jobs can be paged out on their Altix system.
>
> Note that this use case would not benefit from Pete Wyckoff's
> approach of notifying user applications/libraries of VM changes.

We will be implementing xpmem on top of mmu_notifiers (actively working
on that now), so in that case you would no longer need the
/proc/xpmem/<pid> mechanism for unpinning.  Hopefully, we will have xpmem
in before 2.6.26 and get it into the base OS instead of shipping it as
an add-on.  Oh yeah, and memory migration will not need the unpin step
either, so you can move smaller jobs around more easily.
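
For anyone not familiar with the suspend/resume scheme David describes
above, here is a minimal sketch of the signal side of it: SIGSTOP to every
process in the outgoing job, SIGCONT when it is scheduled again. The
function names and the sleeping child used as a stand-in "job" are
hypothetical, made up for illustration; this is not the code of any real
batch scheduler, and it does nothing about pinned pages, which is the
part the /proc mechanism and mmu_notifiers address.

```python
import os
import signal
import subprocess
import sys


def suspend_job(pids):
    """Suspend a job by sending SIGSTOP to each of its processes."""
    for pid in pids:
        os.kill(pid, signal.SIGSTOP)


def resume_job(pids):
    """Resume a suspended job by sending SIGCONT to each of its processes."""
    for pid in pids:
        os.kill(pid, signal.SIGCONT)


def demo():
    """Stop and continue one child process; report what the kernel saw."""
    # Hypothetical stand-in for a job: a single child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    pids = [child.pid]
    try:
        suspend_job(pids)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        resume_job(pids)
        # WCONTINUED makes waitpid report a child resumed by SIGCONT.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
    finally:
        # Clean up the stand-in job.
        child.kill()
        child.wait()
    return stopped, continued


if __name__ == "__main__":
    print(demo())
```

While the job is stopped, its unpinned pages are ordinary candidates for
reclaim when the incoming job needs memory; pinned RDMA/XPMEM pages are
not, which is why an unpin step was needed.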

Thanks,
Robin


