[ofa-general] Re: Demand paging for memory regions

David Singleton David.Singleton at anu.edu.au
Thu Feb 14 12:47:06 PST 2008


Caitlin Bestler wrote:
> 
> But the broader question is what the goal is here. Allowing memory to
> be shuffled is valuable, and perhaps even ultimately a requirement for
> high availability systems. RDMA and other direct-access APIs should
> be evolving their interfaces to accommodate these needs.
> 
> Oversubscribing memory is a totally different matter. If an application
> is working with memory that is oversubscribed by a factor of 2 or more
> can it really benefit from zero-copy direct placement? At first glance I
> can't see what RDMA could be bringing of value when the overhead of
> swapping is going to be that large.
> 

A related use case from HPC.  Some of us have batch scheduling
systems based on suspend/resume of jobs (which is really just
SIGSTOP and SIGCONT of all job processes).  The value of this
system is enhanced greatly by being able to page out the suspended
job (just normal Linux demand paging caused by the incoming job is
OK).  Apart from this (relatively) brief period of paging, both
jobs benefit from RDMA.
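
As a rough illustration of the suspend/resume step described above, here is
a minimal sketch in Python. It is not taken from any actual scheduler: the
function names (signal_job, suspend_job, resume_job) and the injectable
kill parameter are hypothetical, and it assumes the scheduler already knows
the PIDs of all processes in the job.

```python
import os
import signal


def signal_job(pids, sig, kill=os.kill):
    """Send `sig` to every process in the job.

    Returns the PIDs that no longer exist, so the caller can decide
    whether the job has partly exited. `kill` is injectable (an
    assumption made here purely so the logic can be exercised
    without real processes).
    """
    gone = []
    for pid in pids:
        try:
            kill(pid, sig)
        except ProcessLookupError:
            # The process exited before we could signal it.
            gone.append(pid)
    return gone


def suspend_job(pids, kill=os.kill):
    # SIGSTOP cannot be caught or ignored, so every process stops.
    return signal_job(pids, signal.SIGSTOP, kill)


def resume_job(pids, kill=os.kill):
    # SIGCONT lets the stopped processes run again.
    return signal_job(pids, signal.SIGCONT, kill)
```

Nothing here pages the suspended job out explicitly. Once its processes are
stopped, ordinary demand paging driven by the incoming job's memory pressure
does that, provided the pages are not pinned.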

SGI kindly implemented a /proc mechanism for unpinning XPMEM
pages to allow suspended jobs to be paged out on their Altix system.

Note that this use case would not benefit from Pete Wyckoff's
approach of notifying user applications/libraries of VM changes.

And one of the grand goals of HPC developers has always been to have
checkpoint/restart of jobs ....

David


