[ofa-general] Re: New proposal for memory management
Ralph Campbell
ralph.campbell at qlogic.com
Wed Apr 29 15:28:00 PDT 2009
On Wed, 2009-04-29 at 15:21 -0700, Jason Gunthorpe wrote:
> On Wed, Apr 29, 2009 at 04:11:56PM -0600, Barrett, Brian W wrote:
> > On 4/29/09 15:55 , "Jason Gunthorpe" <jgunthorpe at obsidianresearch.com>
> > wrote:
> >
> > >> The problem is that MPI needs to be aware of the application doing
> > >> the free() and unregister or flush its MR cache for that virtual
> > >> address range. Of course it would be difficult for OpenMPI to have
> > >> callbacks or hooks into every way memory could be allocated/freed
> > >> that an application might use.
> > >
> > > There are only three calls that affect the way VM memory maps to
> > > physical and thus would invalidate the mr cache: mmap, munmap and brk.
> >
> > There's also System V shared memory, which at least one scientific code out
> > there uses.
>
> People use that stuff? Yuk, toxic. :)
>
> > Yeah, although that could get really nasty as there's generally not one call
> > to ibv_reg_mr per call to mmap. It's usually a couple of calls to
> > ibv_reg_mr for different segments of the same mmap buffer (think sending
> > faces of a 3-d block of space to the nearest neighbors in a physics
> > simulation).
>
> Plus you have to be careful if VA randomization creates holes, i.e. you
> might have an MR registration covering 1GB that got munmapped, but after
> a while you have a dozen fragmented mmaps in that same space.
>
> A third alternative would be to make mmap not return VAs that are still
> registered with IB. Then on munmap you are assured to never get that
> address back until you call ibv_dereg_mr. From time to time MPI can
> inspect /proc/self/maps and remove cached registrations that have no VM
> address.
>
> Jason
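The periodic /proc/self/maps sweep suggested above could look roughly
like the sketch below. It is only an illustration: the cache layout (a
dict keyed by (start, end) address pairs) and the function names are
hypothetical, and a real MR cache would call ibv_dereg_mr on each stale
entry instead of just dropping it. A registration is treated as stale
only when its range overlaps no current mapping at all.

```python
def parse_maps(text):
    """Return (start, end) address pairs from /proc/<pid>/maps text."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field looks like "7f0000000000-7f0000100000" (hex).
        start, _, end = fields[0].partition("-")
        ranges.append((int(start, 16), int(end, 16)))
    return ranges


def is_mapped(start, end, ranges):
    """True if [start, end) overlaps at least one mapped range."""
    return any(lo < end and start < hi for lo, hi in ranges)


def prune_cache(cache, maps_text):
    """Drop cache entries whose address range is no longer mapped.

    `cache` is a hypothetical dict: (start, end) -> registration handle.
    Returns the list of keys that were removed.
    """
    ranges = parse_maps(maps_text)
    stale = [key for key in cache if not is_mapped(key[0], key[1], ranges)]
    for key in stale:
        # A real cache would deregister the MR here before forgetting it.
        cache.pop(key)
    return stale
```

On Linux this would be driven by something like
prune_cache(cache, open("/proc/self/maps").read()), run from time to
time rather than on every free().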
Besides, mmap() only allocates a virtual address range in the user's
address space. It doesn't fault all of the pages into physical memory;
that happens when the application first reads or writes memory in
the VA range of the mmap. IB memory registrations need physical
addresses, so registering at mmap or brk time would mean faulting in
and pinning every page up front, which would be impractical to do for
every mmap or brk.
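The demand-paging behavior is easy to observe from user space. The
sketch below is an illustration only, assuming a Unix-like system with
Python's mmap and resource modules. The function name and the default
size are made up for the example. It counts the process's minor page
faults around creating an anonymous mapping and again around the first
write to each page. The counts are noisy (the interpreter faults on its
own memory too) and depend on kernel settings such as huge pages, so
treat them as indicative only.

```python
import mmap
import resource


def minor_faults_for_mapping(size=16 * 1024 * 1024):
    """Return (faults during mmap, faults during first touch of each page)."""

    def minflt():
        # Minor page faults taken by this process so far.
        return resource.getrusage(resource.RUSAGE_SELF).ru_minflt

    before = minflt()
    buf = mmap.mmap(-1, size)  # anonymous mapping: reserves address space
    after_map = minflt()

    # Write one byte per page; the first access to a page is what makes
    # the kernel back it with physical memory.
    for offset in range(0, size, mmap.PAGESIZE):
        buf[offset] = 1
    after_touch = minflt()

    buf.close()
    return after_map - before, after_touch - after_map
```

On a typical Linux system the first number is expected to stay near
zero while the second grows with the number of pages touched, which is
the cost a register-on-every-mmap scheme would pay up front.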