[ofa-general] Re: New proposal for memory management

Barrett, Brian W bwbarre at sandia.gov
Wed Apr 29 15:11:56 PDT 2009


On 4/29/09 15:55, "Jason Gunthorpe" <jgunthorpe at obsidianresearch.com>
wrote:

>> The problem is that MPI needs to be aware of the application doing
>> the free() and unregister or flush its MR cache for that virtual
>> address range. Of course it would be difficult for OpenMPI to have
>> callbacks or hooks into every way memory could be allocated/freed
>> that an application might use.
> 
> There are only three calls that affect the way VM memory maps to
> physical and thus would invalidate the mr cache: mmap, munmap and brk.

There's also System V shared memory, which at least one scientific code out
there uses.

> Specifically what must be happening is the app registers memory, calls
> munmap on it, then gets the same VA back from mmap and the kernel
> level mr is still pointing to the original mmap:
> 
>  foo = mmap(...);
>  mr = ibv_reg_mr(pd, foo, len, access);
>  munmap(foo, len);
>  mmap(...) == foo; // By chance due to VA randomization
>  // Oops, mr no longer matches /proc/self/maps
> 
> Actually, maybe that is the simple answer here - have the kernel fixup
> the mr before returning from the 2nd mmap. Then the cache in user
> space is still correct to assume that VA XX is registered and working.
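To make the failure mode concrete, here is a minimal sketch in plain C that simulates it without touching libibverbs. Everything named toy_* is hypothetical and exists only for illustration: toy_mmap stands in for the kernel handing out a mapping (with a fresh id each time), and toy_reg_mr stands in for a registration that remembers which mapping it pinned. The point is only that a cache keyed on the virtual address keeps reporting a hit after the address has been remapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's view: every new mapping gets a new id,
 * even if it lands on a virtual address that was used before. */
struct toy_mapping { uintptr_t va; size_t len; unsigned id; };

/* Toy stand-in for a cached registration: it remembers the VA range and
 * the id of the mapping that was pinned at registration time. */
struct toy_mr { uintptr_t va; size_t len; unsigned pinned_id; bool valid; };

static unsigned toy_next_id = 1;

/* Simulated mmap: the caller picks the VA so the reuse case is repeatable. */
static struct toy_mapping toy_mmap(uintptr_t va, size_t len)
{
    struct toy_mapping m = { va, len, toy_next_id++ };
    return m;
}

/* Simulated registration of a whole mapping. */
static struct toy_mr toy_reg_mr(const struct toy_mapping *m)
{
    assert(m != NULL);
    struct toy_mr mr = { m->va, m->len, m->id, true };
    return mr;
}

/* User-space cache lookup: it can only compare virtual addresses. */
static bool toy_cache_hit(const struct toy_mr *mr, uintptr_t va, size_t len)
{
    return mr->valid && va >= mr->va && va + len <= mr->va + mr->len;
}

/* What the cache cannot see: the registration pins a different mapping. */
static bool toy_mr_is_stale(const struct toy_mr *mr,
                            const struct toy_mapping *current)
{
    return mr->pinned_id != current->id;
}
```

If a mapping is created at some address, registered, dropped, and a second mapping is then created at the same address, toy_cache_hit still returns true for the new buffer while toy_mr_is_stale also returns true. That combination is the mismatch described above.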

Yeah, although that could get really nasty as there's generally not one call
to ibv_reg_mr per call to mmap.  It's usually a couple of calls to
ibv_reg_mr for different segments of the same mmap buffer (think sending
faces of a 3-d block of space to the nearest neighbors in a physics
simulation).
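As a rough sketch of why this matters for whoever does the fixup or the invalidation: with several registrations carved out of one mmap buffer, a single munmap has to find every cached entry that overlaps the range, not just one keyed on the base address. The toy_* names below are hypothetical, and the fixed-size array is only a stand-in for whatever structure a real registration cache would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SLOTS 8

/* One cached registration: a VA range, no real MR handle behind it. */
struct toy_entry { uintptr_t start; size_t len; bool valid; };
struct toy_cache { struct toy_entry slot[TOY_CACHE_SLOTS]; };

/* Record a registered range; returns false if the toy cache is full. */
static bool toy_cache_add(struct toy_cache *c, uintptr_t start, size_t len)
{
    assert(len > 0);
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (!c->slot[i].valid) {
            c->slot[i] = (struct toy_entry){ start, len, true };
            return true;
        }
    }
    return false;
}

/* Drop every entry that overlaps [start, start + len), as a munmap of
 * that range would require. Returns how many entries were dropped. */
static size_t toy_cache_invalidate(struct toy_cache *c,
                                   uintptr_t start, size_t len)
{
    size_t dropped = 0;
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        struct toy_entry *e = &c->slot[i];
        if (e->valid && e->start < start + len && start < e->start + e->len) {
            e->valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* Number of entries still considered valid. */
static size_t toy_cache_count(const struct toy_cache *c)
{
    size_t n = 0;
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++)
        n += c->slot[i].valid ? 1 : 0;
    return n;
}
```

For example, with two segments of one 16 KiB buffer and one segment of an unrelated buffer in the cache, invalidating the 16 KiB range drops the first two entries and leaves the third alone.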

> Removing entries from the registration cache would have to be done in
> some other way (age?).

Brian

--
   Brian W. Barrett
   Dept. 1423: Scalable System Software
   Sandia National Laboratories



