[ofa-general] Re: New proposal for memory management
Barrett, Brian W
bwbarre at sandia.gov
Wed Apr 29 15:11:56 PDT 2009
On 4/29/09 15:55, "Jason Gunthorpe" <jgunthorpe at obsidianresearch.com>
wrote:
>> The problem is that MPI needs to be aware of the application doing
>> the free() and unregister or flush its MR cache for that virtual
>> address range. Of course it would be difficult for OpenMPI to have
>> callbacks or hooks into every way memory could be allocated/freed
>> that an application might use.
>
> There are only three calls that affect the way VM memory maps to
> physical and thus would invalidate the mr cache: mmap, munmap and brk.
There's also System V shared memory (shmat/shmdt), which at least one
scientific code out there uses.
> Specifically what must be happening is the app registers memory, calls
> munmap on it, then gets the same VA back from mmap and the kernel
> level mr is still pointing to the original mmap:
>
> foo = mmap(...);
> mr = ibv_reg_mr(pd, foo, len, access);
> munmap(foo, len);
> mmap(...) == foo; // By chance due to VA randomization
> // Oops, mr no longer matches /proc/self/maps
>
> Actually, maybe that is the simple answer here - have the kernel fixup
> the mr before returning from the 2nd mmap. Then the cache in user
> space is still correct to assume that VA XX is registered and working.
Yeah, although that could get really nasty, as there's generally not one call
to ibv_reg_mr per call to mmap. It's usually several calls to ibv_reg_mr for
different segments of the same mmap buffer (think of sending the faces of a
3-D block of space to the nearest neighbors in a physics simulation).
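
To make the many-registrations-per-mapping point concrete, here is a rough
sketch of what a user-space cache would have to do when a whole mapping goes
away: drop every cached entry that overlaps the unmapped range, not just one.
The names (struct mr_entry, cache_invalidate_range, demo) and the addresses
are made up for illustration and are not Open MPI's actual cache. A real
implementation would also call ibv_dereg_mr() on each dropped entry, which
this sketch leaves out so that it builds without libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space cache entry: one per ibv_reg_mr() call. */
struct mr_entry {
    uintptr_t start;   /* first byte of the registered segment */
    size_t    len;     /* length of the segment in bytes */
    int       valid;   /* 0 once the entry has been dropped */
};

/*
 * Drop every entry that overlaps [addr, addr + len). A real cache would
 * also call ibv_dereg_mr() here; this sketch only marks entries invalid.
 * Returns the number of entries dropped.
 */
static int cache_invalidate_range(struct mr_entry *cache, size_t n,
                                  uintptr_t addr, size_t len)
{
    int dropped = 0;

    assert(cache != NULL);
    for (size_t i = 0; i < n; i++) {
        struct mr_entry *e = &cache[i];
        /* Two half-open ranges overlap if each starts before the other ends. */
        if (e->valid && e->start < addr + len && addr < e->start + e->len) {
            e->valid = 0;
            dropped++;
        }
    }
    return dropped;
}

/* Three "faces" registered inside one 1 MiB buffer, plus one unrelated entry. */
static int demo(void)
{
    const uintptr_t buf = 0x100000;   /* made-up address of the mmap'd buffer */
    const size_t buf_len = (size_t)1 << 20;
    struct mr_entry cache[] = {
        { buf,           4096, 1 },
        { buf + 0x40000, 4096, 1 },
        { buf + 0xff000, 4096, 1 },
        { 0x900000,      4096, 1 },   /* different mapping, must survive */
    };

    /* What a munmap(buf, buf_len) hook would trigger. */
    return cache_invalidate_range(cache, 4, buf, buf_len);
}
```

Calling demo() drops the three entries that live inside the buffer and leaves
the unrelated one alone. A kernel-side fixup would have to do the equivalent
bookkeeping for every MR that touches the remapped range.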
> Removing entries from the registration cache would have to be done in
> some other way (age?).
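
One possible reading of "age" here is least-recently-used eviction: each
cache hit refreshes an entry's age, and the oldest entry is evicted when the
cache is full. The sketch below assumes that reading. The struct and function
names, the fixed slot count, and the logical clock are all hypothetical, and
the ibv_dereg_mr() call that a real cache would make on the evicted entry is
only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 4

/* Hypothetical cache slot; "last_used" is a logical clock, not wall time. */
struct mr_slot {
    uintptr_t addr;
    size_t    len;
    uint64_t  last_used;
    int       in_use;
};

struct mr_cache {
    struct mr_slot slot[CACHE_SLOTS];
    uint64_t       clock;
};

/* Look up an exact (addr, len) match and refresh its age on a hit. */
static struct mr_slot *cache_lookup(struct mr_cache *c, uintptr_t addr,
                                    size_t len)
{
    assert(c != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct mr_slot *s = &c->slot[i];
        if (s->in_use && s->addr == addr && s->len == len) {
            s->last_used = ++c->clock;
            return s;
        }
    }
    return NULL;
}

/*
 * Insert a new registration, evicting the least recently used slot when the
 * cache is full. A real cache would call ibv_dereg_mr() on the victim.
 * Returns the evicted address, or 0 if a free slot was used.
 */
static uintptr_t cache_insert(struct mr_cache *c, uintptr_t addr, size_t len)
{
    assert(c != NULL);
    struct mr_slot *victim = &c->slot[0];
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct mr_slot *s = &c->slot[i];
        if (!s->in_use) {            /* a free slot always wins */
            victim = s;
            break;
        }
        if (s->last_used < victim->last_used)
            victim = s;              /* oldest entry seen so far */
    }

    uintptr_t evicted = victim->in_use ? victim->addr : 0;
    victim->addr = addr;
    victim->len = len;
    victim->last_used = ++c->clock;
    victim->in_use = 1;
    return evicted;
}
```

Note that aging only bounds the size of the cache. It does nothing to detect
an entry whose VA has been remapped, so it relies on the kernel-side fixup
described above to keep the surviving entries correct.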
Brian
--
Brian W. Barrett
Dept. 1423: Scalable System Software
Sandia National Laboratories