[openib-general] Question about pinning memory

Jeff Squyres jsquyres at open-mpi.org
Fri Jul 22 16:04:10 PDT 2005


Greetings.

I have a question about pinning/registering memory.  A little 
background first...

This is a common strategy for MPI implementations: when MPI gets a 
user-choice buffer to send or receive, it checks its internal tables 
to see whether the buffer is already pinned.  If it isn't, the memory is 
pinned, entered in the lookup tables, and then we proceed.  The next 
time through, the memory will already be pinned, and we can avoid the 
expensive pinning operation.  If we ever fail to pin memory, we can try 
unpinning some old/unused memory (using some MRU / LRU kinds of lookups 
on the tables to find unused memory) and then try pinning again.
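
As a rough sketch of the strategy described above (a toy Python model, 
not code from any MPI implementation): the names PinCache, FakePinner 
and PinError are made up for illustration, and FakePinner only 
simulates a pinning limit rather than calling any real registration API.

```python
from collections import OrderedDict


class PinError(Exception):
    """Raised by the simulated pin call when nothing more can be pinned."""


class FakePinner:
    """Hypothetical stand-in for the real (expensive) pin/unpin calls.

    It only tracks a byte budget so that pin failures can be provoked.
    """

    def __init__(self, limit):
        self.limit = limit
        self.pinned = 0

    def pin(self, addr, length):
        if self.pinned + length > self.limit:
            raise PinError("out of pinnable memory")
        self.pinned += length

    def unpin(self, addr, length):
        self.pinned -= length


class PinCache:
    """Lookup table of pinned buffers, least recently used first."""

    def __init__(self, pinner):
        self.pinner = pinner
        self.table = OrderedDict()  # (addr, length) -> True
        self.pin_calls = 0          # counts successful pin operations

    def ensure_pinned(self, addr, length):
        key = (addr, length)
        if key in self.table:
            # Already pinned: mark as recently used, skip the expensive call.
            self.table.move_to_end(key)
            return
        while True:
            try:
                self.pinner.pin(addr, length)
                self.pin_calls += 1
                break
            except PinError:
                if not self.table:
                    raise  # nothing left to unpin, give up
                # Unpin the least recently used entry, then retry.
                (old_addr, old_len), _ = self.table.popitem(last=False)
                self.pinner.unpin(old_addr, old_len)
        self.table[key] = True
```

With a FakePinner limited to 8192 bytes, pinning three distinct 4096-byte 
buffers forces the oldest entry out of the table, while re-sending an 
already pinned buffer costs no extra pin call.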

Related to the scenario above, here are my questions:

1. What happens if pinned memory is returned to the OS via sbrk()?  Do 
Bad Things happen, or does the kernel portion of Open IB handle this 
gracefully?

(if anyone knows what mVAPI does here, that would also be really useful 
to know)

2. What happens if the MPI unpins memory that has already been removed 
from the process?  Do Bad Things happen, or does Open IB handle this 
gracefully?

(if anyone knows what mVAPI does here, that would also be really useful 
to know)

3. Even the scenario above is not enough for the MPI to properly handle 
the memory pinning issues.  Since MPI usually has no visibility into 
memory that has been freed (or returned via sbrk()), the following 
scenario is possible (and in our experience, likely):

- user calls malloc() and gets virtual address A back
- user calls MPI_Send with A
- MPI pins address A
- MPI saves A in internal lookup tables
- MPI_Send returns
- user calls free(A)
- user calls malloc() and gets virtual address A back, but now A points 
to different physical memory than before
- user calls MPI_Send with A
- MPI thinks that A is pinned (because A is in MPI's internal tables), 
and therefore doesn't pin it
- MPI tries to send A with the IB API, and Bad Things occur (either an 
error, or worse, bad data is sent)
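
The sequence above can be reproduced in a toy Python model.  This is 
only an illustration: ToyProcess and NaiveMpi are hypothetical names, 
"physical pages" are just integers, and the model assumes (as the 
scenario does) that the second malloc() is backed by different physical 
memory than the first.

```python
import itertools


class ToyProcess:
    """Toy address space: virtual address -> physical page id."""

    def __init__(self):
        self._next_page = itertools.count(1)
        self.mapping = {}

    def malloc(self, vaddr):
        # Model assumption: every allocation gets fresh physical backing.
        self.mapping[vaddr] = next(self._next_page)
        return vaddr

    def free(self, vaddr):
        del self.mapping[vaddr]


class NaiveMpi:
    """Caches pinned buffers by virtual address only."""

    def __init__(self, proc):
        self.proc = proc
        self.pinned = {}  # vaddr -> physical page seen at pin time

    def send(self, vaddr):
        if vaddr not in self.pinned:
            # First use: "pin" and remember what the address pointed to.
            self.pinned[vaddr] = self.proc.mapping[vaddr]
        # True if the cached pin still matches the current backing page.
        return self.pinned[vaddr] == self.proc.mapping[vaddr]


proc = ToyProcess()
mpi = NaiveMpi(proc)
A = 0x1000

proc.malloc(A)
first_ok = mpi.send(A)    # pinned now, entry is valid
proc.free(A)
proc.malloc(A)            # same virtual address, new physical page
second_ok = mpi.send(A)   # cache hit on a stale entry
```

Here first_ok is True and second_ok is False: the table lookup succeeds 
on the second send even though the pinned page is no longer the one 
behind A.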

What would be *really* great (from MPI's perspective) is if Open IB can 
know when pinned memory is removed from the process (probably via 
sbrk()) and give the MPI implementation a callback somehow when this 
happens.

Is this possible?
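
To make the request concrete, here is a sketch of how an MPI might use 
such a callback if it existed.  Everything here is hypothetical: 
Notifier stands in for whatever layer would observe memory leaving the 
process, and no such interface is claimed to exist in Open IB or mVAPI.

```python
class Notifier:
    """Hypothetical layer that can tell when memory leaves the process."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def memory_released(self, addr, length):
        # Invoked (in this model) when [addr, addr + length) is given back.
        for callback in self._callbacks:
            callback(addr, length)


class PinTable:
    """MPI-side table of pinned regions, invalidated via the callback."""

    def __init__(self, notifier):
        self.entries = {}  # start address -> length
        notifier.register(self.invalidate)

    def pin(self, addr, length):
        self.entries[addr] = length

    def is_pinned(self, addr):
        return addr in self.entries

    def invalidate(self, addr, length):
        # Drop every cached region that overlaps the released range.
        end = addr + length
        stale = [start for start, size in self.entries.items()
                 if start < end and addr < start + size]
        for start in stale:
            del self.entries[start]


notifier = Notifier()
table = PinTable(notifier)
table.pin(0x1000, 4096)
table.pin(0x3000, 4096)
notifier.memory_released(0x1800, 256)  # falls inside the first region
```

After the release notification, the region at 0x1000 is gone from the 
table (so the next send would pin it again), while the untouched region 
at 0x3000 stays cached.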

Otherwise, the scenario in question #3 is a real problem.  There are a 
few possibilities for fixing it, but all are problematic (override 
sbrk() via including ptmalloc2 in the distribution, using LD_PRELOAD to 
override sbrk(), etc.).  Any other suggestions would be welcome...

Thanks for your time.

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/
