[openib-general] Question about pinning memory
Jeff Squyres
jsquyres at open-mpi.org
Fri Jul 22 16:04:10 PDT 2005
Greetings.
I have a question about pinning/registering memory. A little
background first...
This is a common strategy for MPI implementations: when MPI gets a
user-choice buffer to send or receive, it checks its internal tables
to see whether the buffer is already pinned. If it isn't, the memory is
pinned, entered in the lookup tables, and then we proceed. The next
time through, the memory will already be pinned, and we can avoid the
expensive pinning operation. If we ever fail to pin memory, we can try
unpinning some old/unused memory (using some MRU / LRU kinds of lookups
on the tables to find unused memory) and then try pinning again.
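
To make the strategy above concrete, here is a minimal sketch in C. It is
only an illustration under stated assumptions: fake_pin() and fake_unpin()
are hypothetical stand-ins for the real registration calls, the limit of
two pinned regions is artificial, and the table layout and the name
ensure_pinned() are invented for this example rather than taken from any
MPI implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 4   /* table capacity (hypothetical) */
#define MAX_PINNED  2   /* pretend only 2 regions can be pinned at once */

typedef struct {
    void    *addr;      /* start of pinned buffer; NULL marks a free slot */
    size_t   len;       /* pinned length in bytes */
    uint64_t last_use;  /* logical time of the most recent use */
} pin_entry;

static pin_entry table[CACHE_SLOTS];
static uint64_t  now;
int pinned_now;         /* regions currently pinned */
int pin_calls;          /* times the expensive pin path succeeded */

/* Hypothetical stand-in for the real pin call: fails once the limit is hit. */
static int fake_pin(void *addr, size_t len) {
    (void)addr; (void)len;
    if (pinned_now >= MAX_PINNED) return -1;
    pinned_now++;
    pin_calls++;
    return 0;
}

/* Hypothetical stand-in for the real unpin call. */
static void fake_unpin(void *addr, size_t len) {
    (void)addr; (void)len;
    pinned_now--;
}

/* Unpin the least recently used entry. Returns -1 if the table is empty. */
static int evict_lru(void) {
    pin_entry *victim = NULL;
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (table[i].addr != NULL &&
            (victim == NULL || table[i].last_use < victim->last_use))
            victim = &table[i];
    if (victim == NULL) return -1;
    fake_unpin(victim->addr, victim->len);
    victim->addr = NULL;
    return 0;
}

/* Make sure [addr, addr+len) is pinned. Returns 0 on success, -1 on failure. */
int ensure_pinned(void *addr, size_t len) {
    if (addr == NULL) return -1;
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (table[i].addr == addr && table[i].len >= len) {
            table[i].last_use = ++now;      /* hit: skip the pin call */
            return 0;
        }
    while (fake_pin(addr, len) != 0)        /* miss: pin, evicting on failure */
        if (evict_lru() != 0) return -1;
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (table[i].addr == NULL) {
            table[i] = (pin_entry){ addr, len, ++now };
            return 0;
        }
    assert(0 && "no free slot: CACHE_SLOTS must be at least MAX_PINNED");
    return -1;
}
```

A send or receive path would call ensure_pinned(buf, len) before handing
the buffer to the network layer. Note that the table is keyed purely on the
virtual address, which is what the questions below are about.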
Related to the scenario above, here are my questions:
1. What happens if pinned memory is returned to the OS via sbrk()? Do
Bad Things happen, or does the kernel portion of Open IB handle this
gracefully?
(if anyone knows what mVAPI does here, that would also be really useful
to know)
2. What happens if the MPI unpins memory that has already been removed
from the process? Do Bad Things happen, or does Open IB handle this
gracefully?
(if anyone knows what mVAPI does here, that would also be really useful
to know)
3. Even the scenario above is not enough for the MPI to properly handle
the memory pinning issues. Since MPI usually has no visibility into
memory that has been freed (or returned via sbrk()), the following
scenario is possible (and in our experience, likely):
- user calls malloc() and gets virtual address A back
- user calls MPI_Send with A
- MPI pins address A
- MPI saves A in internal lookup tables
- MPI_Send returns
- user calls free(A)
- user calls malloc() and gets virtual address A back, but now A points
to different physical memory than before
- user calls MPI_Send with A
- MPI thinks that A is pinned (because A is in MPI's internal tables),
and therefore doesn't pin it
- MPI tries to send A with the IB API, and Bad Things occur (either an
error, or worse, bad data is sent)
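
The first half of this scenario can be sketched in C as follows. This is an
illustration only: the table and the names table_insert(), table_contains()
and stale_entry_after_free() are hypothetical, and no real pinning is done.
Whether a later malloc() actually returns the same address depends on the
allocator, so the sketch only shows that the table still claims the freed
address is pinned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS 8

/* Addresses MPI believes are pinned; 0 marks an empty slot. */
static uintptr_t pinned_addrs[SLOTS];

/* Record an address as pinned. Returns 0 on success, -1 if the table is full. */
int table_insert(uintptr_t addr) {
    for (int i = 0; i < SLOTS; i++)
        if (pinned_addrs[i] == 0) {
            pinned_addrs[i] = addr;
            return 0;
        }
    return -1;
}

/* Returns 1 if the table claims addr is pinned, 0 otherwise. */
int table_contains(uintptr_t addr) {
    if (addr == 0) return 0;
    for (int i = 0; i < SLOTS; i++)
        if (pinned_addrs[i] == addr) return 1;
    return 0;
}

/* Walk through the scenario. Returns 1 if the table still lists the
 * address as pinned after the user has freed it. */
int stale_entry_after_free(void) {
    void *a = malloc(4096);          /* user calls malloc() and gets A */
    assert(a != NULL);
    uintptr_t key = (uintptr_t)a;    /* keep the numeric address before freeing */
    int rc = table_insert(key);      /* MPI "pins" A and records it */
    assert(rc == 0);
    (void)rc;
    free(a);                         /* user calls free(A); the table is not told */
    /* If a later malloc() hands back the same address, this lookup is a
     * false hit: the entry describes the old mapping, not the new one. */
    return table_contains(key);
}
```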
What would be *really* great (from MPI's perspective) is if Open IB
could detect when pinned memory is removed from the process (probably
via sbrk()) and notify the MPI implementation through a callback when
this happens.
Is this possible?
Otherwise, the scenario in question #3 is a real problem. There are a
few possibilities for fixing it, but all are problematic (overriding
sbrk() by including ptmalloc2 in the distribution, using LD_PRELOAD to
override sbrk(), etc.). Any other suggestions would be welcome.
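
For illustration, the general shape of such an interception might look like
the sketch below. It is deliberately simplified and rests on assumptions:
it wraps free() explicitly instead of overriding sbrk(), and the names
hooked_free(), set_release_callback(), pin_table and unpin_on_release() are
all hypothetical. It is not how ptmalloc2 or an LD_PRELOAD shim works
internally; it only shows the "notify before the memory goes away" idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Callback invoked with the pointer that is about to be released. */
typedef void (*release_cb)(void *ptr, void *ctx);

static release_cb on_release;
static void      *on_release_ctx;

/* Register the function to call before memory is released. */
void set_release_callback(release_cb cb, void *ctx) {
    on_release = cb;
    on_release_ctx = ctx;
}

/* Hypothetical replacement for free(): notify first, then release. */
void hooked_free(void *ptr) {
    if (ptr != NULL && on_release != NULL)
        on_release(ptr, on_release_ctx);
    free(ptr);
}

#define SLOTS 8

/* Tiny table of addresses that are currently considered pinned. */
typedef struct {
    void *addr[SLOTS];   /* NULL marks an empty slot */
    int   unpin_count;   /* entries dropped by the callback so far */
} pin_table;

/* Example callback: drop ptr from the table so it cannot go stale.
 * A real implementation would also unpin the memory here. */
void unpin_on_release(void *ptr, void *ctx) {
    pin_table *t = ctx;
    assert(t != NULL);
    for (int i = 0; i < SLOTS; i++)
        if (t->addr[i] == ptr) {
            t->addr[i] = NULL;
            t->unpin_count++;
        }
}
```

The drawback is the one mentioned above: this only helps if every release
of memory in the process goes through the hook, which is exactly what makes
these approaches problematic.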
Thanks for your time.
--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/