[openib-general] Question about pinning memory

Jeff Squyres jsquyres at open-mpi.org
Sun Jul 24 05:20:42 PDT 2005


On Jul 24, 2005, at 3:39 AM, Gleb Natapov wrote:

>> 1. What happens if pinned memory is returned to the OS via sbrk()?  Do
>> Bad Things happen, or does the kernel portion of Open IB handle this
>> gracefully?
>>
> Kernel portion of OpenIB pins physical pages. If you unmap virtual
> address range from process address space using sbrk or munmap they will
> stay pinned until deregistered.

Interesting.  So Open IB doesn't get a notification upon 
unmapping/sbrk'ing?
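Just to check my understanding of the semantics you're describing, here's a toy model (names like `PinTable` and `register` are invented for illustration -- this is obviously not the real OpenIB kernel code): registration pins the physical pages and records them under a handle; unmapping the virtual range doesn't touch the pin table; deregistration unpins via handle lookup, never by scanning the process VA space.

```python
# Toy model of the pinning semantics described above.  All names are
# invented for illustration; this is not the OpenIB kernel code.

import itertools

class PinTable:
    """Kernel-side table: registration handle -> pinned physical pages."""
    def __init__(self):
        self._next = itertools.count(1)
        self.pinned = {}                      # handle -> frozenset of pages

    def register(self, phys_pages):
        handle = next(self._next)
        self.pinned[handle] = frozenset(phys_pages)   # pin the pages
        return handle

    def deregister(self, handle):
        # Pages are found via the handle -- no scan of the process VA space.
        self.pinned.pop(handle)                       # unpin

class Process:
    def __init__(self):
        self.vm = {}                          # virtual page -> physical page

    def munmap(self, vpages):
        for v in vpages:
            del self.vm[v]                    # VA mapping gone; pins unaffected

kernel = PinTable()
proc = Process()
proc.vm = {0: 100, 1: 101}                    # two mapped pages
h = kernel.register(proc.vm.values())
proc.munmap([0, 1])                           # sbrk()/munmap(): range released
still_pinned = kernel.pinned[h]               # pages 100, 101 remain pinned
kernel.deregister(h)                          # only now are they unpinned
```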

>> 2. What happens if the MPI unpins memory that is already out of the
>> process?  Do Bad Things happen, or does Open IB handle this 
>> gracefully?
>>
> OpenIB should handle this case. It looks up pinned physical pages using
> memory handle returned on register and unpins them. It doesn't scan user
> virtual space during dereg.

Also interesting.  So it's "ok" to use this strategy (maintain MRU/LRU 
kinds of tables and unpin upon demand), even though you may be 
unpinning memory that no longer belongs to your process.  That seems 
pretty weird -- it feels like breaking the POSIX process abstraction 
barrier.
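For context, the strategy I'm talking about is roughly this (a sketch, not our actual code; `pin()`/`unpin()` stand in for the real registration calls, and the tiny pin limit is artificial):

```python
# Sketch of an MPI-style registration cache: keep recently used
# registrations pinned, and evict/unpin the least-recently-used one
# when pinning resources run out.  pin()/unpin() are stand-ins for the
# real registration calls; MAX_PINNED is artificially small.

from collections import OrderedDict

MAX_PINNED = 2                    # pretend pinning resources are scarce
pinned_now = set()

def pin(buf):    pinned_now.add(buf)
def unpin(buf):  pinned_now.discard(buf)

cache = OrderedDict()             # buffer -> handle, kept in LRU order

def get_registration(buf):
    if buf in cache:
        cache.move_to_end(buf)    # cache hit: mark most-recently-used
        return cache[buf]
    if len(cache) >= MAX_PINNED:
        victim, _ = cache.popitem(last=False)   # evict the LRU entry...
        unpin(victim)             # ...even if the app already freed it!
    pin(buf)
    cache[buf] = f"mr-{buf}"
    return cache[buf]

for b in ("A", "B", "A", "C"):    # registering C evicts B
    get_registration(b)
```

The troublesome `unpin(victim)` call is exactly the case above: the MPI may unpin a buffer long after the application has returned it to the OS.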

For example, it could be inefficient if there are multiple processes 
using IB on a single node -- pinned memory can consume pinning 
resources even though it's no longer in your process.  If too much 
memory falls into this category, this can even cause general virtual 
memory performance degradation (i.e., nothing to do with IB 
communication at all).

Is this the intended behavior?

I actually don't know how IB is used in non-HPC environments; perhaps 
MPI is somewhat unique in that it doesn't have visibility of the 
alloc/free behavior of its user applications (well, actually, MPI does 
have MPI_ALLOC_MEM and MPI_FREE_MEM, but most apps unfortunately don't 
use these functions :-( ).  I can see the argument that [non-HPC] IB 
apps that don't unpin before removing memory from the process could be 
considered erroneous.  But that doesn't help my MPI problem.  :-)

A derivative question: what happens in this scenario:

1. process A pins memory Z
2. process A sbrk()'s Z without unpinning it first
3. process B obtains memory Z
4. process B pins memory Z
5. process A unpins memory Z

There are at least two points in that scenario where Badness can occur 
(steps 4 and 5).  Is there cross-process awareness in the kernel tables 
that handles these situations?  (or are the tables down in the kernel 
process-independent, such that a simple mechanism like reference 
counting is sufficient?)
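If the kernel tables are keyed by physical page with a per-page pin count, the scenario above works out: step 4 just bumps the count, and step 5 only drops A's reference.  A toy model of what I mean (again, invented names -- I don't know whether the kernel actually does this):

```python
# Toy refcount model of cross-process pinning: the kernel counts pins
# per physical page, so one process unpinning a page it no longer owns
# cannot yank it out from under another process.  This is a guess at
# the semantics, not the actual OpenIB implementation.

from collections import Counter

page_refs = Counter()             # physical page -> pin count

def pin(page):
    page_refs[page] += 1

def unpin(page):
    page_refs[page] -= 1
    if page_refs[page] == 0:
        del page_refs[page]       # truly unpinned only at refcount zero

Z = 0x1000                        # physical page behind "memory Z"

pin(Z)            # 1. process A pins Z
                  # 2. A sbrk()'s Z away without unpinning (count still 1)
                  # 3. B obtains the memory backed by Z
pin(Z)            # 4. B pins Z -> count 2, no Badness
unpin(Z)          # 5. A unpins Z -> count 1; B's pin survives
```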

-----

Is there any thought within the IB community to getting rid of this 
whole memory registration issue at the user level?  Perhaps a la what 
Quadrics did (make the driver understand virtual memory) or what Va 
Tech did (put in their own kernel-level hooks to make malloc/free do 
the Right Thing)?  Handling all this memory registration stuff is 
certainly a big chunk of code that every MPI (and IB app) has to 
carry.  Could this stuff be moved down into the realm of the IB 
stack itself?  From an abstraction / software architecture standpoint, 
it could make sense to unify this handling in one place rather than N 
(MPI implementations and other IB apps).

Bottom line: from an MPI implementor standpoint, it would be great if 
we didn't have to worry about this stuff at all -- that the IB stack 
just transparently (and efficiently) handled all of it.  :-)

(I realize that this is probably asking for quite a lot -- but hey, no 
discussion will ever happen unless someone asks, right?)

-- 
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/



