[ofa-general] New proposal for memory management

Jeff Squyres jsquyres at cisco.com
Fri May 1 06:26:18 PDT 2009


On Apr 30, 2009, at 7:20 PM, Aaron Fabbri (aafabbri) wrote:

>> Yes, MPI_ALLOC_MEM / MPI_FREE_MEM calls have been around for
>> a long time (~10 years?).  Using them does avoid many of the
>> problems that have been discussed.  Most (all?) MPIs either
>> support ALLOC_MEM / FREE_MEM by registering at allocation
>> time and unregistering at free time, or some variation of that.
>
> Ah.  Are there any problems that are not addressed by having MPI own  
> allocation of network bufs?

Sure, there are lots of them.  :-)  But this thread is just about the
memory allocation management issues.
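
For concreteness, the ALLOC_MEM path looks roughly like this in C (a
minimal sketch; whether the library pins the memory eagerly at
allocation time or defers it is an implementation choice):

/* Minimal sketch of MPI_ALLOC_MEM / MPI_FREE_MEM usage. */
#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    void    *buf;
    MPI_Aint len = 1 << 20;            /* 1 MiB */

    MPI_Init(&argc, &argv);

    /* The MPI may register (pin) this memory with the NIC here... */
    MPI_Alloc_mem(len, MPI_INFO_NULL, &buf);

    memset(buf, 0, (size_t)len);
    /* ... use buf with MPI_Send / MPI_Recv on the fast path ... */

    /* ... and unregister it here. */
    MPI_Free_mem(buf);

    MPI_Finalize();
    return 0;
}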

> (BTW registering for each allocation could be improved, I think.)

Probably so.  Since so few MPI applications use these calls, OMPI  
hasn't really bothered to tune them.

>> But unfortunately, very few MPI apps use these calls; they use
>> malloc() and friends instead.  Or they're written in Fortran,
>> where such concepts are not easily mapped (don't
>> underestimate how much Fortran MPI code runs on verbs!).
>> Indeed, in some layered scenarios, it's not easy to use these
>> calls (e.g., if an MPI-enabled computational library may
>> re-use user-provided buffers because they're so large, etc.).
>
> I understand the difficulty.  A couple possible counterpoints:
>
> 1. Make the next version of the MPI spec *require* using the
> mpi_alloc stuff.

The MPI Forum (the standards body) has been very resistant to this,
especially when the requirement comes from a single, not-pervasive
network stack.  It would effectively break all legacy MPI
applications, too.  I seriously doubt that the Forum would go for
that.

FWIW: the way the MPI spec is worded, it says that you *may* get a
performance benefit from using MPI_ALLOC_MEM.  E.g., an MPI can always
support using malloc'd buffers -- just copy into network-special
buffers.  The performance would be terrible :-), but it would be
correct.
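
To illustrate that fallback, here's a rough sketch of staging a
malloc'd user buffer through a pre-registered bounce buffer.
bounce_buf and post_send_registered() are made-up stand-ins for
whatever the transport layer actually provides:

#include <stdio.h>
#include <string.h>
#include <stddef.h>

#define BOUNCE_LEN (64 * 1024)

static char bounce_buf[BOUNCE_LEN];    /* registered once at startup */

/* Stand-in for posting a send from an already-registered buffer. */
static void post_send_registered(const void *reg_buf, size_t len)
{
    (void)reg_buf;
    printf("posted %zu bytes from the registered bounce buffer\n", len);
}

void send_from_malloc_buffer(const void *user_buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        size_t chunk = (len - off < BOUNCE_LEN) ? len - off : BOUNCE_LEN;

        /* This extra copy is exactly where the performance goes. */
        memcpy(bounce_buf, (const char *)user_buf + off, chunk);
        post_send_registered(bounce_buf, chunk);
        off += chunk;
    }
}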

> 2. MPI already requires recompilation of apps, right?  I don't know
> Fortran, or what it uses for allocation, but worst case, maybe you
> could change the standard libraries or compilers.

We tried that -- interposing our own copies of malloc, free, mmap,
... etc. (e.g., inside libmpi).  Ick.  Horrible, horrible ick.  And it
definitely breaks some real-world apps and memory-checking
debuggers/tools.
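
For those who haven't seen this kind of thing: one generic way to
interpose free() is via the dynamic linker, roughly as below (build
with -ldl).  This is purely illustrative -- Open MPI's actual hooks
worked differently, and mpi_reg_cache_invalidate() is a made-up name
for "evict this address from the registration cache":

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Stand-in: real code would evict any pinned region covering addr. */
static void mpi_reg_cache_invalidate(void *addr)
{
    (void)addr;
}

/* Our free() shadows libc's; the real one is looked up lazily. */
void free(void *ptr)
{
    static void (*real_free)(void *);

    if (real_free == NULL)
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");

    if (ptr != NULL)
        mpi_reg_cache_invalidate(ptr);   /* keep the pin cache coherent */

    real_free(ptr);
}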

> 3. Rip out your registration cache.  Make malloc'd buffers go really
> slow (register in fast path) and mpi_alloc_mem() buffers go really
> fast.  People will migrate.  The hard part of this would be getting
> all MPIs to agree on this, I'm guessing.


See http://lists.openfabrics.org/pipermail/general/2009-May/059376.html
-- Open MPI effectively tried this and got beat up by a) competing
MPIs, and b) the marketing supporting Open MPI.  :-\

People won't migrate, nor will main-line MPI benchmarks.  Customers  
want top performance out-of-the-box with their MPI (which is not  
unreasonable).  Users have used malloc() for 10+ years, and other  
networks don't require the use of MPI_ALLOC_MEM.
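
For reference, the registration cache being argued about above is
conceptually something like this toy sketch.  nic_register() and the
flat table are made-up simplifications; real caches use trees and have
to handle invalidation when the app free()s or munmap()s the memory
(which is what the interposition mess earlier in this mail was for):

#include <stddef.h>

struct reg_entry {
    void  *addr;
    size_t len;
    int    handle;    /* stand-in for a real memory-region handle */
};

#define CACHE_SLOTS 64
static struct reg_entry cache[CACHE_SLOTS];

/* Stand-in for the expensive pin/register operation. */
static int nic_register(void *addr, size_t len)
{
    static int next_handle = 1;
    (void)addr; (void)len;
    return next_handle++;
}

int get_registration(void *addr, size_t len)
{
    int i, handle;

    /* Fast path: this range was pinned by an earlier send. */
    for (i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].addr == addr && cache[i].len >= len)
            return cache[i].handle;

    /* Slow path: pin it now and remember it for next time. */
    handle = nic_register(addr, len);
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].addr == NULL) {
            cache[i].addr   = addr;
            cache[i].len    = len;
            cache[i].handle = handle;
            break;
        }
    }
    return handle;
}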

-- 
Jeff Squyres
Cisco Systems



