[ofa-general] New proposal for memory management

Aaron Fabbri (aafabbri) aafabbri at cisco.com
Thu Apr 30 16:20:26 PDT 2009


 

> -----Original Message-----
> From: Jeff Squyres (jsquyres) 
> Sent: Thursday, April 30, 2009 6:19 AM
> To: Aaron Fabbri (aafabbri)
> Cc: general at lists.openfabrics.org
> Subject: Re: [ofa-general] New proposal for memory management
> 
> On Apr 30, 2009, at 1:30 AM, Aaron Fabbri (aafabbri) wrote:
> 
> > Have you considered changing the MPI API to require applications to
> > use MPI to allocate any/all buffers that may be used for network I/O?
> > That is, instead of calling malloc() et al., call a new mpi_malloc()
> > which allocates from pre-registered memory.
> 
> 
> Yes, MPI_ALLOC_MEM / MPI_FREE_MEM calls have been around for 
> a long time (~10 years?).  Using them does avoid many of the 
> problems that have been discussed.  Most (all?) MPI's either 
> support ALLOC_MEM / FREE_MEM by registering at allocation 
> time and unregistering at free time, or some variation of that.
>

Ah.  Are there any problems that are not addressed by having MPI own
allocation of network bufs?

(BTW registering for each allocation could be improved, I think.)
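For concreteness, here is roughly what the ALLOC_MEM path looks like from
the application side (standard MPI-2 calls, error checking omitted).  The
library is free to hand back memory it has already registered, e.g. a
chunk carved out of a big pre-registered slab, which is the kind of
improvement I mean:

#include <mpi.h>

int main(int argc, char **argv)
{
    void *buf;

    MPI_Init(&argc, &argv);

    /* Ask MPI for the buffer instead of calling malloc().  The library
     * can register (pin) it here, or hand back part of a pool it has
     * already registered. */
    MPI_Alloc_mem(1 << 20, MPI_INFO_NULL, &buf);

    /* ... use buf with MPI_Send/MPI_Recv/one-sided ops ... */

    /* Unregistering (or returning to the pool) happens at free time. */
    MPI_Free_mem(buf);

    MPI_Finalize();
    return 0;
}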

> But unfortunately, very few MPI apps use these calls; they use
> malloc() and friends instead.  Or they're written in Fortran, 
> where such concepts are not easily mapped (don't 
> underestimate how much Fortran MPI code runs on verbs!).  
> Indeed, in some layered scenarios, it's not easy to use these 
> calls (e.g., if an MPI-enabled computational library may 
> re-use user-provided buffers because they're so large, etc.).

I understand the difficulty.  A couple of possible counterpoints:

1. Make the next version of the MPI spec *require* using the
mpi_alloc_mem stuff.

2. MPI already requires recompilation of apps, right?  I don't know
Fortran, or what it uses for allocation, but worst case, maybe you could
change the standard libraries or compilers.

3. Rip out your registration cache.  Make malloc'd buffers go really
slow (register/unregister in the critical path on every transfer) and
mpi_alloc_mem() buffers go really fast.  People will migrate.  The hard
part of this would be getting all MPIs to agree on it, I'm guessing.
There's a rough sketch of what I mean below.
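
To make (3) concrete, here is a rough sketch of the send-side dispatch I
have in mind.  All of the names (send_buffer, post_send, register_memory,
etc.) are made up for illustration; this isn't any real MPI's or verbs
internals:

#include <stddef.h>

/* Hypothetical types and helpers, for illustration only. */
struct endpoint;
struct mem_region;

int  from_alloc_mem_pool(const void *buf);
struct mem_region *lookup_pool_region(const void *buf);
struct mem_region *register_memory(void *buf, size_t len);
void deregister_memory(struct mem_region *mr);
int  post_send(struct endpoint *ep, void *buf, size_t len,
               struct mem_region *mr);
void wait_for_completion(struct endpoint *ep);

static int send_buffer(void *buf, size_t len, struct endpoint *ep)
{
    if (from_alloc_mem_pool(buf)) {
        /* Fast path: buffer came from MPI_Alloc_mem, already registered. */
        return post_send(ep, buf, len, lookup_pool_region(buf));
    }

    /* Slow path: plain malloc'd buffer.  With no registration cache we
     * register and deregister around every transfer, so the full cost is
     * paid each time. */
    struct mem_region *mr = register_memory(buf, len);
    int rc = post_send(ep, buf, len, mr);
    wait_for_completion(ep);   /* don't deregister while the NIC may use it */
    deregister_memory(mr);
    return rc;
}

The point is just that the malloc'd path pays full registration cost on
every send, while the mpi_alloc_mem() path never registers in the
critical path.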

Aaron

 


