[ofa-general] Re: [PATCH 2/3] libmlx4 - Optimize memory allocation of QP buffers with 64K pages

Wed May 20 04:39:06 PDT 2009

On Wed, 20 May 2009 14:10:36 +0300
Eli Cohen <eli at dev.mellanox.co.il> wrote:

> On Wed, May 20, 2009 at 08:00:47AM +0200, sebastien dugue wrote:
> > 
> >   Well not really, because if we stay below MMAP_THRESHOLD, as we do
> > with 4K pages, the only overhead is malloc's chaining structure. The
> > extra space used to align the buffer is released before posix_memalign()
> > returns, but that does increase fragmentation of malloc's chunks.
> > 
> >   Also, for 4K pages, mmap() systematically results in a syscall whereas
> > posix_memalign() does not necessarily, but as we're not on a fast path
> > I'm not sure what would be best. I don't mind converting all QP buffer
> > allocations to mmap(), but I'd like to hear what people think.
> > 
> 
> If the only reasoning behind using a MMAP_THRESHOLD is to avoid the
> system call for smaller allocations,

  Well, that's not the only reason. From what I understand, for small
allocations, glibc's malloc can recycle freed heap chunks much more easily
than mmapped chunks. Also, the mmapped chunk must be zeroed by the kernel
before being handed to the user, which does not come for free.
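
  To make the two options concrete, here is a minimal sketch of both
allocation paths. It is not the libmlx4 code: the helper names
(round_up_to_page, buf_alloc_heap, buf_alloc_mmap and the matching free
functions) are hypothetical, and I'm assuming a Linux/glibc system where
MAP_ANONYMOUS is available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* assumption: glibc, exposes MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: round size up to a multiple of the system page size. */
static size_t round_up_to_page(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return (size + page - 1) / page * page;
}

/*
 * Heap path: page-aligned buffer from malloc's arena (for sizes below
 * malloc's mmap threshold). May not need a syscall, and the chunk can be
 * recycled by malloc after free(). Contents are NOT guaranteed to be zero.
 */
static void *buf_alloc_heap(size_t size)
{
	void *p = NULL;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	assert(size > 0);
	if (posix_memalign(&p, page, round_up_to_page(size)) != 0)
		return NULL;
	return p;
}

static void buf_free_heap(void *p)
{
	free(p);
}

/*
 * mmap path: always a syscall, always page aligned, and the kernel hands
 * back zero-filled pages for an anonymous private mapping.
 */
static void *buf_alloc_mmap(size_t size)
{
	void *p;

	assert(size > 0);
	p = mmap(NULL, round_up_to_page(size), PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* The caller must pass the same size that was given to buf_alloc_mmap(). */
static void buf_free_mmap(void *p, size_t size)
{
	if (p)
		munmap(p, round_up_to_page(size));
}
```

  Note that the mmap path has to carry the length around for munmap(),
whereas free() needs only the pointer. A uniform mmap scheme would
therefore need the buffer length stored alongside the pointer.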

> then I think we'd better use a
> uniform allocation scheme -- mmap -- as you proposed and not
> distinguish between the two cases.
> 

  I will respin those patches early next week if nobody disagrees with
this route.

  Thanks,

  Sebastien.