[ofw] RE: [HW] memory allocation improvement in user space

Leonid Keller leonid at mellanox.co.il
Sun Jun 29 09:05:31 PDT 2008


Seems like it wasn't sent for some reason.
Resending ... 

> -----Original Message-----
> From: Leonid Keller 
> Sent: Tuesday, June 24, 2008 1:14 PM
> To: 'Fab Tillier'; ofw at lists.openfabrics.org
> Subject: RE: [HW] memory allocation improvement in user space
> 
> See inline 
> 
> > -----Original Message-----
> > From: Fab Tillier [mailto:ftillier at windows.microsoft.com]
> > Sent: Monday, June 23, 2008 8:55 PM
> > To: Leonid Keller; ofw at lists.openfabrics.org
> > Subject: RE: [HW] memory allocation improvement in user space
> > 
> > Hi Leo,
> > 
> > >From: Leonid Keller
> > >Sent: Monday, June 23, 2008 6:44 AM
> > >
> > >Investigating INSUFFICIENT_MEMORY failures in our stress tests led us
> > >to a "revelation": the VirtualAlloc function, which is used to
> > >implement posix_memalign, is very "greedy": it allocates at least
> > >64KB of memory.
> > >Since we usually ask for one page, that is 16 times more than
> > >necessary.
> > 
> > VirtualAlloc doesn't actually allocate 64K of memory, but it does
> > reserve 64K of virtual address space.  It then commits a single page.
> > This article:
> > http://msdn.microsoft.com/en-us/library/ms810627.aspx
> > explains this behavior: "The minimum size that can be reserved is 64K
> > ... Requesting one page of reserved addresses results in a 64K address
> > range" in the section about reserving memory.
> 
> I know that. But for one, handling reservation and allocation
> separately requires a complicated housekeeping system, and for two,
> we do not always need to allocate whole pages.
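
To make the numbers in this exchange concrete, here is a minimal sketch in
portable C of the arithmetic only. It assumes a 4 KB page (the thread does not
state the page size) and uses the 64 KB reservation granularity quoted from the
MSDN article. The helper names are hypothetical and no Windows API is called.

```c
#include <assert.h>
#include <stddef.h>

/* 64 KB is the reservation granularity quoted in the thread.
 * The 4 KB page size is an assumption, not stated in the thread. */
#define ALLOC_GRANULARITY ((size_t)64 * 1024)
#define PAGE_SIZE_BYTES   ((size_t)4 * 1024)

/* Round n up to the next multiple of unit. */
static size_t round_up(size_t n, size_t unit)
{
    assert(unit != 0);
    return ((n + unit - 1) / unit) * unit;
}

/* Virtual address space consumed by one reservation of `size` bytes. */
static size_t reserved_bytes(size_t size)
{
    return round_up(size, ALLOC_GRANULARITY);
}

/* Memory committed for the same request, in whole pages. */
static size_t committed_bytes(size_t size)
{
    return round_up(size, PAGE_SIZE_BYTES);
}
```

For a one-page request, reserved_bytes(4096) is 65536 while
committed_bytes(4096) is 4096: the factor of 16 mentioned above, counted in
address space rather than in committed memory.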
> > 
> > >Below is a patch that implements posix_memalign with
> > >(ultimately) the HeapAlloc functions.
> > >The patch was tested and worked OK.
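
The patch itself is not reproduced in this message. As a rough illustration of
how an aligned allocation can be layered on a general-purpose heap, here is a
sketch of the common over-allocate-and-align technique. It is not the patch:
malloc/free stand in for HeapAlloc/HeapFree so the code builds anywhere, and
the names heap_memalign and heap_memalign_free are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: posix_memalign-style allocation on top of a heap.
 * malloc is an assumed stand-in for HeapAlloc. */
int heap_memalign(void **memptr, size_t alignment, size_t size)
{
    assert(memptr != NULL);

    /* POSIX requires a power of two that is a multiple of sizeof(void *). */
    if (alignment < sizeof(void *) || (alignment & (alignment - 1)) != 0)
        return EINVAL;

    /* Reject sizes that would overflow the padded request. */
    if (size > SIZE_MAX - alignment - sizeof(void *))
        return ENOMEM;

    /* Over-allocate: alignment slack plus room for one saved pointer. */
    unsigned char *raw = malloc(size + alignment + sizeof(void *));
    if (raw == NULL)
        return ENOMEM;

    /* First aligned address that leaves space for the saved pointer. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~((uintptr_t)alignment - 1);

    /* Remember the heap block just below the aligned address. */
    ((void **)aligned)[-1] = raw;
    *memptr = (void *)aligned;
    return 0;
}

/* Release a block obtained from heap_memalign. */
void heap_memalign_free(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

Under these assumptions a page-aligned, one-page request costs roughly two
pages of heap plus one pointer, instead of a 64K address range per allocation.
The trade-off is that such blocks must be released through the matching free
routine, since the pointer handed to the caller is not the one the heap
returned.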
> > >
> > >An important nuance that was revealed during testing is as follows:
> > >A system function that releases the resources of an exiting process
> > >somehow breaks the operation of the MmSecureVirtualMemory function,
> > >which we use today to secure CQ\QP\SRQ circular buffers and user
> > >buffers.
> > >If an application gets killed or exits without releasing its
> > >resources, IBBUS catches this event, starts its cascading destroy of
> > >resources and crashes on MmUnsecureVirtualMemory.
> > >Putting MmUnsecureVirtualMemory in a try-except block prevents the
> > >crash, but an async thread that releases QPs then hangs in
> > >MmUnsecureVirtualMemory, which fails to acquire some mutex.
> > 
> > Maybe the first exception didn't release the mutex?
> 
> Maybe. It's not my mutex. It's theirs, an internal one.
> The only change that I made (in user space) was to allocate
> memory on the heap and not as separate pages.
> The MmSecureVirtualMemory/MmUnsecureVirtualMemory pair works in
> the normal flow, when the resources are released.
> MmUnsecureVirtualMemory fails with an exception only when the
> application exits without releasing the resources.
> 
> > 
> > >Since there is no real reason to secure circular buffers, I've
> > >solved the problem by skipping securing for IB objects.
> > >User buffers are still secured during memory registration.
> > >
> 
> BTW, the DDK *doesn't* recommend using this function.
> Its purpose is to keep a user buffer from being freed or
> having its access rights changed.
> We do not need this functionality for circular buffers anyway.
> 
> > >Therefore the patch contains 3 kinds of changes:
> > >1) new implementation of posix_memalign and all related to that;
> > >2) try-except block around MmUnsecureVirtualMemory;
> > 
> > I'd be wary of just working around the issue without understanding
> > why it crashes, especially since you found some negative ramifications
> > to this (the subsequent hang).  What if there's some memory corruption
> > problem in the UVP - shouldn't that be fixed rather than hidden?
> > 
> > >3) new parameter in ib_umem_get and mthca_reg_virt_mr for skipping 
> > >memory securing and all related to it;
> > 
> > -Fab
> > 


