[Openib-windows] Using of fast mutexes in WinIb

Thu Nov 17 06:31:30 PST 2005

SB

> -----Original Message-----
> From: Fab Tillier [mailto:ftillier at silverstorm.com]
> Sent: Wednesday, November 16, 2005 7:24 PM
> To: 'Leonid Keller'
> Cc: openib-windows at openib.org
> Subject: RE: [Openib-windows] Using of fast mutexes in WinIb
> 
> 
> Hi Leo,
> 
> > From: Leonid Keller [mailto:leonid at mellanox.co.il]
> > Sent: Tuesday, November 15, 2005 7:32 AM
> > 
> > Hi Fab,
> > I came across the following problem: implementing
> > cl_mutex_acquire() via fast mutexes causes all the code in the critical
> > section to run at APC_LEVEL.
> > 
> > The first case I saw was at start-up of the IPoIB driver, which takes a
> > mutex in __ipoib_pnp_cb and so makes all MTHCA driver control verbs run
> > at APC_LEVEL, which is troublesome
> > (e.g., create_cq calls AllocateCommonBuffer, which requires
> > PASSIVE_LEVEL).
> 
> Why does create_cq call AllocateCommonBuffer?  Are you making multiple
> page-sized calls instead of a single larger call for the cases where the
> memory required spans multiple pages?  Physically contiguous memory is a
> scarce resource, so any time you can break up your requests into
> page-sized requests, the better.

The allocation algorithm first tries to allocate one contiguous buffer.
If that fails, it requests smaller buffers. In the worst case it will
allocate N buffers of one page each.
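
The fallback described above could be sketched roughly as follows in
user-mode C. This is only an illustration, not the MTHCA code:
alloc_contig(), max_contig_pages, buf_list_t, alloc_buf_list() and
free_buf_list() are hypothetical names, malloc() stands in for the real
contiguous allocator, and halving the request after each failure is an
assumption (all that is stated above is that smaller buffers are requested).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE  4096u
#define MAX_CHUNKS 64u

/* Hypothetical knob: largest request (in pages) the stub allocator will
 * satisfy. Lowering it simulates fragmented physical memory. */
static size_t max_contig_pages = 64;

/* Hypothetical stand-in for the real contiguous-memory allocator. */
static void *alloc_contig(size_t pages)
{
    if (pages == 0 || pages > max_contig_pages)
        return NULL;
    return malloc(pages * PAGE_SIZE);
}

typedef struct { void *va; size_t pages; } chunk_t;
typedef struct { chunk_t chunk[MAX_CHUNKS]; size_t count; } buf_list_t;

/* Release every chunk recorded in the list and reset it. */
static void free_buf_list(buf_list_t *list)
{
    size_t i;
    for (i = 0; i < list->count; i++)
        free(list->chunk[i].va);
    list->count = 0;
}

/* Try one contiguous buffer first; on failure retry with smaller chunks
 * (halving is an assumption), down to single pages in the worst case.
 * Returns 0 on success, -1 on failure (nothing stays allocated). */
static int alloc_buf_list(size_t total_pages, buf_list_t *list)
{
    size_t chunk_pages = total_pages;
    size_t remaining = total_pages;

    assert(list != NULL);
    list->count = 0;
    if (total_pages == 0)
        return -1;

    while (remaining > 0) {
        void *va = NULL;

        if (chunk_pages > remaining)
            chunk_pages = remaining;
        if (list->count < MAX_CHUNKS)
            va = alloc_contig(chunk_pages);

        if (va != NULL) {
            list->chunk[list->count].va = va;
            list->chunk[list->count].pages = chunk_pages;
            list->count++;
            remaining -= chunk_pages;
        } else if (chunk_pages > 1 && list->count < MAX_CHUNKS) {
            chunk_pages = (chunk_pages + 1) / 2;  /* retry with a smaller chunk */
        } else {
            free_buf_list(list);                  /* give up, undo partial work */
            return -1;
        }
    }
    return 0;
}
```

With max_contig_pages set to 1, a request for 8 pages ends up as 8
single-page chunks, which is the worst case mentioned above.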

I used AllocateCommonBuffer in the first place because it returns bus
addresses.

The same thing can be implemented so that it works at DISPATCH_LEVEL:
	va = MmAllocateContiguousMemorySpecifyCache(...);
	p_mdl = IoAllocateMdl( va, ...);
	MmBuildMdlForNonPagedPool( p_mdl );
	la = p_adapter->DmaOperations->MapTransfer(p_adapter, p_mdl ...);

It has two minor drawbacks, as far as I can see:
	1) MmAllocateContiguousMemorySpecifyCache always allocates a whole
number of pages;
	2) MapTransfer fails when the number of map registers is
exceeded. (But AllocateCommonBuffer may also fail in that case.)

What do you think?

> 
> Is this for the CQE ring buffer?  Why not use regular memory?  The memory
> has to be registered with the HCA anyway, right?
> 
> > There are several ways to solve the problem:
> > 	1) Implement cl_mutex_acquire() via regular mutexes. This is a
> > general and safe, but somewhat inefficient, way.
> > 	2) Add "regular mutexes" to complib and use them in that
> > function. It's more efficient, but then we need to check/fix all other
> > uses of fast mutexes.
> 
> I don't think we should add a new mutex abstraction to complib that only
> has meaning in the Windows kernel.  I'd much rather see the code just make
> native NT calls for regular mutexes.
> 
> > 	3) Invent some other sync mechanism for that function. This leaves
> > the problem with fast mutexes in other functions.
> > 
> > What do you think?
> 
> Can we change these on a case-by-case basis?  That is, change the mutex in
> IPoIB's PnP callback to a regular mutex for now, and change any others as
> we go forward?
> 

> Finding incorrect IRQL issues should be quick if your calls assert that
> IRQL is correct.
> 
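
To illustrate the kind of check meant here, below is a small user-mode
model in C of the problem discussed in this thread. It is a sketch only:
current_irql, irql_ok() and the *_model functions are hypothetical
stand-ins, not kernel or complib APIs. In a real driver the check would be
an assertion on the value returned by KeGetCurrentIrql().

```c
#include <assert.h>

/* The three lowest IRQLs, with the same numeric order as in the kernel. */
typedef enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 } irql_t;

/* Hypothetical model of the current processor IRQL. */
static irql_t current_irql = PASSIVE_LEVEL;

typedef struct { irql_t old_irql; } fast_mutex_model_t;

/* Model of acquiring a fast mutex: the caller must be at or below
 * APC_LEVEL, and the critical section then runs at APC_LEVEL. */
static void fast_mutex_acquire_model(fast_mutex_model_t *m)
{
    assert(current_irql <= APC_LEVEL);
    m->old_irql = current_irql;
    current_irql = APC_LEVEL;
}

/* Model of releasing a fast mutex: the previous IRQL is restored. */
static void fast_mutex_release_model(fast_mutex_model_t *m)
{
    current_irql = m->old_irql;
}

/* The check every verb would make on entry: 1 if the IRQL is acceptable. */
static int irql_ok(irql_t max_allowed)
{
    return current_irql <= max_allowed;
}

/* Model of a verb that needs PASSIVE_LEVEL (like create_cq in this
 * thread). Returns 0 on success, -1 if called at too high an IRQL. */
static int create_cq_model(void)
{
    if (!irql_ok(PASSIVE_LEVEL))
        return -1;
    return 0;
}
```

Calling create_cq_model() between fast_mutex_acquire_model() and
fast_mutex_release_model() fails immediately, so a wrong call path shows up
at the first call rather than as a later, harder to trace failure.
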
> BTW, have you had a chance to look over the new verb API that would allow
> calls at IRQL <= DISPATCH_LEVEL?  Any comments?  I'd really like to get
> past these IRQL limitations.

Not yet, sorry. I have been busy preparing the driver for the release.
I hope I'll find time soon.

> 
> - Fab
>