[ofa-general] Re: [PATCH 08 of 11] anon-vma-rwsem

Wed May 14 06:15:32 PDT 2008

On Tue, May 13, 2008 at 10:43:59PM -0700, Benjamin Herrenschmidt wrote:
> On Tue, 2008-05-13 at 22:14 +1000, Nick Piggin wrote:
> > ea.
> > 
> > I don't see why you're bending over so far backwards to accommodate
> > this GRU thing that we don't even have numbers for and could actually
> > potentially be batched up in other ways (eg. using mmu_gather or
> > mmu_gather-like idea).
> 
> I agree, we're better off generalizing the mmu_gather batching
> instead...

Unfortunately, we are at least several months away from being able to
provide numbers to justify batching - assuming it is really needed.  We need
large systems running real user workloads. I wish we had that available
right now, but we don't.

It also depends on what you mean by "no batching". If you mean that the
notifier gets called for each pte that is removed from the page table, then
the overhead is clearly very high for some operations. Consider the unmap of
a very large object. A TLB flush per page will be too costly.

However, something based on the mmu_gather seems like it should provide
exactly what is needed to do efficient flushing of the TLB. The GRU does not
require that it be called in a sleepable context. As long as the notifier
callout provides the mmu_gather and vaddr range being flushed, the GRU can
efficiently do the rest.
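
To make the cost difference concrete, here is a rough userspace sketch of the
idea, not kernel code. Every name in it (toy_gather, toy_flush_fn, toy_unmap,
and so on) is hypothetical and only stands in for the real mmu_gather and
notifier interfaces. It assumes the pages being unmapped are contiguous, so
that a single [start, end) range describes the whole flush.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a secondary-TLB flush callout. */
typedef void (*toy_flush_fn)(unsigned long start, unsigned long end, void *ctx);

/* Toy analogue of mmu_gather: it only tracks the vaddr range touched so far. */
struct toy_gather {
    unsigned long start;  /* lowest address recorded so far */
    unsigned long end;    /* one past the highest address recorded */
    toy_flush_fn flush;   /* callout invoked once per finished batch */
    void *ctx;
};

static void toy_gather_init(struct toy_gather *tg, toy_flush_fn flush, void *ctx)
{
    tg->start = ~0UL;     /* empty range: start > end */
    tg->end = 0;
    tg->flush = flush;
    tg->ctx = ctx;
}

/* Record one unmapped page; no callout is made here. */
static void toy_gather_page(struct toy_gather *tg, unsigned long vaddr)
{
    if (vaddr < tg->start)
        tg->start = vaddr;
    if (vaddr + TOY_PAGE_SIZE > tg->end)
        tg->end = vaddr + TOY_PAGE_SIZE;
}

/* Issue a single callout covering everything gathered, then reset. */
static void toy_gather_finish(struct toy_gather *tg)
{
    if (tg->start < tg->end)
        tg->flush(tg->start, tg->end, tg->ctx);
    tg->start = ~0UL;
    tg->end = 0;
}

/* Counters used to compare the two strategies. */
struct toy_stats {
    unsigned long calls;  /* number of flush callouts */
    unsigned long pages;  /* total pages covered by those callouts */
};

static void toy_count_flush(unsigned long start, unsigned long end, void *ctx)
{
    struct toy_stats *s = ctx;

    assert(start < end);
    s->calls++;
    s->pages += (end - start) / TOY_PAGE_SIZE;
}

/*
 * Unmap npages contiguous pages starting at base.
 * batched == 0: one callout per page (the "no batching" case).
 * batched != 0: one callout for the whole range.
 */
static struct toy_stats toy_unmap(unsigned long base, unsigned long npages,
                                  int batched)
{
    struct toy_stats stats = { 0, 0 };
    struct toy_gather tg;
    unsigned long i;

    toy_gather_init(&tg, toy_count_flush, &stats);
    for (i = 0; i < npages; i++) {
        toy_gather_page(&tg, base + i * TOY_PAGE_SIZE);
        if (!batched)
            toy_gather_finish(&tg);
    }
    toy_gather_finish(&tg);  /* no-op if nothing is pending */
    return stats;
}
```

In this toy model, unmapping N contiguous pages costs N callouts without
batching and one callout with it, while covering the same pages either way.
Nothing in the callout needs to sleep.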

> 
> I had some never-finished patches to use the mmu_gather for pretty much
> everything except single page faults, tho various subtle differences
> between archs and lack of time caused me to let them take the dust and
> not finish them...
> 
> I can try to dig some of that out when I'm back from my current travel,
> though it's probably worth re-doing from scratch now.
> 
> Ben.
> 

-- jack