[ofa-general] Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

Fri Feb 29 14:41:44 PST 2008

On Fri, Feb 29, 2008 at 02:12:57PM -0800, Christoph Lameter wrote:
> On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> 
> > > AFAICT The rw semaphore fastpath is similar in performance to a rw 
> > > spinlock. 
> > 
> > read side is taken in the slow path.
> 
> Slowpath meaning VM slowpath or lock slow path? Its seems that the rwsem 

With slow path I meant the VM. Sorry if that was confusing given locks
also have fast paths (no contention) and slow paths (contention).

> read side path is pretty efficient:

Yes. The assembly doesn't worry me at all.

> > pagefault is fast path, VM during swapping is slow path.
> 
> Not sure what you are saying here. A pagefault should be considered as a 
> fast path and swapping is not performance critical?

Yes, swapping is I/O bound and it rarely becomes CPU hog in the common
case.

There are corner case workloads (including OOM) where swapping can
become cpu bound (that's also where rwlock helps). But certainly the
speed of fork() and a page fault, is critical for _everyone_, not just
a few workloads and setups.

> Ok too many calls to schedule() because the slow path (of the semaphore) 
> is taken?

Yes, that's the only possible worry when converting a spinlock to
mutex.

> But that is only happening for the contended case. Certainly a spinlock is 
> better for 2p system but the more processors content for the lock (and 
> the longer the hold off is, typical for the processors with 4p or 8p or 
> more) the better a semaphore will work.

Sure. That's also why the PT lock switches for >4way compiles. Config
option helps to keep the VM optimal for everyone. Here it is possible
it won't be necessary but I can't be sure given both i_mmap_lock and
anon-vma lock are used in some many places. Some TPC comparison would
be nice before making a default switch IMHO.