[ofa-general] Re: [PATCH 08 of 11] anon-vma-rwsem

Andrea Arcangeli andrea at qumranet.com
Fri May 9 11:55:53 PDT 2008


On Fri, May 09, 2008 at 08:37:29PM +0200, Peter Zijlstra wrote:
> Another possibility, would something like this work?
> 
>  
>  /*
>   * null out the begin function, no new begin calls can be made
>   */
>  rcu_assing_pointer(my_notifier.invalidate_start_begin, NULL); 
> 
>  /*
>   * lock/unlock all rmap locks in any order - this ensures that any
>   * pending start() will have its end() function called.
>   */
>  mm_barrier(mm);
> 
>  /*
>   * now that no new start() call can be made and all start()/end() pairs
>   * are complete we can remove the notifier.
>   */
>  mmu_notifier_remove(mm, my_notifier);
> 
> 
> This requires a mmu_notifier instance per attached mm and that
> __mmu_notifier_invalidate_range_start() uses rcu_dereference() to obtain
> the function.
> 
> But I think its enough to ensure that:
> 
>   for each start an end will be called

We don't need that, it's perfectly ok if start is called but end is
not, it's ok to unregister in the middle as I guarantee ->release is
called before mmu_notifier_unregister returns (if ->release is needed
at all, not the case for KVM/GRU).

Unregister is already solved with srcu/rcu without any additional
complication as we don't need the guarantee that for each start an end
will be called.

> It can however happen that end is called without start - but we could
> handle that I think.

The only reason mm_lock() was introduced is to solve "register", to
guarantee that for each end there was a start. We can't handle end
called without start in the driver.

The reason the driver must be prevented to register in the middle of
start/end, if that if it ever happens the driver has no way to know it
must stop the secondary mmu page faults to call get_user_pages and
instantiate sptes/secondarytlbs on pages that will be freed as soon as
zap_page_range starts.



More information about the general mailing list