[ofa-general] Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges
Christoph Lameter
clameter at sgi.com
Wed Feb 27 14:35:59 PST 2008
On Wed, 20 Feb 2008, Nick Piggin wrote:
> On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> > The invalidation of address ranges in a mm_struct needs to be
> > performed when pages are removed or permissions etc change.
> >
> > If invalidate_range_begin() is called with locks held then we
> > pass a flag into invalidate_range() to indicate that no sleeping is
> > possible. Locks are only held for truncate and huge pages.
>
> You can't sleep inside rcu_read_lock()!

Could you be specific? Does this refer to page migration? Hmmm... I guess we
would need to increment the refcount there instead?

> I must say that for a patch that is up to v8 or whatever and is
> posted twice a week to such a big cc list, it is kind of slack to
> not even test it and expect other people to review it.

It was tested with the GRU and XPmem. Andrea also reported success.

> Also, what we are going to need here are not skeleton drivers
> that just do all the *easy* bits (of registering their callbacks),
> but actual fully working examples that do everything that any
> real driver will need to do. If not for the sanity of the driver
> writer, then for the sanity of the VM developers (I don't want
> to have to understand xpmem or infiniband in order to understand
> how the VM works).

There are 3 different drivers that can already use it, but their code is
complex and not easy to review. Skeletons make it easy for people to get
started.

> >      lru_add_drain();
> >      tlb = tlb_gather_mmu(mm, 0);
> >      update_hiwater_rss(mm);
> > +    mmu_notifier(invalidate_range_begin, mm, address, end, atomic);
> >      end = unmap_vmas(&tlb, vma, address, end, &nr_accounted, details);
> >      if (tlb)
> >              tlb_finish_mmu(tlb, address, end);
> > +    mmu_notifier(invalidate_range_end, mm, address, end, atomic);
> >      return end;
> >  }
> >
>
> Where do you invalidate for munmap()?

zap_page_range() called from unmap_vmas().

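As a rough illustration of the begin/end bracketing in the quoted hunk, here is a minimal userspace sketch in C. Every name in it (range_notifier, demo_begin, demo_end, zap_range_model) is a hypothetical stand-in, not the kernel API and not code from the patch. It only models the ordering: notify every registered listener, tear down the mappings, then notify again, passing along a flag that says whether sleeping is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are the kernel API. */
struct range_notifier {
    void (*invalidate_range_begin)(struct range_notifier *n,
                                   unsigned long start, unsigned long end,
                                   bool atomic);
    void (*invalidate_range_end)(struct range_notifier *n,
                                 unsigned long start, unsigned long end,
                                 bool atomic);
    int begin_calls;
    int end_calls;
    int deferred;                 /* invalidations that were not allowed to sleep */
    struct range_notifier *next;  /* next registered listener */
};

/* Example listener: counts calls and records when it must not sleep. */
static void demo_begin(struct range_notifier *n, unsigned long start,
                       unsigned long end, bool atomic)
{
    (void)start;
    (void)end;
    n->begin_calls++;
    if (atomic)
        n->deferred++;  /* cannot sleep here: note the work for later */
}

static void demo_end(struct range_notifier *n, unsigned long start,
                     unsigned long end, bool atomic)
{
    (void)start;
    (void)end;
    (void)atomic;
    n->end_calls++;
}

/* Model of the bracketing: notify, tear down mappings, notify again. */
static unsigned long zap_range_model(struct range_notifier *list,
                                     unsigned long start, unsigned long end,
                                     bool atomic)
{
    struct range_notifier *n;

    assert(start <= end);
    for (n = list; n; n = n->next)
        n->invalidate_range_begin(n, start, end, atomic);

    /* ... the page table teardown would happen here ... */

    for (n = list; n; n = n->next)
        n->invalidate_range_end(n, start, end, atomic);
    return end;
}
```

In this model a listener that sees atomic set only records that work is pending. How a real driver completes that work is exactly the open question in this thread.
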
> Also, how do you resolve the case where you are not allowed to sleep?
> I would have thought either you have to handle it, in which case nobody
> needs to sleep; or you can't handle it, in which case the code is
> broken.

That can be done in a variety of ways:

1. Change VM locking
2. Not handle file backed mappings (XPmem could work mostly in such a config)
3. Keep the refcount elevated until pages are freed in another execution context.
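
To make option 3 a little more concrete, here is a minimal userspace sketch in C. It assumes a simple model in which the driver holds its own reference on each page it has exported. All names (page_model, deferred_list, invalidate_atomic, drain_deferred, put_page_model) are hypothetical and are not taken from the patch or from any driver. In the non-sleeping path the page is only queued, and the driver's reference is dropped later from a context that may sleep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of option 3; this is not kernel code. */
struct page_model {
    int refcount;                      /* outstanding references */
    int freed;                         /* set once the last reference is dropped */
    struct page_model *next_deferred;  /* link in the deferred list */
};

struct deferred_list {
    struct page_model *head;
};

/* Drop one reference; the page is freed when the count reaches zero. */
static void put_page_model(struct page_model *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0)
        p->freed = 1;
}

/*
 * Non-sleeping path: the driver cannot wait for the remote side here,
 * so it keeps its reference and just queues the page.
 */
static void invalidate_atomic(struct deferred_list *d, struct page_model *p)
{
    p->next_deferred = d->head;
    d->head = p;
}

/*
 * Sleepable path (for example a worker): finish the remote teardown,
 * then drop the driver's reference. Returns the number of pages handled.
 */
static int drain_deferred(struct deferred_list *d)
{
    int n = 0;

    while (d->head) {
        struct page_model *p = d->head;

        d->head = p->next_deferred;
        p->next_deferred = NULL;
        /* a real driver would wait for the remote side here */
        put_page_model(p);
        n++;
    }
    return n;
}
```

The point of the sketch is that the page stays allocated after the VM drops its own reference, and is only freed once the deferred work has run.
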