[ofa-general] Re: [PATCH 08 of 11] anon-vma-rwsem
Robin Holt
holt at sgi.com
Wed May 14 09:22:24 PDT 2008
On Wed, May 14, 2008 at 08:18:21AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 14 May 2008, Robin Holt wrote:
> >
> > Are you suggesting the sending side would not need to sleep or the
> > receiving side?
>
> One thing to realize is that most of the time (read: pretty much *always*)
> when we have the problem of wanting to sleep inside a spinlock, the
> solution is actually to just move the sleeping to outside the lock, and
> then have something else that serializes things.
>
> That way, the core code (protected by the spinlock, and in all the hot
> paths) doesn't sleep, but the special case code (that wants to sleep) can
> have some other model of serialization that allows sleeping, and that
> includes as a small part the spinlocked region.
>
> I do not know how XPMEM actually works, or how you use it, but it
> seriously sounds like that is how things *should* work. And yes, that
> probably means that the mmu-notifiers as they are now are simply not
> workable: they'd need to be moved up so that they are inside the mmap
> semaphore but not the spinlocks.
We are in the process of attempting this now. Unfortunately for SGI,
Christoph is on vacation right now so we have been trying to work it
internally.
We are looking through two possible methods, one we add a callout to the
tlb flush paths for both the mmu_gather and flush_tlb_page locations.
The other we place a specific callout seperate from the gather callouts
in the paths we are concerned with. We will look at both more carefully
before posting.
In either implementation, not all call paths would require the stall
to ensure data integrity. Would it be acceptable to always put a
sleepable stall in even if the code path did not require the pages be
unwritable prior to continuing? If we did that, I would be freed from
having a pool of invalidate threads ready for XPMEM to use for that work.
Maybe there is a better way, but the sleeping requirement we would have
on the threads make most options seem unworkable.
Thanks,
Robin
More information about the general
mailing list