[ofa-general] Re: [PATCH 08 of 11] anon-vma-rwsem

Linus Torvalds torvalds at linux-foundation.org
Wed May 7 14:36:57 PDT 2008



On Wed, 7 May 2008, Andrea Arcangeli wrote:
> 
> I think the spinlock->rwsem conversion is ok under config option, as
> you can see I complained myself to various of those patches and I'll
> take care they're in a mergeable state the moment I submit them. What
> XPMEM requires are different semantics for the methods, and we never
> had to do any blocking I/O during vmtruncate before, now we have to.

I really suspect we don't really have to, and that it would be better to 
just fix the code that does that.

> Please ignore all patches but mmu-notifier-core. I regularly forward
> _only_ mmu-notifier-core to Andrew, that's the only one that is in
> merge-ready status, everything else is just so XPMEM can test and we
> can keep discussing it to bring it in a mergeable state like
> mmu-notifier-core already is.

The thing is, I didn't like that one *either*. I thought it was the 
biggest turd in the series (and by "biggest", I literally mean "most lines 
of turd-ness" rather than necessarily "ugliest per se").

I literally think that mm_lock() is an unbelievable piece of utter and 
horrible CRAP.

There's simply no excuse for code like that.

If you want to avoid the deadlock from taking multiple locks in order, but 
there is really just a single operation that needs it, there's a really 
really simple solution.

And that solution is *not* to sort the whole damn f*cking list in a 
vmalloc'ed data structure prior to locking!

Damn.

No, the simple solution is to just make up a whole new upper-level lock, 
and get that lock *first*. You can then take all the multiple locks at a 
lower level in any order you damn well please. 

And yes, it's one more lock, and yes, it serializes stuff, but:

 - that code had better not be critical anyway, because if it was, then 
   the whole "vmalloc+sort+lock+vunmap" sh*t was wrong _anyway_

 - parallelism is overrated: it doesn't matter one effing _whit_ if 
   something is a hundred times more parallel, if it's also a hundred 
   times *SLOWER*.

So dang it, flush the whole damn series down the toilet and either forget 
the thing entirely, or re-do it sanely.

And here's an admission that I lied: it wasn't *all* clearly crap. I did 
like one part, namely list_del_init_rcu(), but that one should have been 
in a separate patch. I'll happily apply that one.

		Linus



More information about the general mailing list