[openib-general] Re: [PATCH repost] PROT_DONTCOPY: ifiniband uverbs fork support

Gleb Natapov glebn at voltaire.com
Wed Aug 10 06:26:12 PDT 2005


On Wed, Aug 10, 2005 at 02:22:40PM +0100, Hugh Dickins wrote:
> On Wed, 10 Aug 2005, Gleb Natapov wrote:
> > On Tue, Aug 09, 2005 at 07:13:33PM +0100, Hugh Dickins wrote:
> > > Even more I'd prefer one of these two solutions below, which sidestep
> > > that uncleanliness - but both of these would be in mmap only, no clean
> > > way to change afterwards (except by munmap or mmap MAP_FIXED):
> > > 
> > > 1.  Use the standard mmap(NULL, len, PROT_READ|PROT_WRITE,
> > >     MAP_SHARED|MAP_ANONYMOUS, -1, 0) which gives you a memory object
> > >     shared with children, so write-protection and COW won't come into it.
> > > 
> > > or if there's good reason why that's no good,
> > > 
> > > 2.  Define a MAP_DONTCOPY to mmap: we have a fine tradition of MAP_flags
> > >     to achieve this or that effect, adding one more would be cleaner than
> > >     now corrupting mprotect or madvise.
> > 
> > They are both relying on the way user allocates memory for RDMA. The idea
> > behind Michael's propose it to let library (MPI for instance) to tell to the
> > kernel that the pages are used for RDMA and it is not safe to copy them now. 
> > The pages may be anywhere in the process address space bss, text, stack
> > whatever.
> 
> That's a nice aim, but I don't think it can quite be done in the face of
> the fork issue - one way or another, we have to change the behaviour of a
> forked RDMA area slightly, which might interfere with common assumptions.
> 
> Your stack example is a good one: if we end up setting VM_DONTCOPY on
> the user stack, then I don't think fork's child will get very far without
> hitting a SIGSEGV.
I know, but I prefer child SIGSEGV than silent data corruption. In most
cases child will exec immediately after fork so no problem in this
case.

--
			Gleb.



More information about the general mailing list