[ewg] [PATCH v2] libibverbs: ibv_fork_init() and libhugetlbfs

Alexander Schmidt alexs at linux.vnet.ibm.com
Mon Jun 28 08:18:29 PDT 2010


On Thu, 10 Jun 2010 17:59:28 +0300
Alex Vainman <alexonlists at gmail.com> wrote:

> Wrote Roland Dreier:
> > Thanks, nice work.  I like this approach.  Alex (Vainman) any comments
> > on this?
> > 
> >  - R.
> 
> The solution looks great.

Hi all,

In further testing, we found a serious problem with the current solution.
Depending on the order of memory registrations, we can end up with a corrupted
node tree that blocks regions from being registered.

When registering two memory regions A and B from within the same huge page,
we end up, after registering A, with one node in the tree that covers the
whole huge page. When the second MR is registered, a node is created with the
MR size rounded to the system page size (since madvise() does not need to be
called again, the code never notices that MR B is part of a huge page).
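
To make the granularity mismatch concrete, here is a minimal sketch of the two
rounding calculations. It is not the libibverbs code: the struct, the helper
names, and the fixed 4 KiB / 2 MiB page sizes are assumptions made only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for illustration; real code would query the system. */
#define SYS_PAGE_SIZE  4096UL
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Hypothetical stand-in for the address range tracked by one tree node. */
struct page_range {
    uintptr_t start;  /* first byte covered */
    uintptr_t end;    /* last byte covered (inclusive) */
};

/* Round [addr, addr + len) outward to whole pages of page_size bytes. */
static struct page_range round_to_pages(uintptr_t addr, size_t len,
                                        unsigned long page_size)
{
    struct page_range r;

    /* The mask arithmetic below needs a non-empty range and a
     * power-of-two page size. */
    assert(len > 0);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    r.start = addr & ~(uintptr_t)(page_size - 1);
    r.end = ((addr + len - 1) & ~(uintptr_t)(page_size - 1)) + page_size - 1;
    return r;
}

/* Returns 1 if inner lies completely inside outer, 0 otherwise. */
static int range_contains(struct page_range outer, struct page_range inner)
{
    return outer.start <= inner.start && inner.end <= outer.end;
}
```

With MR A rounded using HUGE_PAGE_SIZE and MR B rounded using SYS_PAGE_SIZE,
B's range lies strictly inside A's range, so the same huge page is tracked at
two different granularities.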

Now if MR A is deregistered before MR B, I see that the tree containing
mem_nodes is empty afterwards, which causes problems for the deregistration of
MR B, leaving the tree in a corrupted state with negative refcounts. This also
breaks later registrations of other memory regions within this huge page.

At the moment I do not see an obvious solution for this, but it's clear that
an overhaul of this code is needed. I'm writing this to make sure that there
won't be a release of libibverbs containing this incomplete code, but also
to ask for comments from other people who might have an idea on how to fix
this.

Thanks for any comments!

Alex


