[ofa-general] Re: problem in follow_hugetlb_page on ppc64 architecture with get_user_pages

David Gibson dwg at au1.ibm.com
Tue Nov 6 14:31:19 PST 2007


On Tue, Nov 06, 2007 at 04:06:04PM +0100, Hoang-Nam Nguyen wrote:
> Hello Roland!
> > We currently see this when testing Infiniband on ppc64 with ehca +
> > hugetlbfs.
> > From reading the code this should also be an issue on other architectures.
> > Roland, Adam, are you aware of anything in this area with mellanox
> > Infiniband cards or other usages with I/O adapters?
> Below is a testcase demonstrating this problem. You need to install
> libhugetlbfs.so and run it as below:
> HUGETLB_MORECORE=yes LD_PRELOAD=libhugetlbfs.so ./hugetlb_ibtest 100
> 
> This testcase does the following steps (high level desc):
> 1. malloc two buffers each of 100MB for send and recv
> 2. register them as memory regions
> 3. create queue pair QP
> 4. send data in send buffer using QP to itself (target is then recv buffer)
> 5. compare those buffers content
> 
> It runs fine without libhugetlbsf. If you call it with libhugetlbfs as
> above, step 5 will fail. If you do memset() of the buffers before step 2
> (register mr), then it runs without errors.
> It appears that hugetlb_cow() is called when first write access is performed
> after mrs have been registered. That means the testcase is seeing other pages
> than the ones registered to the adapter...
> 
> I was able reproduce this with mthca on 2.6.23/ppc64 and fc6/intel.

We should cut this down to the bare necessary and fold it into the
libhugetlbfs testsuite.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



More information about the general mailing list