[ofa-general] Problem with libibverbs and huge pages registration.

Gleb Natapov glebn at voltaire.com
Tue Apr 22 04:14:13 PDT 2008


On Mon, Apr 21, 2008 at 02:53:51PM -0700, Roland Dreier wrote:
>  >    ibv_reg_mr() fails if I try to register a memory region backed by a
>  > huge page, but is not aligned to huge page boundary. Digging deeper I
>  > see that libibverbs aligns memory region to a regular page size and
>  > calls madvise() and the call fails. See program below to reproduce.
>  > The program assumes that hugetlbfs is mounted on /huge and there is at
>  > least one huge page available. I am not sure it is possible to know if a
>  > memory buffer is backed by a huge page to solve the problem.
> 
> Hmm, not sure off the top of my head how we should deal with this.
Me too :(

> 
>  > Another issue with libibverbs is that after the first ibv_reg_mr() fails, the
>  > second registration attempt of the same buffer succeeds, since
>  > ibv_madvise_range() doesn't clean up after the madvise() failure and thinks
>  > that the memory is already "madvised".
> 
> I guess we shouldn't change the refcnt until after we know if madvise
> has succeeded or not.  Does the patch below help?  I'm not sure if this
> is a good enough fix -- we might have split up a node and want to
> remerge it if the madvise fails... rolling back is a little tricky... I
> think this will take a little more thought.
> 
>  - R.
> 
> --- a/src/memory.c
> +++ b/src/memory.c
> @@ -506,8 +506,6 @@ static int ibv_madvise_range(void *base, size_t size, int advice)
>  			__mm_add(tmp);
>  		}
>  
> -		node->refcnt += inc;
> -
I suppose the "if" below depends on the already updated refcnt, so the
update can't be moved down without also changing the "if" condition.

>  		if ((inc == -1 && node->refcnt == 0) ||
>  		    (inc ==  1 && node->refcnt == 1)) {
>  			/*
> @@ -532,6 +530,8 @@ static int ibv_madvise_range(void *base, size_t size, int advice)
>  				goto out;
>  		}
>  
> +		node->refcnt += inc;
> +
>  		node = __mm_next(node);
>  	}
>  
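One way to keep the test and still defer the update might be to evaluate
the "if" against the value refcnt would have after the update, and store
that value only once madvise() has succeeded. A rough sketch, not the real
memory.c code: struct range_node, update_refcnt() and the advise callback
are made-up names standing in for the node and the madvise() call, and
node splitting/re-merging is not handled at all.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-range node in src/memory.c;
 * only the reference count is modelled here. */
struct range_node {
	int refcnt;
};

/* Hypothetical callback standing in for the madvise() call;
 * returns 0 on success, non-zero on failure. */
typedef int (*advise_fn)(struct range_node *node);

static int update_refcnt(struct range_node *node, int inc, advise_fn advise)
{
	/* Value refcnt would have if this update goes through. */
	int new_refcnt = node->refcnt + inc;

	/*
	 * Same transitions as the original test (last user gone, or
	 * first user arriving), but checked on the prospective value.
	 */
	if ((inc == -1 && new_refcnt == 0) ||
	    (inc ==  1 && new_refcnt == 1)) {
		int ret = advise(node);
		if (ret)
			return ret;	/* refcnt untouched, a retry sees the same state */
	}

	/* Commit only after the advice call succeeded or was not needed. */
	node->refcnt = new_refcnt;
	return 0;
}

/* Trivial callbacks for trying the function out. */
static int advise_ok(struct range_node *node)   { (void) node; return 0; }
static int advise_fail(struct range_node *node) { (void) node; return -1; }
```

With this ordering a failed first registration leaves refcnt at 0, so the
second attempt calls madvise() again instead of assuming the range is
already "madvised".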

--
			Gleb.


