[ofa-general] ibv_post_send fails when using malloc in a special way

Asmund Ostvold aostvold at platform.com
Thu Dec 18 04:41:43 PST 2008


Roland,

Roland Dreier wrote:

> I'm not sure how madvise() would have any relevance to your problem,
> since as far as I can see you are not using fork().  In any case,
> libibverbs will only call madvise() if you call ibv_fork_init() or set
> the IBV_FORK_SAFE environment variable.

madvise() is probably not relevant. However, we _do_ see calls to
madvise() in the enclosed program, even though IBV_FORK_SAFE is not set,
the program does not call fork(), and I assume libibverbs does not
either. Snippet from ltrace:

free(0x2ae80c0a1000 <unfinished ...>
SYS_madvise(0x2ae80c0a9000, ...) = 0
<... free resumed> ) = <void>

We assume that the virtual-to-physical mapping of a region that has
been initialized (initial page fault) and registered with ibv_reg_mr()
can only change through 1) a negative sbrk() or 2) munmap()/mremap().
Neither of these happens in the enclosed program. If our assumption is
correct, we still claim the program demonstrates a bug, since the
sender RDMAs incorrect data.

PS: If you remove the huge failing allocation, the following call:

    (void) mallopt(M_TRIM_THRESHOLD, -1);


must be inserted at the top of main() to avoid 1) above. With both
changes (removing the huge malloc() and inserting the mallopt()), the
program works as expected.
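
For reference, a minimal sketch of the mallopt() workaround described
above, assuming glibc (mallopt() and M_TRIM_THRESHOLD are declared in
<malloc.h>). The helper name disable_heap_trimming() is made up for
illustration and is not part of the enclosed program:

```c
/* Sketch only: glibc-specific, needs <malloc.h>. */
#include <malloc.h>

/*
 * Hypothetical helper (not from the enclosed program).
 * Asks glibc not to trim the top of the main heap, so that free()
 * does not hand memory back to the kernel via a negative sbrk().
 * Returns 1 on success and 0 on error, exactly as mallopt() does.
 */
int disable_heap_trimming(void)
{
    return mallopt(M_TRIM_THRESHOLD, -1);
}
```

It would be called as the first statement of main(), before any buffer
that is later passed to ibv_reg_mr() is allocated. It only addresses
case 1) above; it does nothing about munmap()/mremap().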



- Asmund


