[openib-general] Re: Question about pinning memory

Mon Jul 25 08:27:43 PDT 2005

>> > Having said that - do you have some benchmarks to prove that
>> > pinning is that expensive? Are you sure that all these internal
>> > table lookups aren't even slower?
>> > I can't promise, but given a microbenchmark, maybe something could
>> > be done to improve the speed of this operation.
>>
>> In my experience a cache hit in MVAPICH improves registration by a
>> factor of two.
>>
>But that's VAPI based, not gen2 based, isn't it?
>
What is the difference? Does gen2 do magic?

>> An internal table lookup is a search in a balanced tree. Hardware
>> memory registration is a system call + writes on the PCI bus + a
>> search in a balanced tree in get_user_pages,
>
>I can't tell without seeing actual benchmarks.
>Certainly tricks like overriding malloc/free operations can have
>an impact on overall application performance.
There is no need to override malloc/free to do registration caching.
And here are the benchmark results for MVAPICH over gen2. This is how
many cycles it takes to send a buffer of 100000 bytes:
cache miss: 536388
cache hit:  9740

I used MPI_Isend for the test.
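
To make the comparison concrete, here is a minimal sketch of the kind of
lookup a registration cache does. It is not MVAPICH's code. The names
RegistrationCache, lookup and fake_register are hypothetical, a sorted
list with binary search stands in for the balanced tree, and
invalidating entries when memory is freed is left out entirely.

```python
import bisect


class RegistrationCache:
    """Toy cache mapping [start, end) address ranges to handles."""

    def __init__(self, register):
        # 'register' is a hypothetical stand-in for the expensive path
        # (system call, PCI writes, page pinning).
        self._register = register
        self._starts = []    # sorted region start addresses
        self._entries = []   # (start, end, handle), parallel to _starts
        self.hits = 0
        self.misses = 0

    def lookup(self, addr, length):
        end = addr + length
        # Position just past the last region starting at or before addr.
        pos = bisect.bisect_right(self._starts, addr)
        if pos > 0:
            _start, region_end, handle = self._entries[pos - 1]
            if end <= region_end:
                # The cached region covers the whole buffer: cheap path.
                self.hits += 1
                return handle
        # Not covered by that region: take the expensive path and
        # remember the result. (A toy simplification: an earlier, larger
        # region that also covers the buffer is not searched for.)
        self.misses += 1
        handle = self._register(addr, length)
        self._starts.insert(pos, addr)
        self._entries.insert(pos, (addr, end, handle))
        return handle


# Usage: the second send of the same buffer skips registration.
registered = []

def fake_register(addr, length):
    registered.append((addr, length))
    return len(registered)          # made-up handle value

cache = RegistrationCache(fake_register)
first = cache.lookup(0x1000, 100000)    # miss, registers the buffer
second = cache.lookup(0x1000, 100000)   # hit, returns the same handle
```

A hit costs one binary search in user space, while a miss still pays for
the full registration on top of the search, which is the asymmetry the
numbers above show.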

>
>> how can you improve this?
>
>I have some optimization ideas in mind - clearly even with a very smart
>registration cache there may be cache misses, so if a problem exists,
>any optimization should be beneficial.

It is always good to improve the worst case.

--
			Gleb.