[openib-general] Question about pinning memory

Mon Jul 25 08:37:29 PDT 2005

glebn at voltaire.com wrote on Mon, 25 Jul 2005 17:37 +0300:
> On Mon, Jul 25, 2005 at 07:16:48AM -0700, Roland Dreier wrote:
> > How much of a performance hit is this system call?  It seems that you
> > need to do the call before every single work request is posted, which
> > could cause latency to go way up.
>
> I think the better approach would be to send signal to process whenever
> vma list changes and let application verify whether cache entries
> should be freed.

The system call is a nonblocking read() that returns 0 bytes when
there is no VM activity to report.
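
A minimal sketch of what that check might look like from the caller's side, not the actual monitor interface. The helper names vm_monitor_poll() and set_nonblocking() are hypothetical. A pipe can stand in for the monitor fd when experimenting, but an empty nonblocking pipe fails with EAGAIN rather than returning 0, so the sketch treats both as "nothing to report":

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into nonblocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: poll the monitor fd once before posting a work
 * request.  Returns 1 if events were read (*nread bytes are in buf),
 * 0 if there was nothing to report, -1 on any other error.
 */
static int vm_monitor_poll(int fd, void *buf, size_t len, ssize_t *nread)
{
    assert(buf != NULL && nread != NULL);

    ssize_t n = read(fd, buf, len);
    if (n > 0) {
        *nread = n;
        return 1;
    }
    *nread = 0;
    if (n == 0)
        return 0;   /* the "0 bytes, no VM activity" case */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;   /* what an empty nonblocking pipe reports instead */
    return -1;      /* includes EINTR; a real caller might retry */
}
```

The caller would flush the affected cache entries only when this returns 1, so the common case costs one read() per posted request.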

On my crusty old 1.5 GHz Athlon, lmbench says:

    Simple syscall: 0.2337 microseconds
    Simple read: 0.3681 microseconds
    Signal handler overhead: 1.680 microseconds

I agree that the cost of kernel->user notification should scale with
VM activity and not message passing activity.  One way I tried was
to run a separate thread to block on the read() from the VM monitor
while the main process grabs an uncontended mutex around each cache
lookup instead of doing the read() itself.  This has a slight
scheduling race though.
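
Roughly, the threaded variant has this shape. This is a sketch under assumptions, not the code in question: struct reg_cache, monitor_thread() and cache_lookup() are made-up names, and the whole cache is reduced to a single "valid" flag. The race sits in the window between the VM change and the monitor thread getting scheduled to take the lock:

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical registration cache, reduced to one validity flag. */
struct reg_cache {
    pthread_mutex_t lock;
    int monitor_fd;          /* blocking fd standing in for the VM monitor */
    int valid;               /* 1 while cached registrations may be used */
    unsigned long flushes;   /* number of invalidations seen so far */
};

/* Monitor thread: sleeps in read() and invalidates the cache per event. */
static void *monitor_thread(void *arg)
{
    struct reg_cache *c = arg;
    char buf[64];

    /* Exits on EOF or on any error, including EINTR (sketch only). */
    while (read(c->monitor_fd, buf, sizeof buf) > 0) {
        /* A lookup that runs before this thread gets here still sees
         * valid == 1: that is the scheduling race. */
        pthread_mutex_lock(&c->lock);
        c->valid = 0;
        c->flushes++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Main path: returns 1 on a cache hit, 0 if the caller must re-register. */
static int cache_lookup(struct reg_cache *c)
{
    assert(c != NULL);

    pthread_mutex_lock(&c->lock);   /* uncontended in the common case */
    int hit = c->valid;
    c->valid = 1;   /* the caller is expected to re-register after a miss */
    pthread_mutex_unlock(&c->lock);
    return hit;
}
```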

I like your signal idea but am not fond of the signal mechanism,
especially as we (the library) can't trust the user not to break it.
Signaling a separate thread has the same race problem.  I'd prefer
to export a counter from the kernel that the user application can
see.  The kernel would atomically increment it when there is an
unread event, and the app can check the variable to decide if the
read() is necessary.

Give me some time to find the code and see if it still compiles
against recent 2.6.

		-- Pete