[ofa-general] Memory registration redux
Roland Dreier
rdreier at cisco.com
Tue Jun 2 09:44:59 PDT 2009
> Sorry, it's kind of difficult to deduce looking at this Q&A sequence
> what works how and when. Is it possible to create a brief and direct
> description of the proposed solution?
Did you see the original patch description I sent:
As discussed in <http://article.gmane.org/gmane.linux.drivers.openib/61925>
and follow-up messages, libraries using RDMA would like to track
precisely when application code changes memory mapping via free(),
munmap(), etc. Current pure-userspace solutions using malloc hooks
and other tricks are not robust, and the feeling among experts is that
the issue is unfixable without kernel help.

We solve this not by implementing the full API proposed in the email
linked above but rather with a simpler and more generic interface,
which may be useful in other contexts. Specifically, we implement a
new character device driver, ummunot, that creates a /dev/ummunot
node. A userspace process can open this node read-only and use the fd
as follows:
 1. ioctl() to register/unregister an address range to watch in the
    kernel (cf struct ummunot_register_ioctl in <linux/ummunot.h>).

 2. read() to retrieve events generated when a mapping in a watched
    address range is invalidated (cf struct ummunot_event in
    <linux/ummunot.h>).  select()/poll()/epoll() and SIGIO are handled
    for this IO.

 3. mmap() one page at offset 0 to map a kernel page that contains a
    generation counter that is incremented each time an event is
    generated.  This allows userspace to have a fast path that checks
    that no events have occurred without a system call.
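
To make the fast path in step 3 concrete, here is a rough userspace sketch. It only uses standard open()/mmap() calls and deliberately leaves out the ioctl() and read() steps, since their request codes and struct layouts come from <linux/ummunot.h> in the patch. The names map_generation_counter() and events_pending() are made up for illustration, and treating the counter as a 64-bit unsigned value at the start of the page is an assumption, not something stated above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* ASSUMPTION: the generation counter is a 64-bit unsigned value at the
 * start of the mapped page. The real type is whatever the patch defines. */
typedef volatile uint64_t gen_counter_t;

/* Hypothetical helper: open the node read-only and map one page at
 * offset 0. Returns NULL if the device is missing or mmap() fails. */
gen_counter_t *map_generation_counter(const char *path, int *fd_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return (gen_counter_t *)p;
}

/* Hypothetical helper for the fast path: compare the mapped counter with
 * the value seen at the previous check. Returns 0 if nothing changed (no
 * system call needed), 1 if events were generated since then, in which
 * case *last_seen is updated and the caller should read() the events. */
int events_pending(gen_counter_t *counter, uint64_t *last_seen)
{
    assert(counter != NULL && last_seen != NULL);

    uint64_t now = *counter;
    if (now == *last_seen)
        return 0;

    *last_seen = now;
    return 1;
}
```

A library would call events_pending() before reusing a cached registration, and only fall back to read() on the fd (step 2) when it returns 1.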