[ofa-general] Memory registration redux
Supalov, Alexander
alexander.supalov at intel.com
Mon Jun 8 05:42:25 PDT 2009
Hi,
Intel MPI developers are in principle OK with this proposal. How is it expected to be delivered? Will this become a part of OFED or of the mainstream kernel? How fast will it spread? Are there any comparable Windows plans?
Best regards.
Alexander
-----Original Message-----
From: Supalov, Alexander
Sent: Wednesday, June 03, 2009 12:26 PM
To: 'Roland Dreier'
Cc: Jeff Squyres; Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General
Subject: RE: [ofa-general] Memory registration redux
Thanks. This is what I was looking for. Let me pass this by the key Intel MPI developers and get back to you.
-----Original Message-----
From: Roland Dreier [mailto:rdreier at cisco.com]
Sent: Tuesday, June 02, 2009 6:45 PM
To: Supalov, Alexander
Cc: Jeff Squyres; Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General
Subject: Re: [ofa-general] Memory registration redux
> Sorry, it's kind of difficult to deduce looking at this Q&A sequence
> what works how and when. Is it possible to create a brief and direct
> description of the proposed solution?
Did you see the original patch description I sent:
As discussed in <http://article.gmane.org/gmane.linux.drivers.openib/61925>
and follow-up messages, libraries using RDMA would like to track
precisely when application code changes memory mapping via free(),
munmap(), etc. Current pure-userspace solutions using malloc hooks
and other tricks are not robust, and the feeling among experts is that
the issue is unfixable without kernel help.
We solve this not by implementing the full API proposed in the email
linked above but rather with a simpler and more generic interface,
which may be useful in other contexts. Specifically, we implement a
new character device driver, ummunot, that creates a /dev/ummunot
node. A userspace process can open this node read-only and use the fd
as follows:
 1. ioctl() to register/unregister an address range to watch in the
    kernel (cf. struct ummunot_register_ioctl in <linux/ummunot.h>).
 2. read() to retrieve events generated when a mapping in a watched
    address range is invalidated (cf. struct ummunot_event in
    <linux/ummunot.h>). select()/poll()/epoll() and SIGIO are handled
    for this IO.
 3. mmap() one page at offset 0 to map a kernel page that contains a
    generation counter that is incremented each time an event is
    generated. This allows userspace to have a fast path that checks
    that no events have occurred without a system call.
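
The fast path in item 3 can be sketched roughly as follows. This is only an illustration of the idea, not code from the patch: struct reg_cache and the reg_cache_*() helpers are hypothetical names, the generation counter is a plain variable standing in for the page that would be mmap()ed from /dev/ummunot, and the slow path simply drops the cached registration where a real library would read() the events and invalidate only the affected ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registration cache guarded by a generation counter.
 * In real use, kernel_gen would point into the page mapped at offset 0
 * of the /dev/ummunot fd; here it points at an ordinary variable. */
struct reg_cache {
    const volatile uint64_t *kernel_gen; /* counter incremented per event */
    uint64_t seen_gen;                   /* counter value at last drain */
    bool valid;                          /* cached registration trusted? */
    unsigned slow_path_calls;            /* how often the slow path ran */
};

void reg_cache_init(struct reg_cache *c, const volatile uint64_t *gen)
{
    assert(c != NULL && gen != NULL);
    c->kernel_gen = gen;
    c->seen_gen = *gen;
    c->valid = false;
    c->slow_path_calls = 0;
}

/* Slow path (assumption): a real library would read() the pending events
 * from the fd here and invalidate only the ranges they name. This sketch
 * conservatively invalidates everything. */
static void reg_cache_drain_events(struct reg_cache *c, uint64_t now)
{
    c->valid = false;
    c->seen_gen = now;
    c->slow_path_calls++;
}

/* Record that a fresh registration has been made and cached. */
void reg_cache_store(struct reg_cache *c)
{
    c->valid = true;
}

/* Returns true if the cached registration may be reused. The common case
 * is a single memory load and compare, with no system call. */
bool reg_cache_lookup(struct reg_cache *c)
{
    uint64_t now = *c->kernel_gen;

    if (now != c->seen_gen)
        reg_cache_drain_events(c, now);
    return c->valid;
}
```

A caller would check reg_cache_lookup() before each RDMA operation and re-register the buffer (then call reg_cache_store()) only when it returns false.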