[openib-general] RFD: uverbs and hotplug
Caitlin Bestler
caitlin.bestler at gmail.com
Tue May 3 09:42:09 PDT 2005
The fundamental question here is what degree of support is desired?
Is it to ensure that the OS can force a memory region to be destroyed
on demand, and it is then up to the application to figure out how to
respond?
Or is it to allow the OS to change the meaning of a Memory Region
so that it can re-arrange memory for infrequent maintenance events
related to hotplug memory?
When a user creates a memory region for a portion of its virtual
memory space it is expecting two things:
a) If it requested remote accessibility then it can advertise buffers
with this Memory Region's R-Key in good faith to its peers. It
is the equivalent of sending a certified check. You do not need
to know the serial number of the currency that the bank will
give when the check is cashed, but you are guaranteed that
the cash will be there. That is the point of getting the check
certified.
Similarly, registering memory with remote access indicates a
requirement by the user that the registered memory *will* be
available for remote access on demand, and the user can
relay this promise (in the form of a buffer advertisement) in
good faith.
b) That the RDMA device will translate Target Offsets and this R-Key
or L-Key in a manner consistent with the user's virtual memory
map. The RDMA Device's Memory Map must never be inconsistent
with the kernel's memory map.
Hotplug relates directly to point b. If the host needs to move the
mapping of some virtual pages to new physical pages it must do
so in a way that the RDMA Device's mapping cannot be out of
sync with the true mapping. Ultimately there are only two ways
to do this:
a) Eliminate the Memory Region.
b) Update the Memory Region.
Eliminating the Memory Region requires an interface to immediately
destroy a Memory Region, and then provide all of the required notifications
to the user application. Such an interface is simple, but will result in most
applications terminating. Most applications are not written to be able to
survive a sudden loss of virtually all of their connections, and then resume
all of the suspended sessions with new connections using new memory
regions.
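
To make the cost of the eliminate option concrete, here is a minimal
sketch in Python. It is a toy model, not any real verbs or kernel
interface: the Device class, force_destroy, and the callback hook are
all hypothetical names invented for illustration. It shows only that
a forced destroy invalidates the advertised R-Key, and that the
application must register again and re-advertise a new key.

```python
import itertools


class Device:
    """Toy model of an RDMA device's table of memory regions (hypothetical)."""

    def __init__(self):
        self._regions = {}                    # rkey -> {virtual page: physical page}
        self._next_rkey = itertools.count(1)  # illustrative R-Key allocator
        self._listeners = []                  # application notification callbacks

    def register(self, page_map):
        """Create a memory region and return its R-Key."""
        rkey = next(self._next_rkey)
        self._regions[rkey] = dict(page_map)
        return rkey

    def on_region_destroyed(self, callback):
        """Let the application ask to be told when a region is destroyed."""
        self._listeners.append(callback)

    def force_destroy(self, rkey):
        """OS-initiated destroy: drop the region, then notify the application."""
        del self._regions[rkey]
        for callback in self._listeners:
            callback(rkey)

    def remote_access(self, rkey, vpage):
        """Translate a remote access; a destroyed region's R-Key is rejected."""
        if rkey not in self._regions:
            raise PermissionError("stale R-Key: memory region was destroyed")
        return self._regions[rkey][vpage]
```

In this model, every buffer advertisement a peer still holds becomes
invalid the moment force_destroy runs, which is why the application
needs its own session-level recovery.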
Updating the Memory Region requires an interface that can suspend
use of the Memory Region while the mapping is updated. After the update
has completed the Memory Region can be resumed. While the Memory
Region is suspended the RDMA Device will not use it to access memory,
and will do what it can to keep the connections alive. InfiniBand has an
advantage here, in that RNR (Receiver Not Ready) NAKs can do a better job
of this than iWARP can do over TCP. But in either case, the expectation
would be that the
OS would suspend a memory region, create the new mapping, do any
required copying, and then promptly resume the Memory Region.
This strategy is currently being discussed in RNIC-PI.
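
The suspend/update/resume sequence can be sketched the same way. This
is a toy Python model under stated assumptions, not the RNIC-PI
interface: MemoryRegion, migrate_page, and the RetryLater exception
(standing in for an RNR-style "try again" response) are hypothetical
names. The point it illustrates is that the R-Key stays valid across
the migration while the physical mapping changes underneath it.

```python
from enum import Enum


class MRState(Enum):
    ACTIVE = "active"
    SUSPENDED = "suspended"


class RetryLater(Exception):
    """Hypothetical stand-in for a 'receiver not ready' style response."""


class MemoryRegion:
    """Toy memory region: an R-Key plus a virtual-to-physical page map."""

    def __init__(self, rkey, page_map):
        self.rkey = rkey
        self.page_map = dict(page_map)  # virtual page -> physical page
        self.state = MRState.ACTIVE

    def translate(self, vpage):
        """Device-side lookup; refuses to touch memory while suspended."""
        if self.state is MRState.SUSPENDED:
            raise RetryLater("memory region is suspended")
        return self.page_map[vpage]


def migrate_page(mr, physical_memory, vpage, new_ppage):
    """Suspend the region, copy the page, update the mapping, then resume."""
    mr.state = MRState.SUSPENDED
    try:
        old_ppage = mr.page_map[vpage]
        physical_memory[new_ppage] = physical_memory[old_ppage]  # required copying
        mr.page_map[vpage] = new_ppage                           # new mapping
    finally:
        mr.state = MRState.ACTIVE                                # prompt resume
```

The peer never sees a new R-Key here; the only visible effect is a
short window in which accesses are deferred rather than serviced.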
But which option Linux should use is a separate decision. The eliminate
option requires simpler logic within the kernel, but forces applications
to create
session layer persistence features. The suspend/resume option allows
physical memory migration transparently to the application, but requires
more co-operation between the OS and RDMA device drivers.
Caitlin Bestler
Director Software Architecture
Siliquent Technologies
caitlinb at siliquent.com
On 5/3/05, Michael S. Tsirkin <mst at mellanox.co.il> wrote:
> Hello, Roland, all!
> How should hotplug work with uverbs?
>
> Currently ib_uverbs_remove_one simply does class_device_unregister.
> What will happen if an application has the device open?
>
> Ideally, once the device is removed, applications
> get an error code and so can close the device freeing
> up any resources, and reopen another one.
> However, with libmthca we have a page remapped into
> the application memory and requests are posted there.
>
> One idea is to replace the hardware page mapped into the
> application with a map to zero page. This prevents
> the application from accessing hardware that's not there
> anymore, but does nothing to notify the application
> that it needs to take action.
>
> What do people think?
> Thanks,
>
> --
> MST - Michael S. Tsirkin
> _______________________________________________
> openib-general mailing list
> openib-general at openib.org
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>