[ofiwg] fork support and MR cache

Wed Oct 7 22:17:51 PDT 2020

There have been extensive discussions on GitHub around the MR cache, deadlocks, libibverbs madvise tracking, and fork.  The current direction is to enable the MR cache only when fork support is disabled.  This was done to work around the internal libibverbs tracking.  But I suspect that bypassing that tracking (which is possible) can still lead to issues when registrations are made through the MR cache.

However, the only time that madvise(DONTFORK) *needs* to be called is:

- immediately prior to calling fork()
- only on memory registrations actively in use
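
For reference, here is a minimal sketch of the madvise() call being discussed, assuming Linux and a buffer that is not page aligned.  The helper names (advise_range, region_hide_from_child, region_restore_for_child) are made up for illustration and are not libfabric or libibverbs functions.  madvise() wants a page-aligned start address, so the range is widened to whole pages first.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: apply 'advice' to every page touched by
 * [addr, addr + len).  Returns 0 on success, -1 with errno set on failure. */
static int advise_range(void *addr, size_t len, int advice)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    uintptr_t start, end;

    /* The mask arithmetic below relies on a power-of-two page size. */
    assert(page != 0 && (page & (page - 1)) == 0);

    start = (uintptr_t) addr & ~(page - 1);
    end = ((uintptr_t) addr + len + page - 1) & ~(page - 1);
    return madvise((void *) start, end - start, advice);
}

/* Keep the pages out of any child created by a later fork(). */
int region_hide_from_child(void *addr, size_t len)
{
    return advise_range(addr, len, MADV_DONTFORK);
}

/* Undo the above so later children inherit the pages again. */
int region_restore_for_child(void *addr, size_t len)
{
    return advise_range(addr, len, MADV_DOFORK);
}
```

Note that the rounding means neighboring data sharing the first or last page is hidden from the child as well, which is part of why doing this only when it is really needed is attractive.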

Currently, if the app *might* call fork(), madvise() is called as part of every memory registration/deregistration.  This has a negative impact on performance.  If we can defer calling madvise() until it is needed, then enabling fork() support for all apps would be possible, without impacting apps that don't call it.  Additionally, even if apps call fork(), we may be able to avoid calling madvise() for every registration.

To do this, we need:

- the ability to intercept fork()
- the ability to call madvise() from the intercept routine

The first might be possible by using the memhooks mechanism.  I don't know if there would be an issue with the second.
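
As a point of comparison only, and not the memhooks approach mentioned above, the standard pthread_atfork() call also gives a hook that runs in the parent just before fork().  The sketch below is an assumption about how an intercept could look; the function names are hypothetical, and it only counts invocations where a real provider would walk its registrations.  It does not settle whether calling madvise() from such a hook is safe in every case, and it will not see processes created by a raw clone() system call.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of times the prepare hook has run; for illustration only. */
static int prepare_calls;

/* Runs in the parent, in the thread that called fork(), before the
 * child is created.  This is where madvise() on in-use registrations
 * would go in the scheme described above. */
static void before_fork(void)
{
    prepare_calls++;
}

/* Hypothetical setup routine: register the prepare hook only.
 * Returns 0 on success, an error number otherwise. */
int install_fork_hook(void)
{
    return pthread_atfork(before_fork, NULL, NULL);
}

/* Fork a child that exits immediately, wait for it, and report how
 * many times the hook has run in the parent.  Returns -1 if fork() fails. */
int fork_once_and_count(void)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(0);
    if (pid < 0)
        return -1;

    waitpid(pid, NULL, 0);
    return prepare_calls;
}
```

On older glibc versions this may need to be built with -pthread.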

Assuming the above works, when fork() is called, cached registrations not in use can simply be flushed.  Registrations with a use_cnt > 0 need madvise(DONTFORK) called on them.  Those registrations can be flushed once their use_cnt drops to 0, with madvise(DOFORK) called at that point so the memory is inherited by forked processes again.
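
The fork-time handling described above could be sketched roughly as follows.  This is a toy model, not the libfabric MR cache: the struct layout, every field other than use_cnt, and all function names are assumptions, and count_advise() is a stand-in for the real madvise() wrappers so the logic can be exercised without touching memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry. */
struct mr_entry {
    void *addr;
    size_t len;
    int use_cnt;    /* > 0 while something still references the registration */
    int dontfork;   /* 1 once the region has been hidden from children */
    int valid;      /* 0 once the entry has been flushed */
};

typedef int (*advise_fn)(void *addr, size_t len);

/* Stand-in for a madvise() wrapper: only counts how often it is called. */
static int advise_calls;
int count_advise(void *addr, size_t len)
{
    (void) addr;
    (void) len;
    advise_calls++;
    return 0;
}

/* Run just before fork(): flush idle entries, hide busy ones from the
 * child.  Returns the number of entries currently hidden. */
size_t cache_prepare_fork(struct mr_entry *e, size_t n, advise_fn hide)
{
    size_t hidden = 0;

    for (size_t i = 0; i < n; i++) {
        if (!e[i].valid)
            continue;
        if (e[i].use_cnt == 0) {
            e[i].valid = 0;             /* not in use: just flush it */
            continue;
        }
        if (!e[i].dontfork && hide(e[i].addr, e[i].len) == 0)
            e[i].dontfork = 1;
        if (e[i].dontfork)
            hidden++;
    }
    return hidden;
}

/* Drop one reference.  When the last user of a hidden entry goes away,
 * restore normal fork behavior for the region and flush the entry. */
void cache_release(struct mr_entry *e, advise_fn restore)
{
    assert(e->use_cnt > 0);
    if (--e->use_cnt == 0 && e->dontfork) {
        restore(e->addr, e->len);
        e->dontfork = 0;
        e->valid = 0;
    }
}
```

Entries that were never hidden stay cached after their last release, so apps that never fork pay nothing extra in this model.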

Without the ability to intercept fork(), I don't see a way to enable the MR cache and also support fork().  Caching a registration and marking the memory with madvise(DONTFORK) has the potential to hide data from the forked process that an application might expect to find.

- Sean