[ofiwg] Hugepages usage in libfabric
jswaro at cray.com
Wed Apr 10 07:36:53 PDT 2019
I’d be interested in exposing an environment variable, flag, or general tunable in libfabric for applications to indicate whether they want to use huge pages. Using the verbs provider on an internal development system, I’ve run into an issue where use of huge pages makes the memory registration function fail unless RDMAV_HUGEPAGES_SAFE is set. See https://lists.openfabrics.org/pipermail/ewg/2010-July/015609.html for context.
The general idea is a flag or environment variable that controls the common core or provider memory allocation routines for buffers that will be registered with the NIC. If set, it would indicate that the application intends to use huge pages, and the providers should take appropriate action. If not set, the providers could assume that huge pages won’t be used. I had considered reusing the RDMAV_HUGEPAGES_SAFE environment variable, but I did not want common core code to behave conditionally on a variable specific to verbs, especially since that variable could be set system-wide and so would not necessarily indicate the application’s intended use of huge pages.
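As a rough sketch of what such a knob could look like in an allocation routine (Linux only): the variable name EXAMPLE_FI_USE_HUGEPAGES and both helper functions below are hypothetical and do not exist in libfabric. The allocator asks for huge pages only when the knob is set, and falls back to normal pages if the huge page mapping fails.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical knob name; NOT an existing libfabric variable. */
#define EXAMPLE_HUGEPAGE_ENV "EXAMPLE_FI_USE_HUGEPAGES"

/* Returns 1 if the knob is set to anything other than "" or "0". */
static int example_hugepages_requested(void)
{
    const char *val = getenv(EXAMPLE_HUGEPAGE_ENV);

    return val != NULL && val[0] != '\0' && strcmp(val, "0") != 0;
}

/*
 * Allocates an anonymous buffer of len bytes. Huge pages are tried only
 * when the knob is set; otherwise, or if that mapping fails, normal pages
 * are used. *used_huge reports which kind of mapping was obtained.
 * Returns NULL if both attempts fail.
 */
static void *example_buf_alloc(size_t len, int *used_huge)
{
    void *buf = MAP_FAILED;

    *used_huge = 0;
    if (example_hugepages_requested()) {
        buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (buf != MAP_FAILED)
            *used_huge = 1;
    }
    if (buf == MAP_FAILED)
        buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return buf == MAP_FAILED ? NULL : buf;
}
```

A caller that gets a huge page mapping back would need to pass munmap() a length that is a multiple of the huge page size, so a real allocator would round len up before mapping.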
To me, it would be ideal if we could expose some knob that lets users indicate their expectations regarding the use of huge pages, regardless of what that knob is. The rationale is the non-zero cost of detecting page sizes at registration time, especially for providers operating under the FI_LOCAL_MR mode bit. Common core code used by a provider affects hugepage usage, and a tunable for enabling or disabling hugepage usage could be useful for diagnostics, debugging, and/or performance.
I’m not a huge fan of environment variables, but I do see a lack of a solution for libfabric to indicate hugepage usage. Maybe it’s not necessary, but I wanted to start the discussion to see what other ideas are out there.
For the verbs provider, I believe the simple first step is to set the environment variable to correct the failing behavior, and then determine how to handle huge page support for libfabric in general, if necessary. For verbs, use of the RDMAV_HUGEPAGES_SAFE variable tends to impact performance mostly once the rendezvous threshold is reached. Beyond that threshold, latency for data transfer operations appears to increase by a factor of ten.
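A minimal sketch of that first step, for the case where the variable is set from code rather than from the shell. The function name is hypothetical, and it assumes the call happens before libibverbs reads the variable (my understanding is that this happens when its fork support is initialized). Passing 0 as the overwrite argument leaves any value the user already exported untouched.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: sets RDMAV_HUGEPAGES_SAFE=1 unless the variable
 * is already present in the environment. Assumed to be called before
 * libibverbs is initialized. Returns 0 on success, -1 on error.
 */
static int example_enable_hugepages_safe(void)
{
    return setenv("RDMAV_HUGEPAGES_SAFE", "1", 0);
}
```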
Specifically, the registration failure occurs with verbs and rxm when rxm attempts to register a buffer for transfer using space from the local buffer pool. See https://github.com/ofiwg/libfabric/blob/master/prov/util/src/util_buf.c#L115 for the exact point of failure, with rxm_buf_reg as the function being called through that pointer.