[ofa-general] RE: New proposal for memory management

Woodruff, Robert J robert.j.woodruff at intel.com
Wed Apr 29 10:56:27 PDT 2009


Jeff wrote,
>Is anyone going to comment on this?  I'm surprised / disappointed that
>it's been over 2 weeks with *no* comments.

>Roland can't lead *every* discussion...

Having a memory registration cache in the kernel seems like a bad idea to me.
It will likely be a lot of code that is very complicated and prone to bugs
that are not easy to find or fix. In the past, proposals to cache
things like SA records (i.e., the local SA cache) have been rejected, and
to me this seems like a similar type of request.

In general if something can be done in user-space rather than the kernel,
I think it should be done in user-space. MPIs today are clearly able to
implement this type of caching in user-space. Rather than dump a whole bunch
of new code into the kernel, why not make it a user-space library instead?
If libc needs changes to allow additional hooking of things like malloc/free,
then work with the libc maintainer to get those hooks into libc. I think there
is already a standard way to do this for libc malloc/free. 
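
To make the user-space option concrete, here is a minimal sketch of a
registration cache kept entirely in a library. Everything in it is an
assumption for illustration: fake_register()/fake_deregister() are
hypothetical stand-ins for the real pin-and-register calls, the fixed-size
table and round-robin eviction are arbitrary choices, and cache_invalidate()
is what a free()/munmap() hook would be expected to call. It is not how any
particular MPI does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8

/* One cached registration: the byte range it covers and its handle. */
struct reg_entry {
    uintptr_t start;
    size_t    len;
    int       handle;
    int       valid;
};

static struct reg_entry cache[CACHE_SLOTS];
static int next_handle = 1;
static int next_victim;      /* round-robin eviction cursor */
static int registrations;    /* how many times we really registered */

/* Hypothetical stand-ins for the real (expensive) pin-and-register calls. */
static int fake_register(void *addr, size_t len)
{
    (void)addr; (void)len;
    registrations++;
    return next_handle++;
}

static void fake_deregister(int handle) { (void)handle; }

/* Return a handle covering [addr, addr+len), registering only on a miss. */
int cache_register(void *addr, size_t len)
{
    uintptr_t a = (uintptr_t)addr;
    int slot = -1;

    assert(len > 0);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &cache[i];
        if (!e->valid) {
            if (slot < 0)
                slot = i;            /* remember the first free slot */
            continue;
        }
        /* Hit: the requested range lies inside a cached registration. */
        if (a >= e->start && a - e->start < e->len &&
            len <= e->len - (a - e->start))
            return e->handle;
    }
    if (slot < 0) {                  /* cache full: evict round-robin */
        slot = next_victim;
        next_victim = (next_victim + 1) % CACHE_SLOTS;
        fake_deregister(cache[slot].handle);
    }
    cache[slot] = (struct reg_entry){ a, len, fake_register(addr, len), 1 };
    return cache[slot].handle;
}

/* Call from a free()/munmap() hook: drop entries overlapping the range. */
void cache_invalidate(void *addr, size_t len)
{
    uintptr_t a = (uintptr_t)addr;

    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &cache[i];
        if (e->valid && a < e->start + e->len && e->start < a + len) {
            fake_deregister(e->handle);
            e->valid = 0;
        }
    }
}
```

A second cache_register() call on a sub-range of an already registered
buffer returns the same handle without registering again. The hard part in
practice is making sure cache_invalidate() is called whenever the memory is
returned to the system, which is where the libc hooks come in.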

As for the automatic registration/deregistration, I do not think you really
want this either. If it requires dynamically paging in and locking pages that
are not already resident or locked, it will lead to severe run-to-run
variability in job performance, depending on system load and such.
For example, if one node has to wait for a page to be paged in so that it can
be locked and registered, it can delay all of the nodes in the cluster that
are waiting for that node to, say, respond to a collective operation, thus
slowing down the whole job.
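
The effect on a collective can be sketched in a few lines, assuming a
barrier-style operation that completes only when the last node arrives.
The function name and the numbers in the note below are hypothetical and
only illustrate the shape of the problem.

```c
#include <assert.h>
#include <stddef.h>

/* Time at which a barrier-style collective completes: the latest arrival. */
double collective_done(const double *arrival, size_t n)
{
    assert(n > 0);
    double latest = arrival[0];
    for (size_t i = 1; i < n; i++)
        if (arrival[i] > latest)
            latest = arrival[i];
    return latest;
}
```

With three nodes arriving at 1 ms and a fourth arriving at 51 ms because it
had to page in and lock a buffer first, the collective completes at 51 ms
for every node, not just the slow one.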

my 2 cents,

woody



