[Ofa_remotepm] Additional notes from remote PM thinktank

Byrne, John (Labs) john.l.byrne at hpe.com
Wed Apr 11 18:12:28 PDT 2018


After looking at Doug's version of the slides, I have a couple of random thoughts:

Storage Model: The local case had file and volume models. These are generally understood. What are the correct models for the remote case and what manages them?

Power: I mentioned this Monday night, but it doesn't seem to be in the notes. For massive amounts of PM, power becomes important for TCO, density, and power-budget reasons. For example, my understanding is that Exascale RFPs have a desired power target and a hard cap. So for the checkpoint case, RPM helps with both BW and meeting the overall power budget. This is not unique to the remote case, but it seems more likely to matter in RPM scale-out solutions.

Wear-leveling and error handling: Is this different in the remote case? It depends a lot on how it is done, which was out of scope in the local case, if I read correctly. However, there may be a requirement that an HCA inform both the local and remote sides of an operation when an error occurs.

That's all for now.

John Byrne

From: Ofa_remotepm [mailto:ofa_remotepm-bounces at lists.openfabrics.org] On Behalf Of Voigt, Doug
Sent: Tuesday, April 10, 2018 1:09 PM
To: ofa_remotepm at lists.openfabrics.org
Subject: [Ofa_remotepm] Additional notes from remote PM thinktank


I added slides 12-16 to the prior slide deck. My notes focused on use case and gap enumeration. There is some overlap with the other slides.

Doug


