[Ofmfwg] OFMF meeting agenda for 24 January 2025

Aguilar, Michael James mjaguil at sandia.gov
Thu Jan 23 15:12:51 PST 2025


Everyone

I am working on integrating Flux with Sunfish.  We have made some initial decisions.  We are going to avoid using jobtap scripts.  Instead, we plan to start with standard Flux Prolog/Epilog scripts.

Mike

So, we are:


  1.  Working on new Flux Prolog/Epilog scripts that take user-supplied parameters and aggregate either CXL fabric-attached memory (FAM) or NVMe-oF storage to running compute nodes.
  2.  Sunfish will prepare the memory or NVMe components, together with the fabric access, as resources provided to Flux.
  3.  Flux uses hwloc to ‘position’ components for selection.  Sunfish will use its Events framework to update Flux with available hardware information.
  4.  As Flux picks components for aggregation, Sunfish updates its tables and talks to the Sunfish Hardware Agents to carry out the selection.
  5.  Sunfish will then, in turn, notify Flux of aggregation success and provide updated information on the components.
  6.  We are discussing deploying Sunfish alongside Flux on each compute node, as Flux is already deployed and running.
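
To make the flow in the list above concrete, here is a rough Python sketch of the prolog/epilog hand-off.  It is only an illustration under stated assumptions: it does not call any real Flux or Sunfish API, and every name in it (MockSunfish, Component, prolog, epilog, the event tuples, the "smallest component that fits" policy) is hypothetical and made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Component:
    """A fabric-attached resource in the mock inventory (hypothetical model)."""
    ident: str
    kind: str                          # "cxl-fam" or "nvmeof"
    size_gib: int
    attached_to: Optional[str] = None  # compute node name, or None if free


@dataclass
class MockSunfish:
    """Stand-in for Sunfish: keeps a component table and records events."""
    components: List[Component]
    events: List[Tuple[str, str, str]] = field(default_factory=list)

    def available(self, kind: str) -> List[Component]:
        # Free components of the requested kind (step 3: hardware info for the scheduler).
        return [c for c in self.components
                if c.kind == kind and c.attached_to is None]

    def attach(self, ident: str, node: str) -> bool:
        # Steps 4 and 5: update the table, then report success or failure as an event.
        for c in self.components:
            if c.ident == ident and c.attached_to is None:
                c.attached_to = node
                self.events.append(("attached", ident, node))
                return True
        self.events.append(("attach-failed", ident, node))
        return False


def prolog(sunfish: MockSunfish, node: str, kind: str, min_gib: int) -> Optional[str]:
    """Hypothetical prolog step: attach the smallest free component that fits."""
    candidates = sorted(
        (c for c in sunfish.available(kind) if c.size_gib >= min_gib),
        key=lambda c: c.size_gib,
    )
    if not candidates:
        return None
    chosen = candidates[0]
    return chosen.ident if sunfish.attach(chosen.ident, node) else None


def epilog(sunfish: MockSunfish, node: str) -> List[str]:
    """Hypothetical epilog step: release everything attached to the node."""
    released = []
    for c in sunfish.components:
        if c.attached_to == node:
            c.attached_to = None
            sunfish.events.append(("released", c.ident, node))
            released.append(c.ident)
    return released


# Example run: a job on node01 asks for at least 128 GiB of CXL FAM.
sunfish = MockSunfish([
    Component("fam0", "cxl-fam", 64),
    Component("fam1", "cxl-fam", 256),
    Component("nvme0", "nvmeof", 1024),
])
picked = prolog(sunfish, "node01", "cxl-fam", 128)
```

In this sketch the prolog both selects and attaches; in the design described above, Flux does the selection and Sunfish does the attachment through its Hardware Agents, so the real split between the two will differ.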

Meeting Agenda:


  1.  H3 demonstration of CXL attachment of FAM using Sunfish, part 2
  2.  Further discussion of how Sunfish will provide Flux with updated hardware information
  3.  Further discussion of deployment of Sunfish and Flux across the compute nodes


