[Ofmfwg] Meeting notes for March 7, 2025

Lee, Peter peter.lee at necam.com
Fri Mar 7 09:03:25 PST 2025


  1.  H3 CXL and CXL Agent work
     *   No change in status since last week.
     *   Welly will share remote access information with Russ.
  2.  Sunfish documentation
     *   An official documentation update is needed, since it has been a while since the last one.
     *   Will figure out the new sections and assign out the work to do.
     *   Documentation is at https://github.com/OpenFabrics/sunfish_docs. If there is anything that does not look right, please add comments.
     *   Next version will be done in another branch.
  3.  Update to Sunfish Server Reference code
     *   No changes to the repo.
     *   Main core library changes have been pulled to the main branch.
     *   Use the rherrell_fix_8_9_10 branch to have the Sunfish reference code function as an API frontend for either the Sunfish service or an agent service. A README is still needed to explain how to use this code as a frontend for Sunfish or an agent and how to tie them all together. The OFMF server has a working setup under the Russ playpen. Ask Russ or Phil for help if you want to try it.
  4.  Quick report on using containers as endpoints for our JBODs (lack of information on RDMA builds; we can table it for now)
     *   Found lots of information from NVIDIA. Will continue to pursue this at Sandia, with help from a container expert.
  5.  Sunfish NVMeoF Agent deep-dive for requirements and specs

     *   Continue timing diagram for gathering NVMeoF endpoints for Sunfish CDI (diagrams attached in the PDF)
        *   Discussion of how NVMeoF works.
           *   Each NVMe server, which contains devices, has at least one DDC (Direct Discovery Controller) to expose the resources on each subsystem, whether it is an appliance, JBOD, or JBOF. The CDC (Centralized Discovery Controller) is a rack solution that looks at each entity that exposes NVMeoF and acts as a one-stop shop for hosts to get these NVMe resources.
           *   Each target has at least one DDC. When you go to the DDC, it returns a discovery log page. The discovery log page lists the subsystems that are available and their associated data.
           *   The CDC can be given a list of addresses for connecting to DDCs, and there is also a way to find those addresses automatically (using a protocol similar to the one used for discovering printers). The diagram shows DDCs contacting the CDC to say "here I am", but this is implementation specific.
        *   Whether or not a CDC is part of the NVMe agent is implementation specific. For this diagram, CDC is part of the NVMe agent.
        *   In the diagram, only the admin communicates with Sunfish. Communications between the admin and the Flux brokers are outside of Sunfish. The admin assigns resources to broker 0. Broker 0 then assigns resources to the other brokers, based on their job requirements. Any dynamic change to the resources available to broker 0 is handled between the admin and broker 0.
           *   Admin can be dumb (just a series of scripts) or smart.
        *   Mike will attempt to put this into code this weekend. Once we have the code, we can better see how things work.
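
A minimal sketch of the DDC/CDC relationship discussed above, as a plain in-memory model rather than a real NVMeoF implementation. All class names, method names, NQNs, and addresses here are hypothetical and chosen only for illustration. It assumes the variant shown in the diagram, where each DDC registers itself with the CDC.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    """One discovery log page entry (fields reduced for illustration)."""
    subsystem_nqn: str
    transport_address: str


@dataclass
class DDC:
    """Per-target discovery controller; knows only its own subsystems."""
    address: str
    entries: list = field(default_factory=list)

    def discovery_log_page(self):
        # Return a copy so callers cannot modify the DDC's own list.
        return list(self.entries)


@dataclass
class CDC:
    """Rack-wide discovery controller that aggregates the DDCs it knows about."""
    ddcs: dict = field(default_factory=dict)

    def register(self, ddc):
        # Models a DDC announcing itself ("here I am"). A configured
        # address list could end up calling the same method.
        self.ddcs[ddc.address] = ddc

    def discovery_log_page(self):
        # Merge every DDC's log page into one list for hosts to query.
        merged = []
        for address in sorted(self.ddcs):
            merged.extend(self.ddcs[address].discovery_log_page())
        return merged


def build_example_cdc():
    # Placeholder targets using documentation-only addresses and NQNs.
    jbof = DDC("192.0.2.11", [LogEntry("nqn.2025-03.example:jbof1", "192.0.2.11")])
    jbod = DDC("192.0.2.12", [LogEntry("nqn.2025-03.example:jbod1", "192.0.2.12"),
                               LogEntry("nqn.2025-03.example:jbod2", "192.0.2.12")])
    cdc = CDC()
    cdc.register(jbof)
    cdc.register(jbod)
    return cdc


if __name__ == "__main__":
    for entry in build_example_cdc().discovery_log_page():
        print(entry.subsystem_nqn, entry.transport_address)
```

In this model a host only ever queries the CDC, which is the "one-stop shop" role described in the notes. Whether the CDC lives inside the NVMe agent is left open, as in the discussion.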
  6.  Flux Prolog/Epilog final requirements and specs

     *   The initial implementation of the Flux Broker Prolog/Epilog only needs to request an address to connect to and disconnect from.
        *   We know the connect and disconnect scheme. Do the connection, build the burst buffer, tear it down with the epilog, and then disconnect.
        *   Should have this available by the end of the month.
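
The connect and disconnect scheme above can be sketched as follows. This is an illustration of the ordering only, not the planned Flux implementation: `FakeFabric`, `prolog`, `epilog`, `run_job`, and the placeholder address are all hypothetical, and no real Flux or NVMe calls are made.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFabric:
    """Stand-in for whatever service hands out target addresses (hypothetical)."""
    address: str = "192.0.2.10:4420"  # placeholder address, not a real target
    connected: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def request_address(self, job_id):
        self.log.append("request address")
        return self.address

    def connect(self, address):
        self.connected.add(address)
        self.log.append("connect")

    def disconnect(self, address):
        self.connected.discard(address)
        self.log.append("disconnect")


def prolog(fabric, job_id):
    # Request an address, connect to it, then build the burst buffer.
    address = fabric.request_address(job_id)
    fabric.connect(address)
    fabric.log.append("build burst buffer")
    return address


def epilog(fabric, address):
    # Tear down the burst buffer first, then disconnect.
    fabric.log.append("tear down burst buffer")
    fabric.disconnect(address)


def run_job(fabric, job_id):
    address = prolog(fabric, job_id)
    try:
        fabric.log.append("run job")
    finally:
        # The epilog runs even if the job step fails.
        epilog(fabric, address)
    return fabric.log


if __name__ == "__main__":
    for step in run_job(FakeFabric(), job_id=1):
        print(step)
```

Running the epilog in a `finally` block is a design choice made for this sketch so that a failed job still releases its connection.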
  7.  V1.0 target
     *   Goal is to have v1.0 by the end of the year.
     *   With a target application and use case (Flux with NVMeoF), we can move quickly and apply what we learn to the CXL agent.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Flow.pdf
Type: application/pdf
Size: 217366 bytes
Desc: Flow.pdf
URL: <http://lists.openfabrics.org/pipermail/ofmfwg/attachments/20250307/a2ba3bd1/attachment-0001.pdf>

