[ofiwg] OFIWG 9/03/2024 Minutes

Tue Sep 3 18:59:56 PDT 2024

09/03/2024

Participants:
Alexia Ingerson (Intel)
Jianxin Xiong (Intel)
Ben Lynam (Cornelis)
Bob Cernohous (Cornelis)
Charles Sherada (Cornelis)
Chuck Fossen
Ian Ziemba (HPE)
Jack Morrison (Cornelis)
Jerome Soumagne
Jessie Yang (AWS)
John Byrne (HPE)
Ken Raffenetti (ANL)
Peinan Zhang (Intel)
Rajalaxmi (Intel)
Shi Jin (AWS)
Stephen Oost (Intel)
Steven Welch (HPE)
Zach Dworkin (Intel)

Summary:
Libfabric 2.0.0-alpha was released on 8/30 and includes all deprecations and a new provider (LPP). Beta release is targeted for 2 months from now (end of October). Please check new alpha release for any issues, especially compatibility issues for building. No new 1.x branches are expected though 1.x.y releases may occur as needed.
AWS presented on proposals to expand FI_HMEM interface capabilities. AWS, Intel, and HPE agree that adding fi_info->hmem_attr would be the best solution. AWS will draft up a more detailed proposal for this which includes adding features such as which interfaces support p2p (NIC to GPU) or optimized device memcopies.
Cornelis proposed adding an opx-specific yaml file to trigger their internal testing. There were no objections.

Notes:
Libfabric 2.0.0-alpha released 8/30

  *   Deprecations
  *   New LPP provider
Please check new alpha release for any issues, especially compatibility issues for building (added warnings for deprecated features) as well as running.
Beta release planned for 2 months from now (end of October)

  *   Expected all new features will make it into beta release
Q: What's the plan if we need a new 1.x release? Can we create a new 1.x branch?
A: We can still do 1.x.y releases for bug fixes but shouldn't need new minor release
NCCL plugin currently cannot build against upstream; may need a fix for the build issue.
Q: Are deprecated features going to be removed for official 2.0 release?
A: Will stick around for about a year before removal. Will take time for middlewares to officially remove use of deprecated features.

HMEM capability refinement (AWS presentation - Jessie, Shi)
Currently FI_HMEM is on/off capability which represents many interfaces
Proposal to introduce more capabilities to manage more specific FI_HMEM abilities
Add struct fi_hmem_attr *hmem_attr to info to return specific set of capabilities
struct fi_hmem_attr {
    enum fi_hmem_iface iface;
    bool use_p2p;
    bool use_dev_reg_copy;
    bool api_permitted;
    struct fi_hmem_attr *next;
};
iface: memory interface type (CUDA, ROCR, ZE)
use_p2p (accelerator to NIC p2p, not acc-acc p2p): whether peer to peer transfers should be used; can filter out shm and allows apps to specify p2p early
use_dev_reg_copy: whether to use optimized memcpy for dev memory (i.e. GDR)
api_permitted: whether dev specific API call is allowed, prevents unsafe operations and resource management conflicts
next: pointer to next hmem_attr if using multiple non-system ifaces

Struct will be both input and output - app can set fields to request settings, and the fields are used to filter and select certain ifaces, etc
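A minimal sketch of how the input/output use described above might look. Everything here is an assumption for illustration: the sketch_* names are hypothetical local stand-ins (so the code builds without libfabric headers), the struct only mirrors the proposed fi_hmem_attr from these notes, and the matching rule (a field set to true in the hint is a requirement, false means "don't care") is one possible reading, since the input/output semantics were still under discussion. api_permitted is left out of the matching because it was flagged as needing to be revisited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for enum fi_hmem_iface (not the libfabric definition). */
enum sketch_hmem_iface {
    SKETCH_HMEM_SYSTEM,
    SKETCH_HMEM_CUDA,
    SKETCH_HMEM_ROCR,
    SKETCH_HMEM_ZE
};

/* Mirrors the proposed struct fi_hmem_attr from the notes; not a released API. */
struct sketch_hmem_attr {
    enum sketch_hmem_iface iface;
    bool use_p2p;
    bool use_dev_reg_copy;
    bool api_permitted;
    struct sketch_hmem_attr *next;
};

/*
 * Assumed semantics: the offered entry satisfies the hint when the iface
 * matches and every capability requested (true) in the hint is offered.
 */
bool sketch_satisfies(const struct sketch_hmem_attr *hint,
                      const struct sketch_hmem_attr *offered)
{
    assert(hint != NULL && offered != NULL);
    if (hint->iface != offered->iface)
        return false;
    if (hint->use_p2p && !offered->use_p2p)
        return false;
    if (hint->use_dev_reg_copy && !offered->use_dev_reg_copy)
        return false;
    return true;
}

/*
 * Walk the provider's chain via the next pointers and return the first
 * entry that satisfies the hint, or NULL when nothing matches.
 */
const struct sketch_hmem_attr *
sketch_select(const struct sketch_hmem_attr *hint,
              const struct sketch_hmem_attr *offered_head)
{
    assert(hint != NULL);
    for (const struct sketch_hmem_attr *cur = offered_head; cur; cur = cur->next) {
        if (sketch_satisfies(hint, cur))
            return cur;
    }
    return NULL;
}
```

With a provider chain of CUDA (p2p offered) followed by ZE (no p2p), a hint asking for CUDA with use_p2p would select the CUDA entry, while a hint asking for ZE with use_p2p would get NULL back.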
Q: Is it hard for a provider to onboard this new API because it's not using common code?
A: Could do this either way. Most of the information is in common code. If a provider supports certain attributes, it should go through common code. use_p2p could be the only field not in common code.
Q: use_p2p should be clearer. Does it mean the provider should use p2p or is allowed to use p2p?
A: It should follow the current common fi_use_p2p settings (preferred or disabled), each of which already has defined behavior.
api_permitted is definitely an input-only value, which is a weird use case for an fi_info attr. Very unclear - the provider has to use the API to use the accelerator library. Even FI_HMEM initialization uses API code. The main use case for api_permitted is MR registration and data transfer calls. Need to revisit this option.

Alternative: add fields to fi_mr_attr on memory registration path
struct fi_mr_attr {
    //existing fields
    bool use_p2p;
    bool use_dev_reg_copy;
    bool api_permitted;
};
Pros: no need for users to set values
Cons: scope is limited to MR registration, users cannot see these fields by running fi_info
Preference is fi_info path to see and set provider specific settings, wider use case
HPE, AWS, and Intel all prefer having it at the fi_info level
Q: Any comments on input/output p2p settings?
A: api_permitted setting is very important to them. Need to think more on input/output settings
AWS will continue to refine proposal for adding to fi_info and we will revisit

Cornelis Github Action workflow (Jack Morrison):
Go over cn.yaml addition in new PR https://github.com/ofiwg/libfabric/pull/10354
Trying to leverage more utilities available as part of Github Actions for opx testing.

  *   In cn.yaml - check to see if the PR is targeted for internal Cornelis libfabric repository. No-op if not targeted for internal repo
Runs on internal Cornelis machines
Will not get triggered for non-Cornelis/upstream PRs
Q: Why not have this only internal? What's the benefit of having it upstream?
A: Makes it easier to handle commits because of rebasing/upstreaming flow

No objections. Fine to move forward