[ofiwg] Using libfabric on Titan

James Swaro jswaro at cray.com
Thu May 3 12:38:18 PDT 2018


I would add that you can also attach my name to the issue. 

@jswaro

Feel free to reach out. I'd have answered earlier, but Howard already caught what I was going to say. 

Regards, 
-- Jim
 
On 5/3/18, 2:36 PM, "Pritchard Jr., Howard" <howardp at lanl.gov> wrote:

    Hi Philip,
    
    Yes.  If you hit bugs, please open issues on
    https://github.com/ofiwg/libfabric
    and mention @hppritcha in the issue.
    
    Don’t expect very good performance on Theta for small transfers.
    
    On Theta, you’ll want to use the --enable-ugni-static configure option
    to help a little bit with those slow processors.
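As a rough sketch of the configure step mentioned above (not a verified build recipe for Theta): the snippet assumes it is run from the top of a libfabric source tree where ./configure has already been generated, and that --enable-gni is the switch that requests the GNI provider. Only --enable-ugni-static comes from this thread; any install prefix, compiler, or module setup a given site needs is left out.

```shell
# Options to pass to libfabric's configure script.
# --enable-ugni-static is the option recommended in this thread;
# --enable-gni (an assumption here) asks configure to build the GNI provider.
CONFIGURE_ARGS="--enable-gni --enable-ugni-static"

echo "configure options: $CONFIGURE_ARGS"

# Run configure only if it exists in the current directory, so the
# snippet does nothing harmful outside a libfabric source tree.
if [ -x ./configure ]; then
    ./configure $CONFIGURE_ARGS
fi
```

After configure finishes, its summary output should show whether the GNI provider was actually enabled; it is worth checking that before running make.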
    
    You may also want to peruse
    
    https://github.com/ofi-cray/libfabric-cray/wiki/GNI-provider-building-it
    
    
    Good luck!
    
    Howard
    
    
    -- 
    Howard Pritchard
    
    B Schedule
    HPC-ENV
    Office 9, 2nd floor Research Park
    TA-03, Building 4200, Room 03U
    
    Los Alamos National Laboratory
    
    
    
    
    
    On 5/3/18, 1:24 PM, "Philip Davis" <philip.e.davis at rutgers.edu> wrote:
    
    >Thanks for the quick reply. That’s disappointing, but understandable. Is
    >the GNI provider supported on Cori and Theta?
    >
    >> On May 3, 2018, at 2:49 PM, Pritchard Jr., Howard <howardp at lanl.gov>
    >>wrote:
    >> 
    >> The GNI provider isn’t supported on Titan - OS is too old.
    >> 
    >> 
    >> On 5/3/18, 11:21 AM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
    >> 
    >>> Copying ofiwg, Howard, and Jim.
    >>> 
    >>> 
    >>>> Apologies if this has been asked before.
    >>>> 
    >>>> I am trying to use the GNI provider for libfabric on Titan, the XK7
    >>>> machine at ORNL. Is this possible/supported?
    >>>> 
    >>>> More specifically, I need to be able to communicate between aprun
    >>>> instances (e.g. aprun ./writer and aprun ./reader are able to
    >>>> communicate with each other) so the sharing of credentials between the
    >>>> programs is necessary. I have been referring to the cray-tests repo as
    >>>> an example of how to use RDMA credentials for Cray in libfabric. I
    >>>> modified this code somewhat by replacing all DRC code with just
    >>>> setting the cookie manually, and had hoped that would work.
    >>>> Unfortunately, the library relies on GNI_GetPtag to set the ptag, and
    >>>> this API call does not seem to work on Titan. I can get the
    >>>> auth_keys_test to pass by adding a ptag field to the
    >>>> fi_gni_raw_auth_key struct along with appropriate language to handle
    >>>> that, but I was wondering if there was some inherently supported way
    >>>> to do that same thing without modifying libfabric.
    >>>> 
    >>>> Other than credential exchange/loading, I am wondering what I can
    >>>> expect in trying to use the GNI provider on a non-Aries system. Has
    >>>> anyone done this successfully?
    >>>> 
    >>>> Thanks,
    >>>> Philip
    >> 
    >
    
    


