[ofiwg] Using libfabric on Titan

Pritchard Jr., Howard howardp at lanl.gov
Thu May 3 12:36:00 PDT 2018


Hi Philip,

Yes, the GNI provider is supported on both Cori and Theta.  If you hit
bugs, please open issues at
https://github.com/ofiwg/libfabric
and mention @hppritcha in the issue.

Don’t expect very good performance on Theta for small transfers.

On Theta, you’ll want to use the --enable-ugni-static configure option
to help a little bit with those slow processors.
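
As a rough illustration of where that option goes, here is a minimal
sketch of a configure invocation.  Only --enable-ugni-static comes from
this message; the --enable-gni flag, the install prefix, and the make
steps are assumptions about a typical libfabric build, so check the
wiki page below for the settings that actually apply on your system.

```shell
#!/bin/sh
# Sketch only: the install prefix is a hypothetical location.
PREFIX="${PREFIX:-$HOME/libfabric-install}"

# --enable-ugni-static is the option named in this message;
# --enable-gni (build the GNI provider) is assumed here.
CONFIGURE_ARGS="--prefix=$PREFIX --enable-gni --enable-ugni-static"

echo "./configure $CONFIGURE_ARGS"

# Only build when run from a libfabric source tree that already has a
# configure script (a git checkout needs ./autogen.sh first).
if [ -x ./configure ]; then
    # Unquoted on purpose so the arguments are split into words.
    ./configure $CONFIGURE_ARGS && make && make install
fi
```

Run from the top of the libfabric source tree; outside of one, the
script just prints the command it would have used.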

You may also want to peruse
https://github.com/ofi-cray/libfabric-cray/wiki/GNI-provider-building-it


Good luck!

Howard


-- 
Howard Pritchard

B Schedule
HPC-ENV
Office 9, 2nd floor Research Park
TA-03, Building 4200, Room 03U

Los Alamos National Laboratory





On 5/3/18, 1:24 PM, "Philip Davis" <philip.e.davis at rutgers.edu> wrote:

>Thanks for the quick reply. That’s disappointing, but understandable. Is
>the GNI provider supported on Cori and Theta?
>
>> On May 3, 2018, at 2:49 PM, Pritchard Jr., Howard <howardp at lanl.gov>
>>wrote:
>> 
>> The GNI provider isn’t supported on Titan; the OS is too old.
>> 
>> 
>> 
>> On 5/3/18, 11:21 AM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
>> 
>>> Copying ofiwg, Howard, and Jim.
>>> 
>>> 
>>>> Apologies if this has been asked before.
>>>> 
>>>> I am trying to use the GNI provider for libfabric on Titan, the XK7
>>>> machine at ORNL. Is this possible/supported?
>>>> 
>>>> More specifically, I need to be able to communicate between aprun
>>>> instances (e.g. aprun ./writer and aprun ./reader are able to
>>>> communicate with each other) so the sharing of credentials between the
>>>> programs is necessary. I have been referring to the cray-tests repo as
>>>> an example of how to use rdma credentials for Cray in Libfabric. I
>>>> modified this code somewhat by replacing all DRC code with just
>>>> setting the cookie manually, and had hoped that would work.
>>>> Unfortunately, the library relies on GNI_GetPtag to set the ptag, and
>>>> this API call does not seem to work on Titan. I can get the
>>>> auth_keys_test to pass by adding a ptag field to the
>>>> fi_gni_raw_auth_key struct along with appropriate language to handle
>>>> that, but I was wondering if there was some inherently supported way
>>>> to do that same thing without modifying libfabric.
>>>> 
>>>> Other than credentials exchange/loading, I am wondering what I can
>>>> expect in trying to use the GNI provider on a non-ARIES system. Has
>>>> anyone done this successfully?
>>>> 
>>>> Thanks,
>>>> Philip
>> 
>


