[libfabric-users] Using libfabric on Titan
Philip Davis
philip.e.davis at rutgers.edu
Thu May 3 12:46:02 PDT 2018
Thank you both, I appreciate the assistance. I will raise any issues I find.
> On May 3, 2018, at 3:38 PM, James Swaro <jswaro at cray.com> wrote:
>
> I would add that you can also attach my name to the issue.
>
> @jswaro
>
> Feel free to reach out. I'd have answered earlier, but Howard already caught what I was going to say.
>
> Regards,
> -- Jim
>
> On 5/3/18, 2:36 PM, "Pritchard Jr., Howard" <howardp at lanl.gov> wrote:
>
> Hi Philip,
>
> Yes. If you hit bugs, please open issues on
> https://github.com/ofiwg/libfabric
> and mention @hppritcha in the issue.
>
> Don’t expect very good performance on Theta for small transfers.
>
> On Theta, you’ll want to use the --enable-ugni-static configure option
> to help a little bit with those slow processors.
>
> You may also want to peruse
> https://github.com/ofi-cray/libfabric-cray/wiki/GNI-provider-building-it
>
>
> Good luck!
>
> Howard
>
>
> --
> Howard Pritchard
>
> B Schedule
> HPC-ENV
> Office 9, 2nd floor Research Park
> TA-03, Building 4200, Room 03U
>
> Los Alamos National Laboratory
>
>
>
>
>
> On 5/3/18, 1:24 PM, "Philip Davis" <philip.e.davis at rutgers.edu> wrote:
>
>> Thanks for the quick reply. That’s disappointing, but understandable. Is
>> the GNI provider supported on Cori and Theta?
>>
>>> On May 3, 2018, at 2:49 PM, Pritchard Jr., Howard <howardp at lanl.gov>
>>> wrote:
>>>
>>> The GNI provider isn’t supported on Titan; the OS is too old.
>>>
>>> On 5/3/18, 11:21 AM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
>>>
>>>> Copying ofiwg, Howard, and Jim.
>>>>
>>>>
>>>>> Apologies if this has been asked before.
>>>>>
>>>>> I am trying to use the GNI provider for libfabric on Titan, the XK7
>>>>> machine at ORNL. Is this possible/supported?
>>>>>
>>>>> More specifically, I need to be able to communicate between aprun
>>>>> instances (e.g. aprun ./writer and aprun ./reader are able to
>>>>> communicate with each other) so the sharing of credentials between the
>>>>> programs is necessary. I have been referring to the cray-tests repo as
>>>>> an example of how to use rdma credentials for Cray in Libfabric. I
>>>>> modified this code somewhat by replacing all DRC code with just
>>>>> setting the cookie manually, and had hoped that would work.
>>>>> Unfortunately, the library relies on GNI_GetPtag to set the ptag, and
>>>>> this API call does not seem to work on Titan. I can get the
>>>>> auth_keys_test to pass by adding a ptag field to the
>>>>> fi_gni_raw_auth_key struct along with appropriate language to handle
>>>>> that, but I was wondering if there was some inherently supported way
>>>>> to do that same thing without modifying libfabric.
>>>>>
>>>>> Other than credentials exchange/loading, I am wondering what I can
>>>>> expect in trying to use the GNI provider on a non-Aries system. Has
>>>>> anyone done this successfully?
>>>>>
>>>>> Thanks,
>>>>> Philip
>>>
>>
>
>
>