[ofiwg] [libfabric-users] Using libfabric on Titan

Jeff Hammond jeff.science at gmail.com
Thu May 3 20:55:16 PDT 2018


I assume Howard means that the slowness issue is associated with KNL rather
than Theta itself. Cori has both KNL and Haswell nodes, and I assume the
Cori KNL nodes behave almost identically to the Theta nodes w.r.t. small
message performance.

Jeff

On Thu, May 3, 2018 at 12:36 PM Pritchard Jr., Howard <howardp at lanl.gov>
wrote:

> Hi Philip,
>
> Yes.  If you hit bugs, please open issues on
> https://github.com/ofiwg/libfabric
> and mention @hppritcha in the issue.
>
> Don’t expect very good performance on Theta for small transfers.
>
> On Theta, you’ll want to use the --enable-ugni-static configure option
> to help a little bit with those slow processors.
>
> You may also want to peruse
>
> https://github.com/ofi-cray/libfabric-cray/wiki/GNI-provider-building-it
>
>
> Good luck!
>
> Howard
>
>
> --
> Howard Pritchard
>
> B Schedule
> HPC-ENV
> Office 9, 2nd floor Research Park
> TA-03, Building 4200, Room 03U
>
> Los Alamos National Laboratory
>
> On 5/3/18, 1:24 PM, "Philip Davis" <philip.e.davis at rutgers.edu> wrote:
>
> >Thanks for the quick reply. That’s disappointing, but understandable. Is
> >the GNI provider supported on Cori and Theta?
> >
> >> On May 3, 2018, at 2:49 PM, Pritchard Jr., Howard <howardp at lanl.gov>
> >>wrote:
> >>
> >> The GNI provider isn’t supported on Titan - the OS is too old.
> >>
> >> On 5/3/18, 11:21 AM, "Hefty, Sean" <sean.hefty at intel.com> wrote:
> >>
> >>> Copying ofiwg, Howard, and Jim.
> >>>
> >>>
> >>>> Apologies if this has been asked before.
> >>>>
> >>>> I am trying to use the GNI provider for libfabric on Titan, the XK7
> >>>> machine at ORNL. Is this possible/supported?
> >>>>
> >>>> More specifically, I need to communicate between aprun instances
> >>>> (e.g. aprun ./writer and aprun ./reader must be able to communicate
> >>>> with each other), so sharing credentials between the programs is
> >>>> necessary. I have been referring to the cray-tests repo as an
> >>>> example of how to use RDMA credentials for Cray in libfabric. I
> >>>> modified this code somewhat by replacing all DRC code with code that
> >>>> just sets the cookie manually, and had hoped that would work.
> >>>> Unfortunately, the library relies on GNI_GetPtag to set the ptag, and
> >>>> this API call does not seem to work on Titan. I can get the
> >>>> auth_keys_test to pass by adding a ptag field to the
> >>>> fi_gni_raw_auth_key struct along with appropriate language to handle
> >>>> that, but I was wondering if there was some inherently supported way
> >>>> to do that same thing without modifying libfabric.
> >>>>
> >>>> Other than credentials exchange/loading, I am wondering what I can
> >>>> expect in trying to use the GNI provider on a non-Aries system. Has
> >>>> anyone done this successfully?
> >>>>
> >>>> Thanks,
> >>>> Philip
> >>
> >
>
-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/