[libfabric-users] CXI libraries present but can not be compiled

Marc Caubet Serrabou marc.caubet at psi.ch
Tue Oct 8 01:34:40 PDT 2024


Hi Ken,

Thanks a lot for your answer. I cherry-picked your commit onto the 
v1.22.0 tag, but the build now fails at link time for a different 
reason:

copying selected object files to avoid basename conflicts...
   CCLD     util/fi_strerror
   CCLD     util/fi_info
   CCLD     util/fi_pingpong
   CCLD     prov/cxi/test/multinode/test_frmwk
   CCLD     prov/cxi/test/multinode/test_zbcoll
   CCLD     prov/cxi/test/multinode/test_coll
   CCLD     prov/cxi/test/multinode/test_barrier
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12746: util/fi_strerror] Error 1
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
make[1]: *** [Makefile:12740: util/fi_pingpong] Error 1
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12734: util/fi_info] Error 1
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
collect2: error: ld returned 1 exit status
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12658: prov/cxi/test/multinode/test_zbcoll] Error 1
make[1]: *** [Makefile:12648: prov/cxi/test/multinode/test_frmwk] Error 1
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12638: prov/cxi/test/multinode/test_coll] Error 1
make[1]: *** [Makefile:12628: prov/cxi/test/multinode/test_barrier] Error 1
make[1]: Leaving directory '/var/tmp/caubet_m/libfabric-1.22.0/src'
make: *** [Makefile:6816: all] Error 2
libfabric/1.22.0: compilation failed!

Any ideas?
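
Every failing link step in the log above reports the same missing symbol. As a rough sketch (not something from the original build), a small helper like the one below reduces such a log to the distinct undefined symbols. The function name `undefined_symbols` and the file name `build.log` are hypothetical; the follow-up commands in the comments assume GNU binutils and the library path reported by ldd in this thread.

```shell
# Sketch: list each distinct symbol that the linker reported as undefined.
# Usage: undefined_symbols build.log    (build.log is a hypothetical file name)
undefined_symbols() {
    # Keep only the "undefined reference to `name'" fragments, strip the
    # surrounding text and quotes, and de-duplicate the remaining names.
    grep -o "undefined reference to \`[^']*'" "$1" \
        | sed "s/^undefined reference to \`//; s/'\$//" \
        | sort -u
}

# Possible follow-up checks, to be run by hand (paths are assumptions):
#   nm -D --defined-only /usr/lib64/libcxi.so.1 | grep -w cxi_cq_empty
#   grep -rn cxi_cq_empty /usr/include
```

If nm prints nothing, the installed libcxi does not export the symbol, so it would have to come from a header or from a different libcxi version. That is something to verify rather than a diagnosis.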

I also tried another proposed update, the dev-cxi branch at 
https://github.com/thomasgillis/libfabric/tree/dev-cxi. That one 
compiles cleanly, but fi_info then fails to find the CXI provider even 
though it is linked against libcxi:

🔥 [caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# fi_info -p cxi
fi_getinfo: -61 (No data available)

🔥 [caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# ldd $(which fi_info) | grep cxi
         libcxi.so.1 => /usr/lib64/libcxi.so.1 (0x00007f122b977000)

🔥 [caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# ldd $(which fi_info)
         linux-vdso.so.1 (0x00007ffcf52f6000)
         libfabric.so.1 => /opt/psi/Libraries/libfabric/1.22.0/lib64/libfabric.so.1 (0x00007f255102a000)
         libcxi.so.1 => /usr/lib64/libcxi.so.1 (0x00007f2551004000)
         libnl-3.so.200 => /usr/lib64/libnl-3.so.200 (0x00007f2550c00000)
         libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f2550f5a000)
         libjson-c.so.3 => /usr/lib64/libjson-c.so.3 (0x00007f2550800000)
         libm.so.6 => /lib64/libm.so.6 (0x00007f2550ab4000)
         libuuid.so.1 => /usr/lib64/libuuid.so.1 (0x00007f2550f28000)
         libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007f2550400000)
         libatomic.so.1 => /usr/lib64/libatomic.so.1 (0x00007f2550f1e000)
         librt.so.1 => /lib64/librt.so.1 (0x00007f2550f14000)
         libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2550eee000)
         libdl.so.2 => /lib64/libdl.so.2 (0x00007f2550ee9000)
         libxpmem.so.0 => /usr/lib64/libxpmem.so.0 (0x00007f2550ee6000)
         libc.so.6 => /lib64/libc.so.6 (0x00007f2550209000)
         libnghttp2.so.14 => /usr/lib64/libnghttp2.so.14 (0x00007f2550ebd000)
         libidn2.so.0 => /usr/lib64/libidn2.so.0 (0x00007f254fe00000)
         libssh.so.4 => /usr/lib64/libssh.so.4 (0x00007f2550e4c000)
         libpsl.so.5 => /usr/lib64/libpsl.so.5 (0x00007f254fa00000)
         libssl.so.1.1 => /usr/lib64/libssl.so.1.1 (0x00007f2550a15000)
         libcrypto.so.1.1 => /usr/lib64/libcrypto.so.1.1 (0x00007f254f6c1000)
         libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x00007f25507ae000)
         libldap_r-2.4.so.2 => /usr/lib64/libldap_r-2.4.so.2 (0x00007f2550759000)
         liblber-2.4.so.2 => /usr/lib64/liblber-2.4.so.2 (0x00007f2550e3a000)
         libzstd.so.1 => /usr/lib64/libzstd.so.1 (0x00007f2550628000)
         libbrotlidec.so.1 => /usr/lib64/libbrotlidec.so.1 (0x00007f254f400000)
         libz.so.1 => /usr/lib64/libz.so.1 (0x00007f255060f000)
         /lib64/ld-linux-x86-64.so.2 (0x00007f2551354000)
         libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f254f000000)
         libjitterentropy.so.3 => /usr/lib64/libjitterentropy.so.3 (0x00007f2550e30000)
         libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x00007f255012f000)
         libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x00007f2550118000)
         libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f2550e2b000)
         libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x00007f2550109000)
         libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f25500f1000)
         libsasl2.so.3 => /usr/lib64/libsasl2.so.3 (0x00007f25500d3000)
         libbrotlicommon.so.1 => /usr/lib64/libbrotlicommon.so.1 (0x00007f254ec00000)
         libkeyutils.so.1 => /usr/lib64/libkeyutils.so.1 (0x00007f254e800000)
         libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f254e400000)
         libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f254e000000)
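
The -61 from fi_getinfo is FI_ENODATA, meaning no fabric interface matched the request. Two things worth ruling out are whether this libfabric build lists the cxi provider at all and whether the node has a CXI device. The helper below is only a sketch: the name `check_cxi` is hypothetical, and /sys/class/cxi is an assumed location for the driver's sysfs entries that may differ on a given system.

```shell
# Sketch: print basic facts that help explain "fi_getinfo: -61" for cxi.
# Assumptions: fi_info is in PATH; the default sysfs directory below is a
# guess at where the CXI driver exposes its devices (override via $1).
check_cxi() {
    sysfs_dir="${1:-/sys/class/cxi}"

    if ! command -v fi_info >/dev/null 2>&1; then
        echo "fi_info: not found in PATH"
        return 1
    fi

    # "fi_info -l" lists the providers this libfabric build has available.
    if fi_info -l 2>/dev/null | grep -q cxi; then
        echo "provider: cxi is listed by fi_info -l"
    else
        echo "provider: cxi is NOT listed by fi_info -l"
    fi

    # Report whether the assumed sysfs directory exists and is non-empty.
    if [ -d "$sysfs_dir" ] && [ -n "$(ls -A "$sysfs_dir" 2>/dev/null)" ]; then
        echo "device: entries found under $sysfs_dir"
    else
        echo "device: nothing under $sysfs_dir"
    fi
}
```

Running `check_cxi` followed by `FI_LOG_LEVEL=debug fi_info -p cxi` should show the provider's own initialization messages, which are usually more specific than the bare error code.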

The installed libcxi package is:

🔥 [caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# rpm -qf /usr/lib64/libcxi.so.1
cray-libcxi-0.9-SSHOT2.1.3_20240529150829_3d1dc9246116.x86_64

We are testing libfabric 1.22.0. I'm open to compiling a newer (or 
older) version if that makes it work, but I'm not sure how far other 
Cray systems have been able to get.

Thanks a lot,

Marc

On 07.10.24 17:59, Raffenetti, Ken wrote:
>
> Hi Marc,
>
> I believe those headers are not necessary for compiling the provider. 
> I proposed removing the checks from configure in 
> https://github.com/ofiwg/libfabric/pull/9793. You could cherry-pick 
> https://github.com/ofiwg/libfabric/pull/9793/commits/5793243aec20c4fee126aa3093ff07bb5889f154 
> and try again.
>
> Ken
>
> *From: *Libfabric-users 
> <libfabric-users-bounces at lists.openfabrics.org> on behalf of Marc 
> Caubet Serrabou <marc.caubet at psi.ch>
> *Date: *Monday, October 7, 2024 at 6:03 AM
> *To: *libfabric-users at lists.openfabrics.org 
> <libfabric-users at lists.openfabrics.org>
> *Subject: *[libfabric-users] CXI libraries present but can not be compiled
>
> Hi,
>
> I already opened a ticket with ofiwg at lists.openfabrics.org, but I 
> am also asking here on the users list in case somebody has hit a 
> similar issue and has an answer.
>
> I am trying to compile libfabric 1.22.0 with CXI provider support. 
> Although the expected CXI provider header files are present, as is 
> the CXI library, I get the following errors:
>
> configure: WARNING: The EFA provider requires rdma-core v31 or newer.
> configure: efa provider: disabled
> configure: *** Configuring cxi provider
> checking cxi_prov_hw.h usability... no
> checking cxi_prov_hw.h presence... yes
> configure: WARNING: cxi_prov_hw.h: present but cannot be compiled
> configure: WARNING: cxi_prov_hw.h:     check for missing prerequisite headers?
> configure: WARNING: cxi_prov_hw.h: see the Autoconf documentation
> configure: WARNING: cxi_prov_hw.h:     section "Present But Cannot Be Compiled"
> configure: WARNING: cxi_prov_hw.h: proceeding with the compiler's result
> configure: WARNING:     ## ------------------------------------------ ##
> configure: WARNING:     ## Report this to ofiwg at lists.openfabrics.org ##
> configure: WARNING:     ## ------------------------------------------ ##
> checking for cxi_prov_hw.h... no
> checking uapi/misc/cxi.h usability... no
> checking uapi/misc/cxi.h presence... yes
> configure: WARNING: uapi/misc/cxi.h: present but cannot be compiled
> configure: WARNING: uapi/misc/cxi.h:     check for missing prerequisite headers?
> configure: WARNING: uapi/misc/cxi.h: see the Autoconf documentation
> configure: WARNING: uapi/misc/cxi.h:     section "Present But Cannot Be Compiled"
> configure: WARNING: uapi/misc/cxi.h: proceeding with the compiler's result
> configure: WARNING:     ## ------------------------------------------ ##
> configure: WARNING:     ## Report this to ofiwg at lists.openfabrics.org ##
> configure: WARNING:     ## ------------------------------------------ ##
> checking for uapi/misc/cxi.h... no
> checking libcxi/libcxi.h usability... yes
> checking libcxi/libcxi.h presence... yes
> checking for libcxi/libcxi.h... yes
> configure: looking for library without search path
> checking for cxil_open_device in -lcxi... yes
> checking curl/curl.h usability... yes
> checking curl/curl.h presence... yes
> checking for curl/curl.h... yes
> configure: looking for library without search path
> checking for curl_global_init in -lcurl... yes
> checking json-c/json.h usability... yes
> checking json-c/json.h presence... yes
> checking for json-c/json.h... yes
> configure: looking for library without search path
> checking for json_object_get_type in -ljson-c... yes
> configure: cxi provider: disabled
> configure: WARNING: cxi provider was requested, but cannot be compiled
> configure: error: Cannot continue
> libfabric/1.22.0: configure failed
>
> The headers and libraries come from the following Cray packages and 
> are installed in the standard directories (/usr/include for header 
> files, /usr/lib64 for libraries):
>
> 🔥[caubet_m at login001:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# rpm -qf /usr/include/uapi/misc/cxi.h /usr/include/cxi_prov_hw.h /usr/lib64/libcxi.so
> warning: Found NDB Packages.db database while attempting bdb backend: using ndb backend.
> cray-cxi-driver-devel-0.9-61.9__g3000a93.SSHOT2.1.3.x86_64
> cray-cassini-headers-user-1.0-SSHOT2.1.3_20240326210855_321db6bd57af.noarch
> cray-libcxi-0.9-SSHOT2.1.3_20240529150829_3d1dc9246116.x86_64
>
> The configure invocation is minimal. It only sets the prefix and 
> explicitly requests the CXI provider:
>
> /var/tmp/caubet_m/libfabric-1.22.0/src/configure --prefix=/opt/psi/Libraries/libfabric/1.22.0/ --enable-cxi
>
> What am I missing, and how shall I proceed? Is the compilation 
> expecting a different set (or version) of CXI libraries?
>
> Thanks a lot,
>
> Marc
>
> -- 
> _________________________________________________________
> Paul Scherrer Institut
> High Performance Computing & Emerging Technologies
> Marc Caubet Serrabou
> Building/Room: OBBA/230
> Forschungsstrasse, 111
> 5232 Villigen PSI
> Switzerland
> Telephone: +41 765 42 51 24 // +41 56 310 46 67
> E-Mail:marc.caubet at psi.ch
