[libfabric-users] CXI libraries present but can not be compiled
Marc Caubet Serrabou
marc.caubet at psi.ch
Tue Oct 8 01:34:40 PDT 2024
Hi Ken,
Thanks a lot for your answer. I cherry-picked your commit onto the
v1.22.0 tag. Configure now passes, but the build fails at link time for
a different reason:
copying selected object files to avoid basename conflicts...
CCLD util/fi_strerror
CCLD util/fi_info
CCLD util/fi_pingpong
CCLD prov/cxi/test/multinode/test_frmwk
CCLD prov/cxi/test/multinode/test_zbcoll
CCLD prov/cxi/test/multinode/test_coll
CCLD prov/cxi/test/multinode/test_barrier
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12746: util/fi_strerror] Error 1
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
/usr/bin/ld: src/.libs/libfabric.so: undefined reference to `cxi_cq_empty'
make[1]: *** [Makefile:12740: util/fi_pingpong] Error 1
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12734: util/fi_info] Error 1
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
/usr/bin/ld: src/.libs/libfabric.a(src_libfabric_la-cxip_dom.o): in function `cxip_domain_find_cmdq':
cxip_dom.c:(.text+0x436): undefined reference to `cxi_cq_empty'
collect2: error: ld returned 1 exit status
collect2: error: ld returned 1 exit status
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12658: prov/cxi/test/multinode/test_zbcoll] Error 1
make[1]: *** [Makefile:12648: prov/cxi/test/multinode/test_frmwk] Error 1
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:12638: prov/cxi/test/multinode/test_coll] Error 1
make[1]: *** [Makefile:12628: prov/cxi/test/multinode/test_barrier] Error 1
make[1]: Leaving directory '/var/tmp/caubet_m/libfabric-1.22.0/src'
make: *** [Makefile:6816: all] Error 2
libfabric/1.22.0: compilation failed!
Any ideas?
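
One way to narrow down the `cxi_cq_empty` link error is to check whether the installed libcxi exports that symbol at all. The sketch below assumes binutils `nm` is installed; the helper name `has_dynamic_symbol` is made up for illustration, and the library path is the one reported by `ldd` in this message.

```shell
#!/bin/sh
# has_dynamic_symbol LIB SYM  (hypothetical helper, not part of libfabric)
# Returns 0 if LIB defines SYM in its dynamic symbol table,
# 1 if no such definition is found, 2 if LIB or nm is unavailable.
has_dynamic_symbol() {
    lib=$1
    sym=$2
    [ -r "$lib" ] || return 2
    command -v nm >/dev/null 2>&1 || return 2
    # -D lists dynamic symbols; --defined-only skips undefined references.
    # The awk filter strips any @VERSION suffix and compares names exactly.
    nm -D --defined-only "$lib" 2>/dev/null |
        awk -v s="$sym" '{ n = $NF; sub(/@.*/, "", n); if (n == s) found = 1 }
                         END { exit !found }'
}

rc=0
has_dynamic_symbol /usr/lib64/libcxi.so.1 cxi_cq_empty || rc=$?
case $rc in
    0) echo "libcxi exports cxi_cq_empty" ;;
    1) echo "libcxi does not export cxi_cq_empty" ;;
    *) echo "could not inspect libcxi" ;;
esac
```

If the library does not export the symbol, running `grep -rn cxi_cq_empty /usr/include` would show whether any of the installed headers mention it, which may hint at a version mismatch between the headers and this libfabric release.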
On the other hand, I also tried another proposed update from
https://github.com/thomasgillis/libfabric/tree/dev-cxi. With that
branch the build succeeds, but fi_info does not find the cxi provider:
🔥[caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# fi_info -p cxi
fi_getinfo: -61 (No data available)
🔥[caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# ldd $(which fi_info) | grep cxi
libcxi.so.1 => /usr/lib64/libcxi.so.1 (0x00007f122b977000)
🔥[caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# ldd $(which fi_info)
linux-vdso.so.1 (0x00007ffcf52f6000)
libfabric.so.1 => /opt/psi/Libraries/libfabric/1.22.0/lib64/libfabric.so.1 (0x00007f255102a000)
libcxi.so.1 => /usr/lib64/libcxi.so.1 (0x00007f2551004000)
libnl-3.so.200 => /usr/lib64/libnl-3.so.200 (0x00007f2550c00000)
libcurl.so.4 => /usr/lib64/libcurl.so.4 (0x00007f2550f5a000)
libjson-c.so.3 => /usr/lib64/libjson-c.so.3 (0x00007f2550800000)
libm.so.6 => /lib64/libm.so.6 (0x00007f2550ab4000)
libuuid.so.1 => /usr/lib64/libuuid.so.1 (0x00007f2550f28000)
libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007f2550400000)
libatomic.so.1 => /usr/lib64/libatomic.so.1 (0x00007f2550f1e000)
librt.so.1 => /lib64/librt.so.1 (0x00007f2550f14000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f2550eee000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f2550ee9000)
libxpmem.so.0 => /usr/lib64/libxpmem.so.0 (0x00007f2550ee6000)
libc.so.6 => /lib64/libc.so.6 (0x00007f2550209000)
libnghttp2.so.14 => /usr/lib64/libnghttp2.so.14 (0x00007f2550ebd000)
libidn2.so.0 => /usr/lib64/libidn2.so.0 (0x00007f254fe00000)
libssh.so.4 => /usr/lib64/libssh.so.4 (0x00007f2550e4c000)
libpsl.so.5 => /usr/lib64/libpsl.so.5 (0x00007f254fa00000)
libssl.so.1.1 => /usr/lib64/libssl.so.1.1 (0x00007f2550a15000)
libcrypto.so.1.1 => /usr/lib64/libcrypto.so.1.1 (0x00007f254f6c1000)
libgssapi_krb5.so.2 => /usr/lib64/libgssapi_krb5.so.2 (0x00007f25507ae000)
libldap_r-2.4.so.2 => /usr/lib64/libldap_r-2.4.so.2 (0x00007f2550759000)
liblber-2.4.so.2 => /usr/lib64/liblber-2.4.so.2 (0x00007f2550e3a000)
libzstd.so.1 => /usr/lib64/libzstd.so.1 (0x00007f2550628000)
libbrotlidec.so.1 => /usr/lib64/libbrotlidec.so.1 (0x00007f254f400000)
libz.so.1 => /usr/lib64/libz.so.1 (0x00007f255060f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2551354000)
libunistring.so.2 => /usr/lib64/libunistring.so.2 (0x00007f254f000000)
libjitterentropy.so.3 => /usr/lib64/libjitterentropy.so.3 (0x00007f2550e30000)
libkrb5.so.3 => /usr/lib64/libkrb5.so.3 (0x00007f255012f000)
libk5crypto.so.3 => /usr/lib64/libk5crypto.so.3 (0x00007f2550118000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f2550e2b000)
libkrb5support.so.0 => /usr/lib64/libkrb5support.so.0 (0x00007f2550109000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f25500f1000)
libsasl2.so.3 => /usr/lib64/libsasl2.so.3 (0x00007f25500d3000)
libbrotlicommon.so.1 => /usr/lib64/libbrotlicommon.so.1 (0x00007f254ec00000)
libkeyutils.so.1 => /usr/lib64/libkeyutils.so.1 (0x00007f254e800000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f254e400000)
libpcre.so.1 => /usr/lib64/libpcre.so.1 (0x00007f254e000000)
We are running:
🔥[caubet_m at login002:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# rpm -qf /usr/lib64/libcxi.so.1
cray-libcxi-0.9-SSHOT2.1.3_20240529150829_3d1dc9246116.x86_64
We're testing libfabric 1.22.0, but I'm open to compiling a newer (or
older) version if that makes it work. I'm not sure how far other Cray
systems have been able to get.
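
For the `fi_getinfo: -61 (No data available)` result, a first check could be whether the node exposes any CXI device at all, followed by a rerun with provider logging enabled. This is only a sketch: it assumes CXI devices appear as `/dev/cxi<N>`, which is not confirmed anywhere in this thread, and `count_cxi_devices` is a made-up helper name. `FI_LOG_LEVEL` and `FI_LOG_PROV` are the standard libfabric logging variables.

```shell
#!/bin/sh
# count_cxi_devices [DIR]  (hypothetical helper)
# Prints the number of entries named cxi<N> under DIR (default: /dev).
# Assumption: the CXI driver exposes its devices as /dev/cxi0, /dev/cxi1, ...
count_cxi_devices() {
    dir=${1:-/dev}
    n=0
    for dev in "$dir"/cxi[0-9]*; do
        # An unmatched glob stays literal, so test that the entry exists.
        if [ -e "$dev" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

echo "CXI device nodes found: $(count_cxi_devices)"

# List the providers this libfabric build contains, then retry the query
# with debug logging restricted to the cxi provider.
if command -v fi_info >/dev/null 2>&1; then
    fi_info -l || true
    FI_LOG_LEVEL=debug FI_LOG_PROV=cxi fi_info -p cxi || true
fi
```

If the count is zero on a login node, repeating the same check on a compute node may be worthwhile before suspecting the build itself.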
Thanks a lot,
Marc
On 07.10.24 17:59, Raffenetti, Ken wrote:
>
> Hi Marc,
>
> I believe those headers are not necessary for compiling the provider.
> I proposed removing the checks from configure in
> https://github.com/ofiwg/libfabric/pull/9793. You could cherry-pick
> https://github.com/ofiwg/libfabric/pull/9793/commits/5793243aec20c4fee126aa3093ff07bb5889f154
> and try again.
>
> Ken
>
> *From: *Libfabric-users
> <libfabric-users-bounces at lists.openfabrics.org> on behalf of Marc
> Caubet Serrabou <marc.caubet at psi.ch>
> *Date: *Monday, October 7, 2024 at 6:03 AM
> *To: *libfabric-users at lists.openfabrics.org
> <libfabric-users at lists.openfabrics.org>
> *Subject: *[libfabric-users] CXI libraries present but can not be compiled
>
> Hi,
>
> I already opened a ticket with ofiwg at lists.openfabrics.org, but I am
> also asking here on the users list in case somebody has hit a similar
> issue and has an answer to it.
>
> I am trying to compile libfabric 1.22.0 with CXI provider support.
> Although the expected CXI provider header files and the CXI library
> are present, I get the following errors:
>
> configure: WARNING: The EFA provider requires rdma-core v31 or newer.
> configure: efa provider: disabled
> configure: *** Configuring cxi provider
> checking cxi_prov_hw.h usability... no
> checking cxi_prov_hw.h presence... yes
> configure: WARNING: cxi_prov_hw.h: present but cannot be compiled
> configure: WARNING: cxi_prov_hw.h: check for missing prerequisite headers?
> configure: WARNING: cxi_prov_hw.h: see the Autoconf documentation
> configure: WARNING: cxi_prov_hw.h: section "Present But Cannot Be Compiled"
> configure: WARNING: cxi_prov_hw.h: proceeding with the compiler's result
> configure: WARNING: ## ------------------------------------------ ##
> configure: WARNING: ## Report this to ofiwg at lists.openfabrics.org ##
> configure: WARNING: ## ------------------------------------------ ##
> checking for cxi_prov_hw.h... no
> checking uapi/misc/cxi.h usability... no
> checking uapi/misc/cxi.h presence... yes
> configure: WARNING: uapi/misc/cxi.h: present but cannot be compiled
> configure: WARNING: uapi/misc/cxi.h: check for missing prerequisite headers?
> configure: WARNING: uapi/misc/cxi.h: see the Autoconf documentation
> configure: WARNING: uapi/misc/cxi.h: section "Present But Cannot Be Compiled"
> configure: WARNING: uapi/misc/cxi.h: proceeding with the compiler's result
> configure: WARNING: ## ------------------------------------------ ##
> configure: WARNING: ## Report this to ofiwg at lists.openfabrics.org ##
> configure: WARNING: ## ------------------------------------------ ##
> checking for uapi/misc/cxi.h... no
> checking libcxi/libcxi.h usability... yes
> checking libcxi/libcxi.h presence... yes
> checking for libcxi/libcxi.h... yes
> configure: looking for library without search path
> checking for cxil_open_device in -lcxi... yes
> checking curl/curl.h usability... yes
> checking curl/curl.h presence... yes
> checking for curl/curl.h... yes
> configure: looking for library without search path
> checking for curl_global_init in -lcurl... yes
> checking json-c/json.h usability... yes
> checking json-c/json.h presence... yes
> checking for json-c/json.h... yes
> configure: looking for library without search path
> checking for json_object_get_type in -ljson-c... yes
> configure: cxi provider: disabled
> configure: WARNING: cxi provider was requested, but cannot be compiled
> configure: error: Cannot continue
> libfabric/1.22.0: configure failed
>
> The headers and libraries below come from Cray and are installed in the
> standard directories (/usr/include for headers, /usr/lib64 for libraries):
>
> 🔥[caubet_m at login001:~/git/buildblocks/Libraries/libfabric(ofi_1.22.0)]# rpm -qf /usr/include/uapi/misc/cxi.h /usr/include/cxi_prov_hw.h /usr/lib64/libcxi.so
> warning: Found NDB Packages.db database while attempting bdb backend: using ndb backend.
> cray-cxi-driver-devel-0.9-61.9__g3000a93.SSHOT2.1.3.x86_64
> cray-cassini-headers-user-1.0-SSHOT2.1.3_20240326210855_321db6bd57af.noarch
> cray-libcxi-0.9-SSHOT2.1.3_20240529150829_3d1dc9246116.x86_64
>
> The configure options are the simplest ones, which should enforce CXI
> only:
>
> /var/tmp/caubet_m/libfabric-1.22.0/src/configure --prefix=/opt/psi/Libraries/libfabric/1.22.0/ --enable-cxi
>
> What am I missing, and how shall I proceed? Is the compilation
> expecting a different set (or version) of CXI libraries?
>
> Thanks a lot,
>
> Marc
>
> --
> _________________________________________________________
> Paul Scherrer Institut
> High Performance Computing & Emerging Technologies
> Marc Caubet Serrabou
> Building/Room: OBBA/230
> Forschungsstrasse, 111
> 5232 Villigen PSI
> Switzerland
> Telephone: +41 765 42 51 24 // +41 56 310 46 67
> E-Mail:marc.caubet at psi.ch