[openib-general] OFED embedded in CentOS 4.4 doesn't work

zhang chao zc_kyo at hotmail.com
Thu Jan 18 21:30:14 PST 2007


Hi, openib mailing list:

I have a cluster running CentOS 4.4, which ships with the OFED packages
(under the /usr/ofed directory). All InfiniBand drivers and libraries
are installed, and I have configured IPoIB, which works well. The
OFED version is probably 1.0.

Now I am trying to install MVAPICH so that I can run my MPI applications
over InfiniBand. The MVAPICH version is 0.9.8, the latest stable
version. I modified the make.mvapich.gen2 script (I set IBHOME to /usr/ofed
and IBHOMELIB to /usr/ofed/lib64; this directory contains
libibverbs.so, libibcommon.so, etc.). The installation was successful:
MVAPICH recognized my HCA (a Mellanox PCI-Express SDR adapter), and there
seemed to be no errors during configure, make, and install.
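
As a rough sketch, the two settings described above amount to something like
the following shell assignments. The variable names are the ones given in
this message; the exact spelling and surrounding lines in make.mvapich.gen2
may differ, so treat this as an illustration rather than a copy of the script:

```shell
# Illustrative only: the paths are the ones quoted in this message.
# The variable names follow the text above and are an assumption
# about how make.mvapich.gen2 spells them.
IBHOME=/usr/ofed              # OFED install prefix on this CentOS 4.4 system
IBHOMELIB=/usr/ofed/lib64     # holds libibverbs.so, libibcommon.so, etc.
```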

Then I wrote a simple mpihello.c program to verify the installation. The
program just prints "helloworld" from every process. I compiled it with
mpicc, and when I run it, the following problem occurs:

[eric at cfx1 testcodes]$ /usr/local/mvapich/bin/mpirun -np 4 -hostfile hostfile2 mpihello
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libopensm.so: /usr/ofed/lib64/infiniband/libopensm.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libopensm.so: /usr/ofed/lib64/infiniband/libopensm.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libopensm.so: /usr/ofed/lib64/infiniband/libopensm.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libopensm.so: /usr/ofed/lib64/infiniband/libopensm.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libosmcomp-1.2.1.so: /usr/ofed/lib64/infiniband/libosmcomp-1.2.1.so: undefined symbol: osm_log
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libosmcomp.so: /usr/ofed/lib64/infiniband/libosmcomp.so: undefined symbol: osm_log
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libosmvendor-1.2.1.so: /usr/ofed/lib64/infiniband/libosmvendor-1.2.1.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libosmvendor.so: /usr/ofed/lib64/infiniband/libosmvendor.so: undefined symbol: ib_error_str
libibverbs: Warning: couldn't load driver /usr/ofed/lib64/infiniband/libosmvendor_openib.so: /usr/ofed/lib64/infiniband/libosmvendor_openib.so: undefined symbol: ib_error_str
mpirun: executable version 1 does not match our version 3.
done.

I have two questions here:

1. Why does libibverbs look for the libraries in the
/usr/ofed/lib64/infiniband directory? The libraries are in
/usr/ofed/lib64. I copied all the library files into
/usr/ofed/lib64/infiniband anyway, but the problem persists.

2. What do the error messages listed above mean, and how can I solve them? I
also tried the command /usr/local/mvapich/bin/mpirun_rsh -np 4 -hostfile
./hostfile2 ./mpihello, which also fails with the same error messages.

Thanks. Any suggestions are greatly appreciated.

Eric
2006-01-19
