[openib-general] Open MPI rpmbuild fails in OFED-1.2

Jeff Squyres jsquyres at cisco.com
Thu Feb 8 06:14:59 PST 2007


On Feb 8, 2007, at 8:43 AM, Michael S. Tsirkin wrote:

>> 2) we're trying to *use* the software when it is installed in the
>> DESTDIR
>> --> this means that you have to put special-case in the software so
>> that they look for support files in both the DESTDIR *and* the final
>> installation directory
>
> How do you mean, use?

The easiest example is that the MPI wrapper compilers (mpicc, etc.)
are used to compile the MPI test suites.  This means that OMPI's
libraries and support files (e.g., help files, wrapper compiler data
files) need to be found, even though they're not in their final
installation locations.

> Hmm. I guess my question is - this works fine when I run OFED's
> configure script, why is SRPM so much more difficult?

It's not the single SRPM that is the problem.  We've had an OMPI SRPM  
that works fine for a long, long time.  A single DESTDIR build is no  
problem, especially for an Autoconf/Automake/Libtool-based project  
like Open MPI.

The problems are:

- libibverbs and other support libraries are in the DESTDIR when OMPI  
is built (but eventually will move).  So OMPI has to rpath *BOTH*  
locations for libibverbs (i.e., the DESTDIR and the final  
installdir), one of which will be a lie.  God help you if you're  
trying to build OFED on a machine where a previous version of OFED is  
installed -- i.e., where libibverbs exists in *BOTH* the DESTDIR and  
the final prefix!  (this specific problem actually caused me to waste  
a few hours while developing the new OMPI build stuff in build.sh  
last week)

- I haven't looked closely at the OFED 1.2 build scripts yet, but we
ran into problems during the development of OFED 1.1 where dependent
OFA libraries needed to link to each other.  In OFED, those links were
simply removed because of the whole DESTDIR/installdir duality.  This
actually caused problems in some scenarios.  IIRC, one case was that
the link between libmthca and libibverbs was effectively removed by
removing AC_CHECK_LIB from libmthca's configure.ac (recall that mthca
uses some of the public symbols in libibverbs) because AC_CHECK_LIB
was looking in the installdir.  That may not be 100% right -- I don't
recall all the details.

- we *use* OMPI in the DESTDIR (and MVAPICH), as described above.  I
had to add a patch to the upcoming OMPI v1.2 community release to
first examine the environment and look for a specific variable to
re-root all of the compiled-in directories (it's too late in the OMPI
v1.2 release process to put this patch in the official v1.2
release).  What a pain.  :-\
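
To make the re-rooting idea in the last item concrete, here is a rough
Python sketch.  It is not the actual OMPI patch (which lives in OMPI's
C code); the variable name EXAMPLE_DESTDIR and the directory values
are made up purely for illustration.  It only shows the general
shape: if a staging root is set in the environment, every compiled-in
absolute path gets that root prepended.

```python
import posixpath

# Hypothetical compiled-in locations; a real build fixes these at configure time.
COMPILED_IN_DIRS = {
    "libdir": "/usr/local/ompi/lib",
    "pkgdatadir": "/usr/local/ompi/share/openmpi",
}

# Hypothetical variable name, chosen only for this sketch.
REROOT_ENV_VAR = "EXAMPLE_DESTDIR"


def reroot(path, environ):
    """Prefix an absolute compiled-in path with the staging root, if one is set."""
    root = environ.get(REROOT_ENV_VAR, "")
    if not root:
        # No staging root: use the compiled-in path unchanged.
        return path
    # Strip the leading "/" so join() does not throw the root away.
    return posixpath.join(root, path.lstrip("/"))


def effective_dirs(environ):
    """Return every compiled-in directory as it should be looked up right now."""
    return {name: reroot(path, environ) for name, path in COMPILED_IN_DIRS.items()}
```

In real use one would pass os.environ as the environ argument.  Note
that this only helps for paths the software looks up itself at run
time; it does nothing for paths the dynamic linker resolves.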

So the OMPI path issue is resolvable (at the cost of adding a whole  
pile of code to OMPI), but the rpath issue is not.  Once you link an  
app, its rpaths are fixed and you can't change them based on an  
environment variable.  Hence, the only solution is to rpath *both*  
directories, but even that has problems and ambiguities (as described  
above).  In fairness, we could tell the user to set LD_LIBRARY_PATH,  
but no one seems to want to do that (users always screw it up, and it  
becomes problematic for rsh/ssh-based scenarios).
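
The ambiguity with rpath'ing both directories can be illustrated with
a deliberately simplified Python model of an ordered library search.
It assumes only that rpath entries are tried in order and the first
directory containing the library wins; it ignores LD_LIBRARY_PATH,
the ld.so cache, and everything else a real dynamic linker does.  The
directory layout is invented for the sketch.

```python
import os
import shutil
import tempfile


def resolve_library(name, search_dirs):
    """Return the first search_dirs entry that contains name, or None."""
    for directory in search_dirs:
        if os.path.exists(os.path.join(directory, name)):
            return directory
    return None


def demo():
    """Show which libibverbs copy is picked for different rpath orders."""
    top = tempfile.mkdtemp()
    try:
        staging_root = os.path.join(top, "destdir")
        destdir_lib = os.path.join(staging_root, "usr", "lib")   # freshly built copy
        final_lib = os.path.join(top, "final", "usr", "lib")     # older installed copy
        labels = {destdir_lib: "destdir", final_lib: "final"}
        for directory in labels:
            os.makedirs(directory)
            open(os.path.join(directory, "libibverbs.so"), "w").close()

        results = {}
        # Both copies exist: the answer depends only on the rpath order.
        results["destdir_first"] = labels[resolve_library("libibverbs.so", [destdir_lib, final_lib])]
        results["final_first"] = labels[resolve_library("libibverbs.so", [final_lib, destdir_lib])]
        # After the staging area is cleaned up, the first entry points at nothing.
        shutil.rmtree(staging_root)
        results["after_cleanup"] = labels[resolve_library("libibverbs.so", [destdir_lib, final_lib])]
        return results
    finally:
        shutil.rmtree(top, ignore_errors=True)
```

With the final prefix listed first, a stale copy from a previous
install shadows the new build; with the DESTDIR listed first, the
binary carries an entry that stops existing once the staging area is
removed.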

All this plus the fact that we're clearly going outside of what the  
SuSE RPM developers intend just indicates to me that this doesn't  
seem Right...

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems




