[openfabrics-ewg] OFED-1.1-rc2 review comments: openib.spec
Jeff Squyres
jsquyres at cisco.com
Sun Aug 27 06:48:33 PDT 2006
Doug --
This is a most excellent summary -- many thanks! I think that (at least
from my perspective) some of these problems are because RPM usage and
practice is not well documented. "Maximum RPM" is way out of date and at
least I don't know where to find more up-to-date documentation of accepted
practices. So this e-mail is quite helpful in that you're describing
exactly how things should be -- it's essentially up-to-date docs on exactly
what we need. Again -- many thanks for typing it all out!
I think that this e-mail should be broken up into the individual points that
Doug mentions and filed as bugs in the tracker. He brings up several
important points, and it will be easy to lose them all in this lengthy
e-mail. Putting them in the tracker will ensure that they are not lost.
Doug -- would you mind filing bugs about your points?
http://openib.org/bugzilla/
One additional question for you -- how do the Linux vendors typically treat
multiple, equivalent packages? For example, OFED distributes both MVAPICH
and Open MPI. They have executables and libraries that are the same name.
What is the currently accepted practice for distributing them? Just usr
/opt/openmpi and /opt/mvapich as installation bases, and let the user figure
out PATH, LD_LIBRARY_PATH, and MANPATH issues?
On 8/25/06 4:50 PM, "Doug Ledford" <dledford at redhat.com> wrote:
> (A lot of this is going to sound like I'm nit picking, but generally
> speaking, upstream developers then to concern themselves with the actual
> source build files and not rpm spec files. You guys are stepping into
> territory that the engineering people at Red Hat regulate fairly hard in
> terms of style and best practices when you create spec files. Almost
> all of the things I'm going to mention here are things I had to change
> before they would let me submit the openib-1.0 package for our product.
> You obviously don't have to follow suit, but knowing what had to go may
> help your future efforts in this area and OS vendor requirements line up
> better. Also, if you are wondering why I didn't bring some of this up
> before, they didn't make me go through a formal package review until we
> were talking about RHEL5beta packages, and some of this was new to me at
> that time. As a kernel engineer I don't normally have to deal with
> packages, the kernel rpm maintainer does.)
>
> You have both SOURCES/openib-1.1.tgz and SRPMS/openib-1.1-0.src.rpm that
> contains openib-1.1.tgz. You only need the tgz in one place.
>
> Default rpm configuration for prefix and libdir is a violation of Linux
> File Hierachy Standard. You're acting as a distributor of an official
> software package, you should have your binaries in /usr/bin (unless they
> are needed for single user maintenance operation, then /bin)
> and /usr/sbin or /sbin. Even though the distribution may be built on
> the customer's machine, building a prepackaged set of files is not the
> same as a system admin checking out a svn repo and compiling things
> himself and installing from the compile tree. The /usr/local area is
> generally reserved for that latter class of actions. At most, you
> should be installing under /opt/%{name} if you don't feel comfortable
> installing to the normal system directories. However, as a matter of
> practicality, since various vendors (Red Hat, SuSE) *are* shipping
> versions of this same code, you really *should* be installing to the
> official FHS directories so that conflicting installations of this
> software and vendor provided software will be detected by the package
> management application (aka, you *want* your packages and Red Hat/SuSE
> packages to conflict with each other or else you may find that random,
> unexplained problems related to multiple versions of software on a
> system starts to become an issue).
>
> To comply with the FHS, system daemons should not be in /usr/bin, they
> should be in /usr/sbin instead. Opensmd and sldd.sh come to mind as
> things that are primarily used in a daemon capacity and shouldn't really
> be in user's path. Any other applications that require elevated
> privileges to operate should go there as well.
>
> Spec files should only build for one arch at a time. If you need to
> build for another arch, your should rerun the build process using
> --target <arch> as needed. If you do this on, for example, x86_64 to
> build i386, you get a complete set of i386 and x86_64 rpms, from which
> you then install the x86_64 binaries, and both the x86_64 and i386 libs,
> and the x86_64 devel packages. At that point, it's all good. This
> implies that you will need to either A) control compilation from the %
> build script in the rpm spec file (what I do) or B) pass the
> RPM_BUILD_FLAGS to your build script and then pass them on to make so
> that things like -m32 get picked up and the binaries get built as the
> proper type. Likely you will also need to pass the build flags to the
> configure actions as well.
>
> The package should build the desired result without being passed any
> options (build systems often don't allow you to specify options to the
> build process, ours doesn't anyway, so it has to build "out of the
> box").
>
> Although you may choose to leave it in your spec, the Vendor: flag is
> really intended to be filled in by the people that build and distribute
> the binaries. As such, Red Hat's build system doesn't allow it to exist
> because it adds it automatically.
>
> This is a big one, and it's going to cause upgrade path issues. You
> guys *CAN'T* define a static release on all the packages inside
> openib.spec independently. Furthermore, you *CAN'T* have different
> builds floating around with the same release numbers. You have zero
> ability to track what actual code someone is running right now with the
> way you are doing package n-v-r numbers in your spec file. If someone
> builds the openib rpm, then changes the spec file and adds a patch, and
> rebuilds the rpm, all the subpackages come out with the same name as the
> first build, even if you passed an updated version option to rpmbuild.
> That's ultra broken.
>
> Your selected release numbering on some packages will break upgrades
> later. For example, the release on openib-diags is rc2. If you are
> going to put any rc or pre or beta tags in the release, it should always
> be as 0.<rctag>.<buildnumber>. That way the first build would be
> 0.rc2.1 and if you change it later and rebuild it would be 0.rc2.2 and
> when you have a final release instead of a release candidate, you
> eliminate the tag and update the release number to 1 and that preserves
> the proper rpm upgrade behavior across all the different build variants.
> Without doing this, when you finally release the final version of the
> code, rpm won't upgrade to it because the match between rc2 and 1 won't
> compare the way you might expect.
>
> Since you have a single, large source rpm that builds lots of seemingly
> unrelated subrpms when looking just at names, I would recommend you use
> the openib name in more of the subpackage names. For instance,
> openib-perftest is more descriptive than just perftest, and less likely
> to cause people to install it only to go "Oh, I don't have infiniband
> hardware, never mind". Likewise with tvflash, mstflint, and several
> others.
>
> Multiple rpms are missing proper Requires: definitions. These can be
> caught by working with clean machine installs. For example, if you
> install the opensm-devel package in order to build ibutils, you'll find
> that the opensm headers directly include headers from libibcommon-devel
> and libibumad-devel without bothering to specify a Requires:
> libibcommon-devel, libibumad-devel in the opensm-devel package
> definition.
>
> When splitting binaries out from libraries (needed for multilib setups),
> either the libs should require the binaries or the binaries the libs.
> If there is no reason to have the libs without the binaries, then it's
> typical to have the base package (say opensm) use a Requires:
> opensm-libs = %{version}-%{release}. This is to avoid upgrade skew
> between libraries and associated binaries. The same is done for -devel
> packages so that the devel package and the libs are always in sync. So,
> as an example, this is the opensm definition in the spec file I used:
>
> %package -n opensm
> Version: 1.2.0
> Summary: InfiniBand subnet manager and administration libraries
> Requires: opensm-libs = %{version}-%{release}
> Group: System Environment/Libraries
> %description -n opensm
> OpenSM provides an implementation for an InfiniBand Subnet Manager and
> Administration. Such a software entity is required to run for in order
> to initialize the InfiniBand hardware (at least one per each
> InfiniBand subnet).
>
> %package -n opensm-libs
> Version: 1.2.0
> Summary: InfiniBand subnet manager and administration libraries
> Requires: openib
> Group: System Environment/Libraries
> %description -n opensm-libs
> OpenSM provides an implementation for an InfiniBand Subnet Manager and
> Administration. Such a software entity is required to run for in order
> to initialize the InfiniBand hardware (at least one per each
> InfiniBand subnet).
>
> %package -n opensm-devel
> Version: 1.2.0
> Summary: Development files for OpenSM
> Group: System Environment/Libraries
> Requires: opensm-libs = %{version}-%{release}, libibcommon-devel,
> libibumad-devel
> %description -n opensm-devel
> Static libraries and header files for OpenSM.
>
> Note: this is a perfect example of the sort of thing that is totally
> broken by using static Release: numbers in the spec file, you could
> easily get a devel package and libs using different actual ABIs that
> way.
>
> The spec file should have functioning %prep, %build, and %install
> procedures that properly separate these three distinct steps.
>
> You have multiple config files spread in different places in /etc. For
> our stuff, I was asked to clean that up, so they all got consolidated
> under /etc/ofed so that the naming would be equally applicable to all
> the component files that were there.
>
> librdmacm and libibcm both used non-versioned library file names. Since
> that was flagged, I used the function version information in the
> respective .map files to version the actual library files on the
> filesystem, so libibcm is -version-info 4:0:0 and librdmacm is
> -version-info 1:0:0.
>
> As an example of why unversioned filenames for libraries is *bad*, the
> ofed-1.0 distribution builds a libibcm.so and relies upon the
> information in libibcm.map to version the symbols for the linker. But,
> due to this typo in libibcm/Makefile.am (which still exists in 1.1rc2):
>
> if HAVE_LD_VERSION_SCRIPT
> ibcm_version_script = -Wl,--version-script=$(srcdir)/src/libibcm.map
> ^^^^
> else
>
> ...
>
> src_libibcm_la_LDFLAGS = -avoid-version $(ucm_version_script)
> ^^^
>
> (someone made a cut and past error I think) you now have a libibcm.so
> floating around with absolutely 0 symbol versioning and no filename
> versioning as well. Bad.
>
> As an example of why having a monolithic install script that doesn't
> split things into steps, the openib tarball preran aclocal and automake
> in multiple of the user space library directories (actually, all but
> libehca I think, so I special cased that one) meaning that patching the
> Makefile.am above won't correct the final problem because changes to the
> Makefile.am file are ignored. So, what you have to do in order to fix
> the bug without respinning the tarball (I'm supposed to keep a clean
> upstream tarball, so I shouldn't be respinning it myself) is to run the
> top level configure and then patch up the resulting Makefile instead.
> Rpm is already designed to handle this type of thing and does it well,
> versus using the monolithic install script each and every occurrence of
> anything like this is a special case edit that has to be patched into
> the script.
>
> When adhering the the FHS, there is no appropriate place to put the
> uninstall.sh script (although, more importantly, you should be using rpm
> to control the install/uninstall of the files and this shouldn't be on
> the fs at all).
>
> libsysfs-2.0 and up does away with sysfs_read_attribute_value(), so if
> you are going to continue to use sysfs on later installations, you need
> to update code to use sysfs_open_attribute/sysfs_read_attribute instead
> which exists both on 1.x and 2.0 applications.
>
> As a side note: if you are going to make different kernels put a sysfs
> attribute in different places (reference the librdmacm discussion on
> list recently), you need to make the library check *both* places, not
> just one. You don't want the library tied to a specific kernel hack and
> vice versa. The library should work regardless of which kernel variant
> it's on. A better solution though would be to only have the file in one
> place, and if it isn't there because of an older kernel, assume V1 and
> continue since the later V2 stuff is in kernels that have the class/misc
> available to the rdmacm kernel code, generally speaking.
>
> OK, that pretty much covers it. I know this was long. It took me 3
> days of back and forth with the package reviewer to get the initial
> openib-1.0-3.el5 package approved. FWIW, ibutils and openmpi breezed
> through in comparison with mainly just making them FHS compliant.
>
> --
> Doug Ledford <dledford at redhat.com>
> GPG KeyID: CFBFF194
> http://people.redhat.com/dledford
>
> Infiniband specific RPMs available at
> http://people.redhat.com/dledford/Infiniband
> _______________________________________________
> openfabrics-ewg mailing list
> openfabrics-ewg at openib.org
> http://openib.org/mailman/listinfo/openfabrics-ewg
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
More information about the ewg
mailing list