[ofa-general] Compiled IB packages

Christopher Tanner christopher.tanner at gatech.edu
Wed Sep 10 06:59:45 PDT 2008


Vladimir -

Good catch on the linux headers version - I fixed that now. The  
problem persisted after fixing the headers... but I finally figured  
out what the issues were. On the configure line:

a) the --kernel-sources option needs the path to the linux HEADERS  
(linux-headers-<ver>), not the linux SOURCE (linux-source-<ver>).  
Terminology there is confusing...

b) If I don't specify anything for the --modules-dir option, it
defaults to /lib/modules/2.6.24-16-server/updates. I don't know why
'updates' gets appended to the end, but that is not correct. So
I had to specify --modules-dir=/lib/modules/2.6.24-16-server
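
For reference, here is a minimal sketch of the configure line those two fixes lead to. The headers path is an assumption (the usual Ubuntu location for the linux-headers package matching the kernel version above), and the script only prints the command rather than running it. The --with-*-mod options would still need to be appended.

```shell
# Kernel the modules are built for; on the build host this would
# normally come from $(uname -r).
kver="2.6.24-16-server"

# Assumption: the headers package unpacks to /usr/src/linux-headers-<ver>.
headers_dir="/usr/src/linux-headers-${kver}"
modules_dir="/lib/modules/${kver}"

# Warn early if the matching headers package does not appear to be installed.
if [ ! -d "${headers_dir}" ]; then
    echo "warning: ${headers_dir} not found; install linux-headers-${kver}" >&2
fi

# Print the configure line instead of executing it.
configure_cmd="./configure --kernel-sources=${headers_dir} --modules-dir=${modules_dir}"
echo "${configure_cmd}"
```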

It compiled and installed just fine!

My final question - how do I install the kernel modules on the rest of  
the nodes? The source was compiled in the /home directory, which is  
shared to all nodes via NFS. However, the kernel headers are NOT  
shared to the rest of the nodes. Do you recommend I:

a) Install the linux headers on all of the nodes and execute 'make  
install' on all nodes
b) Look at where the modules installed to (from the make install  
output) and copy the files manually
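
Option (b) could look roughly like the sketch below. It assumes every node runs the same 2.6.24-16-server kernel, that the master can reach the nodes over ssh as root, and that rsync is installed everywhere. The hostnames node01 and node02 are placeholders. The function only prints the commands, so they can be reviewed before anything is copied.

```shell
# Print (rather than run) the commands that would copy the installed
# modules from the master node to each compute node and then rebuild
# the module dependency map on that node.
plan_module_copy() {
    kver="$1"
    shift
    for node in "$@"; do
        echo "rsync -a /lib/modules/${kver}/ ${node}:/lib/modules/${kver}/"
        echo "ssh ${node} depmod -a ${kver}"
    done
}

# node01 and node02 are placeholder hostnames.
plan_module_copy "2.6.24-16-server" node01 node02
```

Copying the whole /lib/modules/<ver> tree avoids having to pick individual files out of the make install output, at the cost of transferring modules the nodes already have.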

Thanks!

-------------------------------------------
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tanner at gatech.edu
-------------------------------------------



On Sep 10, 2008, at 3:28 AM, Vladimir Sokolovsky wrote:

> Hi,
> From the log file, I see the mismatch between the sources you are
> passing to the configure command and autoconf.h/auto.conf below:
>
> /usr/src/linux-headers-2.6.24-19-generic/include/linux/autoconf.h
> /usr/src/linux-headers-2.6.24-19-generic/include/config/auto.conf
>
> From the log file:
> 	Kernel version: 2.6.24-16-server
> 	Modules directory: //lib/modules/2.6.24-16-server/updates
> 	Kernel sources: /usr/src/linux-source-2.6.24
>
> Check that you have corresponding (matching the running kernel)
> linux-headers package installed and then you don't have to pass
> --kernel-sources and --kernel parameters to the configure script.
>
> E.g.
> for kernel 2.6.24-19-generic it is linux-headers-2.6.24-19-generic
>
> Regards,
> Vladimir
>
> On Tue, 2008-09-09 at 14:53 -0400, Christopher Tanner wrote:
>> Thanks Vladimir - very helpful. However, I'm running into a problem
>> with compiling the ofa package. First, I had to specify the source
>> location on the command line (Ubuntu puts it in a different place than
>> RedHat or SUSE):
>>
>> $ ./configure --kernel-sources=/usr/src/linux-source-2.6.24 ... (other stuff)
>>
>> I'm getting this error:
>>
>>   ERROR: Kernel configuration is invalid.
>>          include/linux/autoconf.h or include/config/auto.conf are missing.
>>          Run 'make oldconfig && make prepare' on kernel src to fix it.
>>
>> This is confusing b/c both of those files exist.
>> $ locate autoconf.h
>> /usr/src/linux-headers-2.6.24-19-generic/include/linux/autoconf.h
>>
>> $ locate auto.conf
>> /usr/src/linux-headers-2.6.24-19-generic/include/config/auto.conf
>>
>> There are a whole bunch more errors that I assume stem from this
>> initial error. The output from 'make' is attached (it's pretty long).
>> Let me know what you think. Thanks!
>>
>> -------------------------------------------
>> Chris Tanner
>> Space Systems Design Lab
>> Georgia Institute of Technology
>> christopher.tanner at gatech.edu
>> -------------------------------------------
>>
>>
>>
>> On Sep 9, 2008, at 11:03 AM, Vladimir Sokolovsky wrote:
>>
>>> Christopher Tanner wrote:
>>>> I am setting up a 16-node (homogeneous) cluster running Ubuntu 8.04
>>>> server with Mellanox Infiniband cards. I downloaded (from the
>>>> OpenFabrics website), compiled, and installed the following IB
>>>> packages on the master node into the /usr/local/lib directory. The
>>>> /usr/local directory is being shared to all of the nodes via NFS.
>>>> All packages seemed to compile and install fine.
>>>> libibverbs
>>>> librdmacm
>>>> libibcm
>>>> libipathverbs
>>>> dapl
>>>> compat-dapl
>>>> libmlx4
>>>> libmthca
>>>> libcxgb3
>>>> libibcommon
>>>> libibumad
>>>> libibmad
>>>> opensm
>>>> infiniband-diags
>>>> I have a few questions:
>>>> a) Do I need to run 'make install' on each node or just the master
>>>> node? All of the libraries in /usr/local/lib are visible to all
>>>> nodes... Stated another way, does 'make install' put files
>>>> elsewhere besides the /usr/local/lib directory? Does it alter OS
>>>> configuration files to tell it to look for certain files in
>>>> /usr/local/lib?
>>>
>>> No, all the packages above will put their files under /usr/local
>>>
>>>> b)  I know I need to load the IB kernel modules (mlx4_core,
>>>> mlx4_ib, rdma_ucm, ib_core, ib_mad, ib_mthca, ib_umad, ib_uverbs)
>>>> in order for the IB cards to work. Are these compiled and installed
>>>> with the above packages? How does the kernel know where to look
>>>> for modules? (Sorry, this question is very similar to the first one).
>>>
>>> The packages above are user space libraries/binaries. To install
>>> kernel modules you should download the latest version of the
>>> ofa_1_4_kernel tgz file from:
>>>
>>> http://www.openfabrics.org/downloads/ofa_1_4_kernel/
>>> To install, run:
>>> ./configure --with-core-mod --with-user_mad-mod --with-user_access-mod \
>>>     --with-addr_trans-mod --with-mthca-mod --with-mthca_debug-mod \
>>>     --with-mlx4-mod --with-mlx4_en-mod --with-mlx4_debug-mod \
>>>     --with-cxgb3-mod --with-ehca-mod --with-ipoib-mod --with-ipoib_debug-mod
>>> (... , see --help)
>>> make
>>> make install
>>>
>>>
>>>> c) The OFED software stack contains some stuff that isn't available
>>>> for source download (e.g. ib-bonding, ibsim, libsdp). Are these
>>>> necessary for the IB network to operate correctly? Since I'm
>>>> running Ubuntu, obviously the src.rpm file won't work...
>>>
>>> All OFED tgz files are available under:
>>> http://www.openfabrics.org/~vlad/ofed_1_4/SOURCES/
>>>
>>> The ib-bonding source RPM can be downloaded from the URL below (you
>>> can unpack it with cpio to get the tgz file, if needed):
>>> http://www.openfabrics.org/~monis/ofed_1_4/
>>>
>>> These packages are not necessary for the IB network to operate
>>> correctly, but it depends on what you are planning to do.
>>>
>>> Regards,
>>> Vladimir
>>>
>>>> Thanks to all for your help. Previous responses regarding issues
>>>> with OpenSM worked great.
>>>> -------------------------------------------
>>>> Chris Tanner
>>>> Space Systems Design Lab
>>>> Georgia Institute of Technology
>>>> christopher.tanner at gatech.edu
>>>> -------------------------------------------
>>



