[openib-general] problems with lustre o2ib module & ofed

Thierry Delaitre delaitt at cpc.wmin.ac.uk
Mon Sep 25 04:49:56 PDT 2006


It seems that Lustre installs its modules under
/lib/modules/2.6.16.21-0.8-default even though my running kernel is
2.6.16.21-0.8-smp!

uname -a
Linux n32 2.6.16.21-0.8-smp #4 SMP Sun Sep 24 08:47:30 BST 2006 i686 i686 i386 GNU/Linux

make[3]: Nothing to be done for `install-exec-am'.
/bin/sh ../../mkinstalldirs /lib/modules/2.6.16.21-0.8-default/kernel/fs/lustre
 /usr/bin/install -c -m 644 lquota.ko /lib/modules/2.6.16.21-0.8-default/kernel/fs/lustre/lquota

I therefore end up with both a /lib/modules/2.6.16.21-0.8-smp and a
/lib/modules/2.6.16.21-0.8-default directory.

I'm now trying to find out why Lustre thinks my kernel is
2.6.16.21-0.8-default and not 2.6.16.21-0.8-smp.
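
One thing worth checking is whether the kernel source tree that Lustre
builds against (config.log shows make running with -C /usr/src/linux)
carries a different release string from the running kernel. The following
is only a minimal Python sketch of that check, not anything taken from
Lustre's build system. It assumes the tree is at /usr/src/linux and that
UTS_RELEASE is defined in include/linux/version.h, which is where
2.6.16-era trees are expected to keep it (later kernels use utsrelease.h).
The function names are made up for illustration.

```python
import os
import re

# Assumption: header locations that may define UTS_RELEASE, relative to the
# kernel source tree (version.h on 2.6.16-era trees, utsrelease.h later).
HEADERS = ("include/linux/version.h", "include/linux/utsrelease.h")


def uts_release(header_text):
    """Return the UTS_RELEASE string defined in header_text, or None."""
    m = re.search(r'^\s*#\s*define\s+UTS_RELEASE\s+"([^"]+)"',
                  header_text, re.M)
    return m.group(1) if m else None


def tree_release(tree):
    """Search the known header locations under tree for UTS_RELEASE."""
    for rel in HEADERS:
        try:
            with open(os.path.join(tree, rel)) as fh:
                found = uts_release(fh.read())
        except OSError:
            continue  # header missing or unreadable, try the next one
        if found:
            return found
    return None


if __name__ == "__main__":
    running = os.uname().release            # same string as `uname -r`
    built_for = tree_release("/usr/src/linux")
    print("running kernel:", running)
    print("source tree   :", built_for)
    if built_for and built_for != running:
        print("mismatch: modules would install under /lib/modules/" + built_for)
```

If the two strings differ, the tree in /usr/src/linux is probably
configured for a different kernel flavour than the one that is booted,
which would explain both the install directory and modules built against
the wrong symbol versions.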

Thierry.

On Mon, 25 Sep 2006, Thierry Delaitre wrote:

>
> On Mon, 25 Sep 2006, Michael S. Tsirkin wrote:
>
> > Quoting r. Thierry Delaitre <delaitt at cpc.wmin.ac.uk>:
> > >
> > > I've set the o2ib path to /usr/local/ofed/src/openib-1.1, as shown in the
> > > Lustre configure line below. Lustre's configure script looks for a
> > > drivers/infiniband directory, which only seems to exist under
> > > /usr/local/ofed/src/openib-1.1.
> > >
> > > ./configure --with-o2ib=/usr/local/ofed/src/openib-1.1/
> > >
> > > Thierry.
> > >
> > > > replace /usr/local/ofed with the prefix you specified.
> >
> > This looks wrong - openib-1.1 is the pristine sources.
> > openib/include is the exported interface and is what you should use
> > for dependent modules.
> > No idea why Lustre would need drivers/infiniband.
> > Try creating a softlink:
> >
> > mkdir /usr/local/ofed/src/openib/drivers/infiniband
> > ln -s /usr/local/ofed/src/openib/include /usr/local/ofed/src/openib/drivers/infiniband
>
> I untarred Lustre 1.5.95, built it with
> ./configure --with-o2ib=/usr/local/ofed/src/openib, then ran make install
> and depmod -a, but I still get the errors below.
>
> My modprobe.conf contains:
>
> options lnet ip2nets="o2ib0 161.74.83.[0-255]"
>
> lctl network up
> LNET configure error 100: Network is down
>
> ko2iblnd: disagrees about version of symbol ib_create_cq
> ko2iblnd: Unknown symbol ib_create_cq
> ko2iblnd: disagrees about version of symbol ib_dereg_mr
> ko2iblnd: Unknown symbol ib_dereg_mr
> ko2iblnd: disagrees about version of symbol ib_destroy_cq
> ko2iblnd: Unknown symbol ib_destroy_cq
> ko2iblnd: disagrees about version of symbol ib_get_dma_mr
> ko2iblnd: Unknown symbol ib_get_dma_mr
> ko2iblnd: disagrees about version of symbol ib_alloc_pd
> ko2iblnd: Unknown symbol ib_alloc_pd
> ko2iblnd: disagrees about version of symbol ib_modify_qp
> ko2iblnd: Unknown symbol ib_modify_qp
> ko2iblnd: disagrees about version of symbol ib_dealloc_pd
> ko2iblnd: Unknown symbol ib_dealloc_pd
> LustreError: 4177:0:(api-ni.c:1002:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
>
> lsmod | grep ib
> libcfs                103060  1 lnet
> ib_ucm                 19332  0
> ib_addr                10756  1 rdma_cm
> ib_cm                  31968  2 ib_ucm,rdma_cm
> ib_ipoib               48400  0
> ib_sa                  16652  3 rdma_cm,ib_cm,ib_ipoib
> ib_uverbs              38312  2 rdma_ucm,ib_ucm
> ib_umad                17968  0
> ib_mthca              116240  0
> ib_mad                 36116  4 ib_cm,ib_sa,ib_umad,ib_mthca
> ib_core                49024  9 ib_ucm,rdma_cm,ib_cm,ib_ipoib,ib_sa,ib_uverbs,ib_umad,ib_mthca,ib_mad
>
> nm /lib/modules/2.6.16.21-0.8-smp/kernel/drivers/infiniband/core/ib_core.ko | grep ib_alloc_pd
> d5dcb698 A __crc_ib_alloc_pd
> 0000001c r __kcrctab_ib_alloc_pd
> 0000006a r __kstrtab_ib_alloc_pd
> 00000038 r __ksymtab_ib_alloc_pd
> 00000c65 T ib_alloc_pd
>
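
The "disagrees about version of symbol" messages are what the module
loader prints when the CRC a module recorded for an imported symbol
differs from the CRC exported by the loaded kernel or module
(CONFIG_MODVERSIONS). The nm output above gives the exported side
(d5dcb698 for ib_alloc_pd). A rough Python sketch for comparing the two
sides follows. It assumes the CRCs that ko2iblnd.ko expects are obtained
with `modprobe --dump-modversions` and arrive as "0x<crc> <symbol>" lines;
the helper names are hypothetical.

```python
import re


def parse_crcs(nm_output):
    """Map symbol -> CRC from nm lines like 'd5dcb698 A __crc_ib_alloc_pd'.

    Leading '>' quote markers are tolerated so text pasted from a mail
    works unchanged.
    """
    crcs = {}
    for line in nm_output.splitlines():
        m = re.match(r"^[>\s]*([0-9a-fA-F]{8,16})\s+A\s+__crc_(\w+)\s*$", line)
        if m:
            crcs[m.group(2)] = int(m.group(1), 16)
    return crcs


def parse_modversions(dump):
    """Map symbol -> CRC from lines of the assumed form '0x<crc> <symbol>'."""
    crcs = {}
    for line in dump.splitlines():
        m = re.match(r"^\s*0x([0-9a-fA-F]+)\s+(\w+)", line)
        if m:
            crcs[m.group(2)] = int(m.group(1), 16)
    return crcs


def mismatches(exported, expected):
    """Return symbols present on both sides whose CRCs differ."""
    return sorted(s for s in expected
                  if s in exported and exported[s] != expected[s])


if __name__ == "__main__":
    # Exported side, copied from the nm output quoted in this thread.
    exported = parse_crcs("d5dcb698 A __crc_ib_alloc_pd")
    for name, crc in exported.items():
        print(name, hex(crc))
```

Any symbol reported by mismatches() was compiled against headers or a
kernel configuration that differ from the ib_core.ko actually loaded.
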
> from lustre's config.log:
>
> configure:6500: checking whether to enable OpenIB gen2 support
> configure:6586: cp conftest.c build && make modules CC=gcc -f /root/lustre-1.5.95/build/Makefile LUSTRE_LINUX_CONFIG=/usr/src/linux/.config -o tmp_include_depends -o scripts -o include/config/MARKER -C /usr/src/linux EXTRA_CFLAGS=-Werror-implicit-function-declaration -g -I/root/lustre-1.5.95/lnet/include -I/root/lustre-1.5.95/lustre/include -I/usr/local/ofed/src/openib/include  M=/root/lustre-1.5.95/build
> /root/lustre-1.5.95/build/conftest.c:42: warning: function declaration isn't a prototype
> /root/lustre-1.5.95/build/conftest.c: In function 'main':
> /root/lustre-1.5.95/build/conftest.c:49: warning: unused variable 'rej_reason'
> /root/lustre-1.5.95/build/conftest.c:48: warning: unused variable 'pool_fmr'
> /root/lustre-1.5.95/build/conftest.c:47: warning: unused variable 'qp_attr'
> /root/lustre-1.5.95/build/conftest.c:46: warning: unused variable 'device_attr'
> /root/lustre-1.5.95/build/conftest.c:45: warning: unused variable 'conn_param'
> WARNING: "rdma_create_id" [/root/lustre-1.5.95/build/conftest.ko] undefined!
> configure:6589: $? = 0
> configure:6591: test -s build/conftest.o
> configure:6594: $? = 0
> configure:6597: result: yes
>
>
> Thierry.
>

----------------------------------------
Dr Thierry DELAITRE
Systems and Services Manager, CSCS
University of Westminster
115 New Cavendish Street, London W1W 6UW

Tel: 020 7911 5000 ext: 3586
Fax: 020 7911 5089
Mobile short dial code 1788

http://www.cscs.wmin.ac.uk/~delaitt
----------------------------------------




