[ofa-general] IB function calls in kernel module fail

neutron neutronsharc at gmail.com
Sun Feb 15 14:40:36 PST 2009


Hi all,

I'm writing a kernel module that uses basic IB verbs to communicate,
such as:
ib_register_client,  ib_unregister_client,  ib_alloc_pd,
ib_create_qp,  ib_reg_phys_mr,  etc.

The code compiles into a kernel module, ib_rdma_lat.ko, which is meant
to test RDMA write latency from inside the kernel.

But when I "insmod" it, I get these errors in /var/log/messages:

Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_unregister_client
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_unregister_client
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_create_cq
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_create_cq
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_reg_phys_mr
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_reg_phys_mr
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_dereg_mr
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_dereg_mr
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_register_client
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_register_client
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_destroy_cq
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_destroy_cq
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_query_port
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_query_port
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_alloc_pd
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_alloc_pd
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_create_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_create_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_modify_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_modify_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_destroy_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_destroy_qp
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: disagrees about version of symbol ib_dealloc_pd
Feb 15 16:33:28 wci11 kernel: ib_rdma_lat: Unknown symbol ib_dealloc_pd
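
For context, "disagrees about version of symbol" is the kernel's symbol
versioning check: the CRC the module recorded for an imported symbol at
build time differs from the CRC of the symbol the running kernel exports.
One way to narrow this down might be to compare the two CRC lists. The
sketch below is only an illustration, not part of the original module. It
assumes you have saved the output of `modprobe --dump-modversions
ib_rdma_lat.ko` and the Module.symvers file from the build of the ib_core
that is actually loaded (where that file lives on a given system is not
assumed here). The function names are hypothetical.

```python
import re

# Matches the log message printed when a module's recorded CRC for an
# imported symbol differs from the CRC of the exported symbol.
# \s+ also tolerates the line wrap between "of" and "symbol".
DISAGREES = re.compile(r"disagrees about version of\s+symbol\s+(\w+)")


def symbols_from_log(log_text):
    """Return the unique symbol names from 'disagrees about version' messages."""
    seen = []
    for name in DISAGREES.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def parse_crcs(listing):
    """Parse lines of the form '0x<crc> <symbol> [...]' into {symbol: crc}.

    Both `modprobe --dump-modversions` output and Module.symvers lines
    begin with these two whitespace-separated fields.
    """
    crcs = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            crcs[fields[1]] = int(fields[0], 16)
    return crcs


def mismatches(expected, exported):
    """Symbols whose CRC in `expected` differs from, or is absent in, `exported`."""
    return sorted(s for s, crc in expected.items() if exported.get(s) != crc)


if __name__ == "__main__":
    # Hypothetical CRC values, for illustration only.
    module_side = "0x11111111\tib_alloc_pd\n0x22222222\tib_create_qp\n"
    exported_side = (
        "0x11111111\tib_alloc_pd\tdrivers/infiniband/core/ib_core\n"
        "0x33333333\tib_create_qp\tdrivers/infiniband/core/ib_core\n"
    )
    print(mismatches(parse_crcs(module_side), parse_crcs(exported_side)))
```

If every symbol in the log shows up as a mismatch, that would suggest the
module was built against a different set of IB headers and symbol versions
than the ib_core that is loaded.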

I'm running RHEL 5.  I have rebooted the node many times, but that
didn't help at all.

[wci11-oib:~/dist_lock/ib_kernel]uname -a
Linux wci11-oib 2.6.18-53.1.14.el5 #1 SMP Tue Feb 19 07:18:46 EST 2008 x86_64 x86_64 x86_64 GNU/Linux


"ofed_info" is:
[wci11-oib:~/dist_lock/ib_kernel]/usr/bin/ofed_info
OFED-1.3.1
libibverbs:
git://git.openfabrics.org/ofed_1_3/libibverbs.git ofed_1_3
commit 40b771aa6a9c0ad092b2e20775b4723d3b173792
libmthca:
git://git.openfabrics.org/ofed_1_3/libmthca.git ofed_1_3
commit 9501e698d257949acfab2edc90812602966dbcc9
libmlx4:
git://git.openfabrics.org/ofed_1_3/libmlx4.git ofed_1_3
......


I'm pretty sure all IB modules are loaded already:
[wci11-oib:~/dist_lock/ib_kernel]lsmod | grep ib
ib_sdp                125020  0
rdma_cm                67348  2 rdma_ucm,ib_sdp
ib_addr                41992  1 rdma_cm
ib_ipoib              113248  0
ib_cm                  67368  3 qlgc_vnic,rdma_cm,ib_ipoib
ib_sa                  74632  4 qlgc_vnic,rdma_cm,ib_ipoib,ib_cm
ib_uverbs              75568  1 rdma_ucm
ib_umad                50600  0
ib_ipath              346316  0
mlx4_ib                95932  0
mlx4_core             109008  1 mlx4_ib
ib_mthca              159044  0
ib_mad                 70948  5 ib_cm,ib_sa,ib_umad,mlx4_ib,ib_mthca
ib_core                97664  15 rdma_ucm,qlgc_vnic,ib_sdp,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,iw_cxgb3,ib_ipath,mlx4_ib,ib_mthca,ib_mad
libiscsi               61952  1 iscsi_tcp
scsi_transport_iscsi    67344  3 iscsi_tcp,libiscsi
ipoib_helper           35728  2 ib_ipoib
ipv6                  411425  43 ib_ipoib
libata                160849  1 ata_piix
scsi_mod              186361  6 iscsi_tcp,libiscsi,scsi_transport_iscsi,sg,libata,sd_mod


"service openibd status" reports the status is OK:
[wci11-oib:~/dist_lock/ib_kernel]sudo service openibd status

  HCA driver loaded

Configured devices:
ib0 ib1 ib2 ib3

Currently active devices:
ib0
ib2

The following OFED modules are loaded:

  rdma_ucm
  qlgc_vnic
  ib_sdp
  rdma_cm
  ib_addr
  ib_ipoib
  ib_ipath
  mlx4_core
  mlx4_ib
  ib_mthca
  ib_uverbs
  ib_umad
  ib_sa
  ib_cm
  ib_mad
  ib_core
  iw_cxgb3


I have no idea what's going on.  Any suggestions?


