[ofa-general] problem with rdma_ucm in OpenSuSE 10.2 default kernel
Joe Landman
landman at scalableinformatics.com
Sun Jul 8 08:37:30 PDT 2007
After getting it to build correctly, installing it, and configuring it,
I am getting a crash in rdma_ucm. Also, for some reason, there is a
dependency upon ipv6.ko which depmod doesn't pick up. The latter is
easily solvable, but the former is troubling. Here is the snippet from
the messages file:
> Jul 8 11:08:30 jackrabbit kernel: ----------- [cut here ] --------- [please bite here ] ---------
> Jul 8 11:08:30 jackrabbit kernel: Kernel BUG at fs/sysfs/file.c:473
> Jul 8 11:08:30 jackrabbit kernel: invalid opcode: 0000 [1] SMP
> Jul 8 11:08:30 jackrabbit kernel: last sysfs file: /class/net/ib0/mode
> Jul 8 11:08:30 jackrabbit kernel: CPU 3
> Jul 8 11:08:30 jackrabbit kernel: Modules linked in: rdma_ucm ib_sdp rdma_cm iw_cm ib_addr ib_local_sa ib_ipoib ipv6 snd_pcm_oss
> snd_mixer_oss ib_uverbs snd_seq ib_umad snd_seq_device ib_cm ib_sa cpufreq_conservative cpufreq_ondemand cpufreq_userspace
> cpufreq_powersave powernow_k8 freq_table button battery ac ipmi_si ipmi_devintf ipmi_msghandler apparmor aamatch_pcre ext3 jbd mbcache loop
> dm_mod usbhid usb_storage snd_hda_intel snd_hda_codec snd_pcm snd_timer ib_mthca snd shpchp ehci_hcd ib_mad ohci_hcd ohci1394
> ib_core soundcore pci_hotplug ide_cd i2c_nforce2 ieee1394 forcedeth cdrom snd_page_alloc usbcore i2c_core xfs edd fan sg arcmsr sata_nv
> libata amd74xx thermal processor sd_mod scsi_mod ide_disk ide_core
> Jul 8 11:08:30 jackrabbit kernel: Pid: 5464, comm: modprobe Tainted: G U 2.6.18.2-34-default #1
> Jul 8 11:08:30 jackrabbit kernel: RIP: 0010:[<ffffffff802eaeb1>] [<ffffffff802eaeb1>] sysfs_create_file+0x19/0x31
> Jul 8 11:08:30 jackrabbit kernel: RSP: 0000:ffff81042171de50 EFLAGS: 00010202
> Jul 8 11:08:30 jackrabbit kernel: RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff803eddf8
> Jul 8 11:08:30 jackrabbit kernel: RDX: 0000000000000000 RSI: ffffffff8856d720 RDI: ffff8104274f3810
> Jul 8 11:08:30 jackrabbit kernel: RBP: ffff810423e8c000 R08: ffffffff804d83b8 R09: ffff810424bb7b80
> Jul 8 11:08:30 jackrabbit kernel: R10: 0000000000000022 R11: ffff810424bb7b80 R12: ffff810423e8c5c0
> Jul 8 11:08:30 jackrabbit kernel: R13: ffffffff8856d900 R14: ffff810423e8c558 R15: ffffc20000a87e48
> Jul 8 11:08:30 jackrabbit kernel: FS: 00002b5c9772f6f0(0000) GS:ffff810428f7a9c0(0000) knlGS:0000000000000000
> Jul 8 11:08:30 jackrabbit kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jul 8 11:08:30 jackrabbit kernel: CR2: 000000000062f007 CR3: 0000000226d4a000 CR4: 00000000000006e0
> Jul 8 11:08:30 jackrabbit kernel: Process modprobe (pid: 5464, threadinfo ffff81042171c000, task ffff8104288e3830)
> Jul 8 11:08:30 jackrabbit kernel: Stack: ffffffff881a1026 ffffffff8856d900 ffffffff80299bcc 0000000000000019
> Jul 8 11:08:30 jackrabbit kernel: 0000000000000000 000000002171de78 0000000000000000 0000000000000000
> Jul 8 11:08:30 jackrabbit kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> Jul 8 11:08:30 jackrabbit kernel: Call Trace:
> Jul 8 11:08:30 jackrabbit kernel: [<ffffffff881a1026>] :rdma_ucm:ucma_init+0x26/0x4a
> Jul 8 11:08:30 jackrabbit kernel: [<ffffffff80299bcc>] sys_init_module+0x172f/0x18e5
> Jul 8 11:08:30 jackrabbit kernel: [<ffffffff8025800e>] system_call+0x7e/0x83
> Jul 8 11:08:30 jackrabbit kernel:
> Jul 8 11:08:30 jackrabbit kernel:
> Jul 8 11:08:30 jackrabbit kernel: Code: 0f 0b 68 b8 75 40 80 c2 d9 01 48 8b 7f 48 ba 04 00 00 00 e9
> Jul 8 11:08:30 jackrabbit kernel: RIP [<ffffffff802eaeb1>] sysfs_create_file+0x19/0x31
> Jul 8 11:08:30 jackrabbit kernel: RSP <ffff81042171de50>
> Jul 8 11:08:30 jackrabbit kernel: <6>ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
> Jul 8 11:08:36 jackrabbit kernel: eth0: no IPv6 routers present
> Jul 8 11:08:40 jackrabbit kernel: ib0: no IPv6 routers present
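For anyone triaging a trace like the one above, here is a minimal Python sketch that pulls out the BUG location and the call chain. The function name summarize_oops and the regular expressions are illustrative assumptions, not part of any OFED or kernel tooling, and they are only fitted to the 2.6.18-style format shown in this log.

```python
import re

# A few lines copied from the log above, with the syslog prefix kept.
LOG = """\
Jul 8 11:08:30 jackrabbit kernel: Kernel BUG at fs/sysfs/file.c:473
Jul 8 11:08:30 jackrabbit kernel: Call Trace:
Jul 8 11:08:30 jackrabbit kernel: [<ffffffff881a1026>] :rdma_ucm:ucma_init+0x26/0x4a
Jul 8 11:08:30 jackrabbit kernel: [<ffffffff80299bcc>] sys_init_module+0x172f/0x18e5
Jul 8 11:08:30 jackrabbit kernel: [<ffffffff8025800e>] system_call+0x7e/0x83
Jul 8 11:08:30 jackrabbit kernel:
"""

# "Kernel BUG at <file>:<line>"
BUG_RE = re.compile(r"Kernel BUG at (\S+):(\d+)")
# "[<addr>] :module:symbol+0xoff/0xlen"; the ":module:" part is optional.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return ((file, line) or None, [(module, symbol), ...]) for one oops."""
    bug = None
    frames = []
    in_trace = False
    for line in text.splitlines():
        match = BUG_RE.search(line)
        if match:
            bug = (match.group(1), int(match.group(2)))
            continue
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(line)
            if match:
                # Frames without a module prefix belong to the core kernel.
                frames.append((match.group(2) or "kernel", match.group(3)))
            else:
                # The first non-frame line ends the call trace.
                in_trace = False
    return bug, frames


if __name__ == "__main__":
    bug, frames = summarize_oops(LOG)
    print("BUG at:", bug)
    for module, symbol in frames:
        print(f"  {module}: {symbol}")
```

On this log it reports the BUG at fs/sysfs/file.c line 473, with ucma_init in rdma_ucm as the innermost frame of the call trace, reached from sys_init_module while modprobe was loading the module.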
I bring up ipoib for testing (pinging) hosts, as well as to have some of the
ssh traffic cross it. Sometimes quite useful.
Is the above a known problem? Should I file a bug report? The tainted
kernel is likely due to the arcmsr driver, though it is open source, so
I am not sure what is "tainted" about it.
Joe
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615