[openib-general] [Bug 263] New: OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop
bugzilla-daemon at openib.org
Tue Oct 3 22:47:54 PDT 2006
http://openib.org/bugzilla/show_bug.cgi?id=263
Summary: OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop
Product: OpenFabrics Linux
Version: 1.1rc6
Platform: X86-64
OS/Version: SLES 10
Status: NEW
Severity: major
Priority: P2
Component: IPoIB
AssignedTo: bugzilla at openib.org
ReportedBy: sweitzen at cisco.com
SLES10 x86_64 with dual-port LionCub HCA.
I am looping a script that turns IB ports on a Cisco IB switch off and back on,
so that IPoIB fails over on a host every 30 seconds, while IPoIB traffic is
running on that host.
If I fail back and forth between ib0 and ib1 every 30 seconds or so for several
hours while IPoIB traffic is running, the IPoIB host gets an Oops and IPoIB
stops working.
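For anyone trying to reproduce this, the failover loop described above might look roughly like the sketch below. It is not the script used in this report: `set_port` is a hypothetical callback standing in for whatever switch command enables or disables the port behind ib0 or ib1. The ordering (restore the standby link, then drop the active one) and the assumption that both ports start out up are also guesses.

```python
import itertools
import time


def failover_loop(set_port, ports=("ib0", "ib1"), interval=30.0,
                  cycles=None, sleep=time.sleep):
    """Alternate the active IPoIB port every `interval` seconds.

    set_port(port, up) is a hypothetical callback that enables (up=True)
    or disables (up=False) the switch port cabled to `port`.
    Both ports are assumed to be up when the loop starts.
    """
    order = itertools.cycle(ports)
    active = next(order)
    rounds = itertools.count() if cycles is None else range(cycles)
    for _ in rounds:
        standby = next(order)
        set_port(standby, True)    # make sure the standby link is back up
        set_port(active, False)    # drop the active link to force failover
        active = standby
        sleep(interval)            # let traffic run on the new active port
```

Passing `cycles=None` runs the loop until interrupted, which matches the "several hours" needed to hit the Oops; IPoIB traffic has to be generated separately on the host.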
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib1: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
general protection fault: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 7
Modules linked in: af_packet ib_sdp rdma_ucm rdma_cm ib_addr ib_cm ib_ipoib
ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core nls_utf8 st ipv6 nfs lockd
nfs_acl sunrpc button battery ac apparmor aamatch_pcre loop usbhid dm_mod
hw_random ide_cd ehci_hcd uhci_hcd cdrom i8xx_tco ide_floppy usbcore shpchp
e1000 pci_hotplug floppy reiserfs edd fan thermal processor siimage sg mptspi
mptscsih mptbase scsi_transport_spi piix sd_mod scsi_mod ide_disk ide_core
Pid: 23541, comm: ib_mad1 Tainted: G U 2.6.16.21-0.8-smp #1
RIP: 0010:[<ffffffff802cffea>] <ffffffff802cffea>{_spin_lock_irqsave+3}
RSP: 0018:ffff810132a4fc20 EFLAGS: 00010086
RAX: 0000000000000286 RBX: 0000000000000000 RCX: ffffffff883324ee
RDX: ffff810128d5e380 RSI: 0000000000000000 RDI: 0000ffff1b6017ff
RBP: 00000000fffffffc R08: ffffffff803d3260 R09: ffff810140333800
R10: ffff81000107d400 R11: 0000000000000292 R12: ffff810128d5e380
R13: ffff810132a4fc78 R14: 0000ffff1b6017ff R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff810142d19740(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b0b5e6ae180 CR3: 0000000128cbc000 CR4: 00000000000006e0
Process ib_mad1 (pid: 23541, threadinfo ffff810132a4e000, task ffff810142b56100)
Stack: ffffffff8833c5f5 ffff8101302b3000 0000ffff1b6012ff 0000000000000002
0000000000000296 ffff8101302b3500 ffffffff8027753e ffff810128d5e3a0
ffff81012bce1680 ffff810128d5e380
Call Trace: <ffffffff8833c5f5>{:ib_ipoib:path_rec_completion+862}
<ffffffff8027753e>{dev_queue_xmit+545}
<ffffffff8833c5b2>{:ib_ipoib:path_rec_completion+795}
<ffffffff8833252e>{:ib_sa:ib_sa_path_rec_callback+64}
<ffffffff80138f17>{lock_timer_base+27}
<ffffffff80138f89>{try_to_del_timer_sync+81}
<ffffffff883322b3>{:ib_sa:send_handler+72}
<ffffffff8826762f>{:ib_mad:ib_mad_complete_send_wr+421}
<ffffffff88267f00>{:ib_mad:ib_mad_completion_handler+947}
<ffffffff88267b4d>{:ib_mad:ib_mad_completion_handler+0}
<ffffffff80140177>{run_workqueue+153}
<ffffffff8014081e>{worker_thread+0}
<ffffffff801437e5>{keventd_create_kthread+0}
<ffffffff80140927>{worker_thread+265}
<ffffffff8012787f>{__wake_up_common+62}
<ffffffff8012905a>{default_wake_function+0}
<ffffffff801437e5>{keventd_create_kthread+0}
<ffffffff80143aca>{kthread+236}
<ffffffff8010b60a>{child_rip+8}
<ffffffff801437e5>{keventd_create_kthread+0}
<ffffffff801439de>{kthread+0}
<ffffffff8010b602>{child_rip+0}
Code: f0 ff 0f 0f 88 29 01 00 00 c3 fa f0 ff 0f 0f 88 2a 01 00 00
RIP <ffffffff802cffea>{_spin_lock_irqsave+3} RSP <ffff810132a4fc20>
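One possible reading of the register dump, offered as a sketch and not a confirmed diagnosis: on x86-64 with 48-bit virtual addresses, an address is canonical only if bits 63..47 are all equal, and touching a non-canonical address raises a general protection fault, not a page fault. RDI normally carries the first function argument, so the check below asks whether the pointer handed to _spin_lock_irqsave looks valid. The register values are copied from the dump above; `is_canonical` and the 48-bit width are assumptions made for illustration.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits-1) of addr are all 0 or all 1."""
    top = addr >> (va_bits - 1)                # for 48-bit: bits 63..47
    all_ones = (1 << (64 - va_bits + 1)) - 1   # for 48-bit: 17 one-bits
    return top == 0 or top == all_ones


# Values copied verbatim from the Oops register dump.
REGS = {
    "RDX": 0xFFFF810128D5E380,
    "RDI": 0x0000FFFF1B6017FF,
    "R12": 0xFFFF810128D5E380,
    "R14": 0x0000FFFF1B6017FF,
    "RSP": 0xFFFF810132A4FC20,
}

# Names of registers holding non-canonical values, in dump order.
non_canonical = [name for name, value in REGS.items()
                 if not is_canonical(value)]
print(non_canonical)
```

RDI and R14 hold the same non-canonical value, which would be consistent with path_rec_completion passing a stale or corrupted pointer as the lock address; the report itself does not establish where that value came from.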
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.