[openib-general] [OOPS] user-mode verbs

Woodruff, Robert J robert.j.woodruff at intel.com
Wed May 4 16:11:09 PDT 2005


 Hi Roland,

I have started to stress test the user-mode verbs code using
uDAPL and Intel MPI, running multiple copies of the Pallas MPI
benchmark simultaneously.

This is on a RedHat EL4.0 2.6.9ELsmp kernel, 
SVN 2245 backported to the RedHat kernel,
EM64T 2.4 GHz Xeon boxes
PCI-E HCAs

Note that last night I ran the code on a straight 2.6.9 kernel from
kernel.org with the InfiniBand backport, but in UP mode, and it ran
all night without any issues.

woody


Here are two separate oops traces.

general protection fault: 0000 [1] SMP 
CPU 0 
Modules linked in: nfsd(U) exportfs(U) lockd(U) det(U) ib_uverbs(U)
ib_sdp(U) ib_cm(U) ib_ipoib(U) ib_sa(U) md5(U) ipv6(U) parport_pc(U)
lp(U) parport(U) autofs4(U) i2c_dev(U) i2c_core(U) sunrpc(U) dm_mod(U)
button(U) battery(U) ac(U) uhci_hcd(U) ehci_hcd(U) hw_random(U)
ib_mthca(U) ib_mad(U) ib_core(U) e1000(U) floppy(U) ext3(U) jbd(U)
Pid: 4942, comm: PMB-MPI1 Tainted: P      2.6.9-prep
RIP: 0010:[<ffffffffa01ab199>]
<ffffffffa01ab199>{:ib_uverbs:ib_uverbs_destroy_qp+187}
RSP: 0018:000001012ea45ea8  EFLAGS: 00010046
RAX: 0000000000000010 RBX: 0000010131678400 RCX: 000001000000e000
RDX: 37de000000000000 RSI: 37de000000000010 RDI: 00000101382ddf38
RBP: 000001013218ca80 R08: 000001012eca5668 R09: 0000000000000246
R10: 000000000000ea60 R11: 0000000000000246 R12: ffffffffa01af100
R13: 0000000000000000 R14: 000000000000000c R15: 0000000000000000
FS:  0000002a959015a0(0000) GS:ffffffff804bf580(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a95b4f002 CR3: 0000000000101000 CR4: 00000000000006e0
Process PMB-MPI1 (pid: 4942, threadinfo 000001012ea44000, task
00000101303a37f0)
Stack: 000000030000001d 000000000000001d 000000000000000c
0000007fbffff190 
       000001013218ca80 000000000000000c 0000007fbffff190
ffffffffa01a957d 
       0000000000000000 000000030000001d 
Call Trace:<ffffffffa01a957d>{:ib_uverbs:ib_uverbs_write+137} 
       <ffffffff80172088>{vfs_write+207}
<ffffffff80172170>{sys_write+69} 
       <ffffffff8010ffd2>{system_call+126} 

Code: 48 8b 42 10 48 8b 4e 08 48 89 48 08 48 89 01 48 c7 46 08 00 
RIP <ffffffffa01ab199>{:ib_uverbs:ib_uverbs_destroy_qp+187} RSP
<000001012ea45ea8>
 

And the second one:

errfatal[7715]: segfault at 0000000000000008 rip 0000002a955c9e30 rsp
0000007fbffff530 error 4
Unable to handle kernel NULL pointer dereference at 0000000000000028
RIP: 
<ffffffffa01ab372>{:ib_uverbs:__ib_umem_unmark+5}
PML4 12d90f067 PGD 12f247067 PMD 0 
Oops: 0000 [1] SMP 
CPU 0 
Modules linked in: nfsd(U) exportfs(U) lockd(U) det(U) ib_uverbs(U)
ib_sdp(U) ib_cm(U) ib_ipoib(U) ib_sa(U) md5(U) ipv6(U) parport_pc(U)
lp(U) parport(U) autofs4(U) i2c_dev(U) i2c_core(U) sunrpc(U) dm_mod(U)
button(U) battery(U) ac(U) uhci_hcd(U) ehci_hcd(U) hw_random(U)
ib_mthca(U) ib_mad(U) ib_core(U) e1000(U) floppy(U) ext3(U) jbd(U)
Pid: 11522, comm: PMB-MPI1 Tainted: P      2.6.9-prep
RIP: 0010:[<ffffffffa01ab372>]
<ffffffffa01ab372>{:ib_uverbs:__ib_umem_unmark+5}
RSP: 0018:000001012397be68  EFLAGS: 00010216
RAX: 000001013db82940 RBX: 000001013db82998 RCX: 000001012b3abd10
RDX: 0000000000000000 RSI: 000001013db82940 RDI: 0000000000000028
RBP: 000001013db82940 R08: 000001012bbda2c8 R09: 000000000000000a
R10: 000000000000ea60 R11: 000001012397be6f R12: 0000000000000028
R13: 000001013d9a2000 R14: 000000000000000c R15: 0000000000000018
FS:  0000002a959015a0(0000) GS:ffffffff804bf580(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 0000000000101000 CR4: 00000000000006e0
Process PMB-MPI1 (pid: 11522, threadinfo 000001012397a000, task
00000101212df7f0)
Stack: 000001012bbda2c8 ffffffffa01ab6a1 000001013fca98c0
0000000000000000 
       000001013fca98c0 ffffffffa01af100 0000000000000000
ffffffffa01aa747 
       000001013fca98c0 000000d03fca98c0 
Call Trace:<ffffffffa01ab6a1>{:ib_uverbs:ib_umem_release+77}
<ffffffffa01aa747>{:ib_uverbs:ib_uverbs_dereg_mr+263} 
       <ffffffffa01a957d>{:ib_uverbs:ib_uverbs_write+137} 
       <ffffffff80172088>{vfs_write+207}
<ffffffff80172170>{sys_write+69} 
       <ffffffff8010ffd2>{system_call+126} 

Code: 48 8b 37 48 89 c7 e8 6b 8b fb df 5e c3 41 57 49 89 d7 41 56 
RIP <ffffffffa01ab372>{:ib_uverbs:__ib_umem_unmark+5} RSP
<000001012397be68>
CR2: 0000000000000028
 


