[openib-general] segmentation fault with ibv_pingpong
Sayantan Sur
surs at cse.ohio-state.edu
Tue Apr 12 16:28:05 PDT 2005
Hi,
I am hitting a segmentation fault with the OpenIB Gen2 drivers when
running the `ibv_pingpong' test. The problem is described below. Can
someone point out what may be going wrong here? I have included as much
information as I thought would be needed; if anything more specific
would help, I can provide it.
Thanks,
Sayantan.
Hardware:
---------
Two Dual Intel Xeon EM64T 3.4 GHz nodes
PCI-Express I/O bus
MT25208 Mellanox HCAs (rev a0)
Software:
---------
RedHat AS 4
2.6.11.6/2.6.11.7 kernel with Gen2 InfiniBand drivers
Firmware version 5.0.1
OpenIB Gen2 drivers (user verbs from main branch)
OpenSM (both the OpenIB version and IBGD 1.7.0 give the same result)
Both machines show their ports as ACTIVE:
[surs at x1:~] cat /sys/class/infiniband/mthca0/ports/1/state
4: ACTIVE
[surs at x5:bin] lsmod | grep ib
ib_uverbs 28056 0
ib_umad 17696 0
ib_mthca 113952 0
ib_mad 38576 2 ib_umad,ib_mthca
ib_core 52352 4 ib_uverbs,ib_umad,ib_mthca,ib_mad
libata 53000 1 ata_piix
scsi_mod 151888 3 libata,aic79xx,sd_mod
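For checking the port state on several nodes without reading the sysfs file by hand, a minimal sketch along these lines may help. It only parses the "4: ACTIVE" format shown above; the helper names are made up for illustration, and the default device name and port number are simply the ones from this report.

```python
from pathlib import Path


def parse_port_state(text):
    """Split a sysfs state line such as '4: ACTIVE' into (4, 'ACTIVE')."""
    number, _, name = text.strip().partition(":")
    return int(number), name.strip()


def read_port_state(device="mthca0", port=1, root="/sys/class/infiniband"):
    """Read and parse <root>/<device>/ports/<port>/state.

    The defaults mirror the path used in this report; other HCAs will
    show up under a different device name.
    """
    path = Path(root) / device / "ports" / str(port) / "state"
    return parse_port_state(path.read_text())


if __name__ == "__main__":
    code, name = read_port_state()
    print(f"mthca0 port 1: {code} ({name})")
```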
Now, if I try to run ibv_pingpong, I get this error:
--->
[surs at x1:~] ibv_pingpong
Segmentation fault
[surs at x1:~]
Message from syslogd at x1 at Mon Apr 11 18:37:18 2005 ...
x1 kernel: invalid operand: 0000 [1] SMP
<---
The relevant part from the kernel log:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at pci_gart:537
invalid operand: 0000 [1] SMP
CPU 0
Modules linked in: ib_uverbs ib_umad ib_mthca ib_mad ib_core parport_pc
lp parport autofs4 nfs lockd sunrpc dm_mod video button battery ac md5
ipv6 uhci_hcd ehci_hcd hw_random i2c_i801 i2c_core e1000 floppy ext3 jbd
ata_piix libata aic79xx sd_mod scsi_mod
Pid: 4034, comm: ibv_pingpong Not tainted 2.6.11.6
RIP: 0010:[<ffffffff8011da86>] <ffffffff8011da86>{dma_map_sg+223}
RSP: 0018:ffff81001d4dfd58 EFLAGS: 00010246
RAX: 000000001b92b000 RBX: ffff81001764fbf8 RCX: 000000001b92b000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000ff0 R11: 0000000000000246 R12: ffff81001764f000
R13: ffff81001764fbf8 R14: 0000000000000001 R15: ffff81001f92c070
FS: 00002aaaaacca000(0000) GS:ffffffff804c6380(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000515000 CR3: 0000000013145000 CR4: 00000000000006e0
Process ibv_pingpong (pid: 4034, threadinfo ffff81001d4de000, task ffff81001dc641f0)
Stack: ffff81001dc641f0 0000000000000000 0000000100000000 ffff81001764fbd0
       ffff8100176f3000 ffff81001764f000 0000000000513000 0000000000000001
       ffff81001facfa40 ffffffff8824b621
Call Trace:<ffffffff8824b621>{:ib_mthca:mthca_map_user_db+366}
<ffffffff88249f54>{:ib_mthca:mthca_create_cq+115}
<ffffffff880f38f6>{:ib_uverbs:ib_uverbs_create_cq+165}
<ffffffff880f2608>{:ib_uverbs:ib_uverbs_write+139}
<ffffffff8017443b>{vfs_write+207}
<ffffffff8017454a>{sys_write+69}
<ffffffff8010e29e>{system_call+126}
Code: 0f 0b 0f 91 32 80 ff ff ff ff 19 02 89 f8 49 8b 97 08 01 00
RIP <ffffffff8011da86>{dma_map_sg+223} RSP <ffff81001d4dfd58>
----------------------------------------------------------------
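As a cross-check on the log above, the "Code:" bytes can be decoded by hand. The sketch below assumes that the x86_64 BUG() macro in kernels of this vintage emits a ud2 instruction (0f 0b) followed by an 8-byte pointer to the file name and a 16-bit line number; that layout is an assumption, not something stated in the log. Under that assumption the line number decodes to 537, which agrees with the "Kernel BUG at pci_gart:537" line, so the trap looks like a deliberate BUG() inside dma_map_sg rather than a stray bad instruction. The function name is hypothetical.

```python
import struct

# First bytes of the "Code:" line from the oops above.
CODE = "0f 0b 0f 91 32 80 ff ff ff ff 19 02 89 f8 49 8b 97 08 01 00"


def decode_bug_frame(code_hex):
    """Decode an assumed BUG() frame: ud2, 64-bit file pointer, 16-bit line.

    Returns (filename_pointer, line_number). The frame layout is an
    assumption about x86_64 kernels of this era.
    """
    raw = bytes.fromhex(code_hex)
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("code does not start with ud2 (0f 0b)")
    # Little-endian, no padding: 8-byte pointer, then 2-byte line number.
    filename_ptr, line = struct.unpack_from("<QH", raw, 2)
    return filename_ptr, line


if __name__ == "__main__":
    ptr, line = decode_bug_frame(CODE)
    print(f"file name pointer: {ptr:#x}, line: {line}")
```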
Now, if I run ibv_pingpong under gdb (sender side), I get it to
progress a little further (but not to completion). The receiver
now prints this:
<---
[surs at x5:examples] ibv_pingpong 192.168.107.2
local address: LID 0x0002, QPN 0x000404, PSN 0x104788
remote address: LID 0x0001, QPN 0x000404, PSN 0x08b81e
[ 0] 00000404
[ 4] b3000000
[ 8] fd000003
[ c] 110000c0
[10] 15810000
[14] 00000010
[18] 00008002
[1c] ff100000
Failed status 12 for wr_id 2
--->
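To put a name on "Failed status 12", here is a small lookup sketch. The table follows the order of enum ibv_wc_status as it appears in the libibverbs verbs.h header known to the editor; whether the 2005 main branch used exactly the same ordering is an assumption. With that ordering, status 12 is IBV_WC_RETRY_EXC_ERR, i.e. the transport retry counter was exceeded because the peer stopped acknowledging. The helper name is hypothetical.

```python
# Assumed to mirror enum ibv_wc_status from libibverbs <infiniband/verbs.h>;
# the index in this list is the numeric status value.
WC_STATUS_NAMES = [
    "IBV_WC_SUCCESS",
    "IBV_WC_LOC_LEN_ERR",
    "IBV_WC_LOC_QP_OP_ERR",
    "IBV_WC_LOC_EEC_OP_ERR",
    "IBV_WC_LOC_PROT_ERR",
    "IBV_WC_WR_FLUSH_ERR",
    "IBV_WC_MW_BIND_ERR",
    "IBV_WC_BAD_RESP_ERR",
    "IBV_WC_LOC_ACCESS_ERR",
    "IBV_WC_REM_INV_REQ_ERR",
    "IBV_WC_REM_ACCESS_ERR",
    "IBV_WC_REM_OP_ERR",
    "IBV_WC_RETRY_EXC_ERR",
    "IBV_WC_RNR_RETRY_EXC_ERR",
    "IBV_WC_LOC_RDD_VIOL_ERR",
    "IBV_WC_REM_INV_RD_REQ_ERR",
    "IBV_WC_REM_ABORT_ERR",
    "IBV_WC_INV_EECN_ERR",
    "IBV_WC_INV_EEC_STATE_ERR",
    "IBV_WC_FATAL_ERR",
    "IBV_WC_RESP_TIMEOUT_ERR",
    "IBV_WC_GENERAL_ERR",
]


def wc_status_name(status):
    """Return the symbolic name for a numeric completion status."""
    if 0 <= status < len(WC_STATUS_NAMES):
        return WC_STATUS_NAMES[status]
    return f"unknown status {status}"


if __name__ == "__main__":
    print(wc_status_name(12))
```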
--
---------------------------------------------------------
Sayantan Sur Graduate Research Assistant
395 Dreese Labs, Computer Science and Engineering
Ohio State University, Office : 774, Dreese Labs
Columbus, email : surs at cse.ohio-state.edu
Ohio - 43210. phone(res) : 614.688.9792
USA. phone(off) : 614.292.8501
---------------------------------------------------------