[ofa-general] Re: InfiniBand card (mthca) in Linux

Lukas Hejtmanek xhejtman at ics.muni.cz
Sat Jul 7 01:53:03 PDT 2007


On Thu, Jul 05, 2007 at 03:22:12PM -0700, Roland Dreier wrote:
> Loading and unloading ib_mthca many times works fine on a non-Xen
> system.  So there is something different about the Xen environment
> that is causing a problem.  It could be a bug in mthca exposed by Xen
> (e.g. improper use of the DMA mapping API or something like that).
> 
> Can you turn on all the memory debugging options like SLAB_DEBUG
> etc. and see if it turns up anything?
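
For context, a typical shape of the DMA-mapping misuse mentioned above looks like the sketch below. This is illustrative kernel code, not taken from the mthca source; `send_buf` and its buffer handling are hypothetical. The relevant point for Xen is that a guest's pseudo-physical addresses are not machine addresses, so only addresses obtained through the DMA mapping API are safe to hand to the HCA.

```c
/* Sketch only -- not from mthca.  Illustrates the pattern the DMA
 * mapping API requires.  Under Xen, guest "physical" addresses are
 * not machine addresses, so anything that bypasses the mapping API
 * (e.g. virt_to_bus()) hands the device a bogus DMA address. */
#include <linux/pci.h>

static int send_buf(struct pci_dev *pdev, void *buf, size_t len)
{
	dma_addr_t dma;

	/* Correct: let the DMA API translate (and, under Xen, possibly
	 * substitute a machine-contiguous bounce region) for us. */
	dma = pci_map_single(pdev, buf, len, PCI_DMA_TODEVICE);
	if (pci_dma_mapping_error(dma))
		return -ENOMEM;

	/* ... hand 'dma' to the hardware, wait for completion ... */

	/* Must unmap with the same size and direction used to map. */
	pci_unmap_single(pdev, dma, len, PCI_DMA_TODEVICE);
	return 0;

	/* Wrong (often works natively, breaks under Xen):
	 *   dma = virt_to_bus(buf);
	 */
}
```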

Well, I turned on slab debugging, VM debugging, and mthca debugging; the output
is below. Do you see anything interesting in it?
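
(For reference, those options correspond roughly to the following kernel config symbols; the names below are from memory of 2.6.18-era trees and may differ slightly in other versions. The mthca debug output is a module parameter rather than a config option.)

```
# Slab redzone/poison checking (the SLAB_DEBUG option mentioned above)
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
# Extra virtual-memory sanity checks
CONFIG_DEBUG_VM=y

# mthca's own debug output is enabled at load time:
#   insmod ib_mthca.ko debug_level=1
```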

# insmod ib_mthca.ko debug_level=1
ib_mthca: Mellanox InfiniBand HCA driver v0.08 (February 14, 2006)
ib_mthca: Initializing 0000:08:00.0
PCI: Enabling device 0000:08:00.0 (0000 -> 0002)
Slab corruption: start=ffff880098f513b8, len=256
Redzone: 0x1600000016/0x1700000017.
Last user: <0000001800000018>(0x1800000018)
000: 17 00 00 00 17 00 00 00 18 00 00 00 18 00 00 00
010: 19 00 00 00 19 00 00 00 1a 00 00 00 1a 00 00 00
020: 1b 00 00 00 1b 00 00 00 1c 00 00 00 1c 00 00 00
030: 1d 00 00 00 1d 00 00 00 1e 00 00 00 1e 00 00 00
040: 1f 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00
050: 01 00 00 00 01 00 00 00 02 00 00 00 02 00 00 00
Prev obj: start=0000000398f5120b, len=256
Unable to handle kernel paging request at 0000000398f5130b RIP: 
 <ffffffff80277313> print_objinfo+0x22/0xde
PGD 9b0a1067 PUD 0 
Oops: 0000 [1] SMP 
CPU 0 
Modules linked in: ib_mthca nfs lockd nfs_acl sunrpc ib_ipoib ib_cm ib_sa ib_mad ib_core memtrack ipv6 e1000 dm_mod parport_pc lp parport xfs ata_piix ahci piix mptsas mptscsih mptbase scsi_transport_sas raid0 sata_nv libata amd74xx sd_mod scsi_mod ide_disk ide_core
Pid: 2193, comm: insmod Not tainted 2.6.18-xen31-smp #6
RIP: e030:<ffffffff80277313>  <ffffffff80277313> print_objinfo+0x22/0xde
RSP: e02b:ffff88009acfd8c8  EFLAGS: 00010206
RAX: 0000000398f5130b RBX: 00000000008bd8c1 RCX: ffffffffff57c000
RDX: 0000000000000002 RSI: 0000000398f51203 RDI: ffff8800015f20c0
RBP: ffff8800015f20c0 R08: ffff88009ae9e3c8 R09: 00000000000035eb
R10: ffff88009acfd818 R11: ffffffff802fd0b5 R12: 0000000398f51203
R13: 0000000000000002 R14: ffff880098f513b0 R15: ffff880098f51000
FS:  00002aaaaadedb00(0000) GS:ffffffff804aa000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000
Process insmod (pid: 2193, threadinfo ffff88009acfc000, task ffff88009c3a1080)
Stack:  00000000008bd8c1 ffff8800015f20c0 0000000398f51203 0000000000000100
 ffff880098f513b0 ffffffff80277521 ffff8800015f20c0 0000000000000000
 ffff8800015f20c0 ffff880098f513b0 ffffffff88318ece 00000000000000d0
Call Trace:
 <ffffffff80277521> check_poison_obj+0x152/0x1ae
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff80278269> cache_alloc_debugcheck_after+0x34/0x1b0
 <ffffffff802784d7> kmem_cache_alloc+0xf2/0x102
 <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 <ffffffff88319263> :ib_mthca:mthca_alloc_icm_table+0x138/0x227
 <ffffffff88307bab> :ib_mthca:mthca_init_hca+0x5ee/0xde7
 <ffffffff802bb44d> sysfs_add_file+0x77/0x86
 <ffffffff803228d9> device_create_file+0x31/0x39
 <ffffffff883088d3> :ib_mthca:__mthca_init_one+0x52f/0xb50
 <ffffffff80277073> poison_obj+0x24/0x2d
 <ffffffff88308f6a> :ib_mthca:mthca_init_one+0x76/0x8b
 <ffffffff802f54df> pci_device_probe+0x4a/0x70
 <ffffffff80324481> driver_probe_device+0x52/0xa8
 <ffffffff803245ac> __driver_attach+0x6b/0xa9
 <ffffffff80324541> __driver_attach+0x0/0xa9
 <ffffffff803239c2> bus_for_each_dev+0x43/0x6e
 <ffffffff80323d04> bus_add_driver+0x73/0x10f
 <ffffffff802f50f7> __pci_register_driver+0x57/0x7e
 <ffffffff88186193> :ib_mthca:mthca_init+0x135/0x148
 <ffffffff802478ce> sys_init_module+0x16e1/0x180a
 <ffffffff802099da> system_call+0x86/0x8b
 <ffffffff80209954> system_call+0x0/0x8b


Code: 48 8b 18 48 89 ef e8 11 fd ff ff 48 8b 30 48 c7 c7 da c3 3e

-- 
Lukáš Hejtmánek


