[ofa-general] Re: InfiniBand card (mthca) in Linux

Roland Dreier rdreier at cisco.com
Sat Jul 7 16:24:16 PDT 2007


 > Slab corruption: start=ffff880098f513b8, len=256
 > Redzone: 0x1600000016/0x1700000017.
 > Last user: <0000001800000018>(0x1800000018)

OK, CONFIG_DEBUG_SLAB is catching a slab getting corrupted with a
really strange pattern of incrementing values up to 0x1f.  Running
under Xen seems to be what triggers this, since I run mthca with
CONFIG_DEBUG_SLAB set all the time and I've never seen anything like
this happen.
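
To make the pattern easier to see, here is a rough sketch that splits
the 64-bit values quoted above into their 32-bit halves.  The helper
name split_words is made up for illustration; only the three values
come from the report.

```python
def split_words(value):
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


# Values copied from the quoted "Slab corruption" report.
reported = {
    "redzone (first)": 0x1600000016,
    "redzone (second)": 0x1700000017,
    "last user": 0x1800000018,
}

for name, value in reported.items():
    low, high = split_words(value)
    # Both halves of each word hold the same small integer.
    print(f"{name}: low=0x{low:08x} high=0x{high:08x}")
```

Each word decodes to a small integer repeated in both halves (0x16,
0x17, 0x18), so it looks like something wrote a run of consecutive
32-bit values over the object rather than a stray single write.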

 > Call Trace:
 >  <ffffffff80277521> check_poison_obj+0x152/0x1ae
 >  <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 >  <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 >  <ffffffff80278269> cache_alloc_debugcheck_after+0x34/0x1b0
 >  <ffffffff802784d7> kmem_cache_alloc+0xf2/0x102
 >  <ffffffff88318ece> :ib_mthca:mthca_alloc_icm+0xff/0x35c
 >  <ffffffff88319263> :ib_mthca:mthca_alloc_icm_table+0x138/0x227
 >  <ffffffff88307bab> :ib_mthca:mthca_init_hca+0x5ee/0xde7

So it seems something bad is happening in mthca_alloc_icm(), although
the corruption may have happened earlier.

But I don't understand how we could have reached mthca_alloc_icm()
without getting through mthca_QUERY_FW and printing the FW version
first... are you sure you're getting all the trace messages?  How are
you collecting them?  Can you make sure that your console level is set
so that you see messages printed with KERN_DEBUG?
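
As a quick way to check the console level, something like the sketch
below could be used.  It assumes the usual /proc/sys/kernel/printk
layout, where the first number is the current console loglevel, and
relies on KERN_DEBUG being level 7, so the console loglevel has to be
8 or higher for those messages to appear.  The function name
console_shows_debug is made up for illustration.

```python
KERN_DEBUG_LEVEL = 7  # numeric priority of KERN_DEBUG messages


def console_shows_debug(printk_text):
    """Return True if the console loglevel lets KERN_DEBUG through.

    printk_text is the content of /proc/sys/kernel/printk; its first
    field is the current console loglevel.  A message reaches the
    console only if its level is numerically below that value.
    """
    console_loglevel = int(printk_text.split()[0])
    return console_loglevel > KERN_DEBUG_LEVEL


if __name__ == "__main__":
    try:
        with open("/proc/sys/kernel/printk") as f:
            text = f.read()
    except OSError as err:
        print(f"could not read /proc/sys/kernel/printk: {err}")
    else:
        if console_shows_debug(text):
            print("console will show KERN_DEBUG messages")
        else:
            print("console loglevel too low for KERN_DEBUG messages")
```

Even if the console level is too low, the messages should still be in
the kernel log buffer, so the dmesg output is worth checking too.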

 - R.


