[openib-general] wiki update - howto for Chelsio's T3 RNIC
Steve Wise
swise at opengridcomputing.com
Fri Jul 14 06:54:12 PDT 2006
Erf. I see the bug. SCE (stewpid coding error :). The patch below
will fix the crash, but the real issue is that we're failing to
allocate our data structures. To work around that for now, I suggest
you patch the cxgb3 source to reduce the number of supported objects
in core/cxio_hal.h, namely T3_MAX_NUM_RI, T3_MAX_NUM_QP, etc.
For the crash, try this:
----
Index: iwch.c
===================================================================
--- iwch.c	(revision 8481)
+++ iwch.c	(working copy)
@@ -65,7 +65,8 @@
 static inline void *vzmalloc(int size)
 {
 	void *p = vmalloc(size);
-	memset(p, 0, size);
+	if (p)
+		memset(p, 0, size);
 	return p;
 }
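
For anyone reading along, here is a minimal user-space sketch of the
same pattern, not the driver code itself. It assumes malloc() as a
stand-in for vmalloc(), and the names zalloc_checked, obj_table,
obj_table_init and obj_table_free are hypothetical, made up for this
illustration. The point is twofold: the zeroing helper must not touch
a NULL pointer, and the caller still has to notice the failed
allocation and back out with an error instead of carrying on.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the patched vzmalloc(); malloc() replaces vmalloc(). */
static void *zalloc_checked(size_t size)
{
	void *p = malloc(size);

	if (p)			/* only zero memory we actually got */
		memset(p, 0, size);
	return p;		/* NULL is passed through to the caller */
}

/* Hypothetical object table, standing in for per-device driver state. */
struct obj_table {
	size_t count;
	unsigned long *slots;
};

/* Returns 0 on success, -ENOMEM if the table could not be allocated. */
static int obj_table_init(struct obj_table *t, size_t count)
{
	/* Reject counts whose size in bytes would overflow size_t. */
	if (count > (size_t)-1 / sizeof(*t->slots))
		return -ENOMEM;

	t->slots = zalloc_checked(count * sizeof(*t->slots));
	if (!t->slots)
		return -ENOMEM;	/* fail the init instead of crashing later */
	t->count = count;
	return 0;
}

static void obj_table_free(struct obj_table *t)
{
	free(t->slots);
	t->slots = NULL;
	t->count = 0;
}
```

A caller would check the return value of obj_table_init() and unwind
on a negative result, which is the part the NULL check in the patch
alone does not cover.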
Steve.
> [root@iblp0044 ~]# modprobe iw_cxgb3
> Segmentation fault
>
> [root@iblp0044 ~]# dmesg
> ...
> <snip>
> ...
> eth2: Chelsio T320 2x10000BaseX RNIC (rev 0) PCI-X 133MHz/64-bit MSI-X
> eth2: 128MB CM, 256MB PMTX, 256MB PMRX
> eth3: Chelsio T320 2x10000BaseX RNIC (rev 0) PCI-X 133MHz/64-bit MSI-X
> Unable to handle kernel paging request at virtual address 0000000000248000
> modprobe[3348]: Oops 8804682956800 [1]
> Modules linked in: iw_cxgb3 cxgb3c ib_umad ib_ucm ib_uverbs ib_sa ib_cm ib_mad ib_core cxgb3
>
> Pid: 3348, CPU 0, comm: modprobe
> psr : 00001010081a6018 ifs : 8000000000000183 ip : [<a0000001003148e0>] Not tainted
> ip is at memset+0x240/0x420
> unat: 0000000000000000 pfs : 0000000000000593 rsc : 0000000000000003
> rnat: 0000002080000000 bsps: ffffffff80000000 pr : 0000000005550519
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a00000020022f540 b6 : a0000002000d44e0 b7 : a0000002000b9f00
> f6 : 1003e0000000000000000 f7 : 1003e6db6db6db6db6db7
> f8 : 1003e00000000071b5ed6 f9 : 1003e0000000000000000
> f10 : 1003e0000000000020000 f11 : 1003e0000000000000000
> r1 : a000000100c80920 r2 : e0000040f98034f8 r3 : a000000100a99f98
> r8 : 0000000000248000 r9 : 6db6db6db6db6db7 r10 : 00000000071b5ed6
> r11 : 0000000038daf6b0 r12 : e000004043897e30 r13 : e000004043890000
> r14 : e0000040f98034f8 r15 : 00000040fa3e8000 r16 : a000000200248000
> r17 : a0007fffc7200000 r18 : a000000100a9c220 r19 : a000000100a9c220
> r20 : 00000040fffb8000 r21 : 0000000000000010 r22 : 0000000000000800
> r23 : 0000000000000007 r24 : 0000000000248000 r25 : 00000040fc020000
> r26 : a000000202248000 r27 : 0000000000248010 r28 : 0000000000288000
> r29 : e0000040fc020000 r30 : 0000000000000000 r31 : 00000000000007ff
>
> Call Trace:
> [<a000000100010b50>] show_stack+0x50/0xa0
> sp=e0000040438979c0 bsp=e0000040438912a8
> [<a000000100011420>] show_regs+0x820/0x840
> sp=e000004043897b90 bsp=e000004043891260
> [<a000000100035990>] die+0x1d0/0x2e0
> sp=e000004043897b90 bsp=e000004043891218
> [<a000000100754de0>] ia64_do_page_fault+0x8e0/0xa00
> sp=e000004043897bb0 bsp=e0000040438911b8
> [<a00000010000b880>] ia64_leave_kernel+0x0/0x280
> sp=e000004043897c60 bsp=e0000040438911b8
> [<a0000001003148e0>] memset+0x240/0x420
> sp=e000004043897e30 bsp=e0000040438911a0
> [<a00000020022f540>] open_rnic_toe+0x140/0x620 [iw_cxgb3]
> sp=e000004043897e30 bsp=e000004043891148
> [<a000000200120800>] t3c_register_client+0x140/0x1e0 [cxgb3c]
> sp=e000004043897e30 bsp=e000004043891118
> [<a0000002000884a0>] iwch_init_module+0xc0/0x100 [iw_cxgb3]
> sp=e000004043897e30 bsp=e000004043891100
> [<a0000001000c1eb0>] sys_init_module+0x250/0x520
> sp=e000004043897e30 bsp=e000004043891088
> [<a00000010000b6e0>] ia64_ret_from_syscall+0x0/0x20
> sp=e000004043897e30 bsp=e000004043891088
> [<a000000000010640>] ia64_ivt+0xffffffff00010640/0x400
> sp=e000004043898000 bsp=e000004043891088
> BUG: modprobe/3348, lock held at task exit time!
> [a000000200129440] {t3cdev_db_lock}
> .. held by: modprobe: 3348 [e000004043890000, 116]
> ... acquired at: t3c_register_client+0x30/0x1e0 [cxgb3c]
>
>