[openib-general] wiki update - howto for Chelsio's T3 RNIC

Steve Wise swise at opengridcomputing.com
Fri Jul 14 06:54:12 PDT 2006


Erf.  I see the bug.  SCE (stewpid coding error :).  The patch below
will fix the crash, but the real issue is that we're failing to
allocate our data structures.  To work around this for now, I suggest
you patch the cxgb3 source to reduce the number of supported objects
in core/cxio_hal.h, namely T3_MAX_NUM_RI, T3_MAX_NUM_QP, etc.



For the crash try this:

----
Index: iwch.c
===================================================================
--- iwch.c      (revision 8481)
+++ iwch.c      (working copy)
@@ -65,7 +65,8 @@
 static inline void *vzmalloc(int size)
 {
        void *p = vmalloc(size);
-       memset(p, 0, size);
+       if (p)
+               memset(p, 0, size);
        return p;
 }


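As a rough illustration only (this is not the driver code), here is a
user-space sketch of the same pattern, with malloc() standing in for
vmalloc().  The names zmalloc_checked() and table_init() are made up
for the example.  The point is that once the allocator can return
NULL, the caller has to check for it and fail cleanly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space stand-in for the patched vzmalloc(): malloc() replaces
 * vmalloc() here (assumption for illustration).  The buffer is only
 * zeroed when the allocation actually succeeded.
 */
static void *zmalloc_checked(size_t size)
{
        void *p = malloc(size);

        if (p)
                memset(p, 0, size);
        return p;
}

/*
 * Hypothetical caller: allocates a zeroed table of nentries bytes.
 * Returns 0 on success, -1 if the allocation failed, so the caller
 * can back out instead of touching a NULL pointer.
 */
static int table_init(unsigned char **table, size_t nentries)
{
        assert(table != NULL);

        *table = zmalloc_checked(nentries);
        if (!*table)
                return -1;
        return 0;
}
```

A caller would check the return value of table_init() and unwind
whatever it had already set up (locks included) when it gets -1, then
free() the table once it is done with it.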

Steve.



> [root at iblp0044 ~]# modprobe iw_cxgb3
> Segmentation fault
> 
> [root at iblp0044 ~]# dmesg
> ...
> <snip>
> ...
> eth2: Chelsio T320 2x10000BaseX RNIC (rev 0) PCI-X 133MHz/64-bit MSI-X
> eth2: 128MB CM, 256MB PMTX, 256MB PMRX
> eth3: Chelsio T320 2x10000BaseX RNIC (rev 0) PCI-X 133MHz/64-bit MSI-X
> Unable to handle kernel paging request at virtual address 0000000000248000
> modprobe[3348]: Oops 8804682956800 [1]
> Modules linked in: iw_cxgb3 cxgb3c ib_umad ib_ucm ib_uverbs ib_sa ib_cm ib_mad ib_core cxgb3
> 
> Pid: 3348, CPU 0, comm:             modprobe
> psr : 00001010081a6018 ifs : 8000000000000183 ip  : [<a0000001003148e0>]    Not tainted
> ip is at memset+0x240/0x420
> unat: 0000000000000000 pfs : 0000000000000593 rsc : 0000000000000003
> rnat: 0000002080000000 bsps: ffffffff80000000 pr  : 0000000005550519
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
> csd : 0000000000000000 ssd : 0000000000000000
> b0  : a00000020022f540 b6  : a0000002000d44e0 b7  : a0000002000b9f00
> f6  : 1003e0000000000000000 f7  : 1003e6db6db6db6db6db7
> f8  : 1003e00000000071b5ed6 f9  : 1003e0000000000000000
> f10 : 1003e0000000000020000 f11 : 1003e0000000000000000
> r1  : a000000100c80920 r2  : e0000040f98034f8 r3  : a000000100a99f98
> r8  : 0000000000248000 r9  : 6db6db6db6db6db7 r10 : 00000000071b5ed6
> r11 : 0000000038daf6b0 r12 : e000004043897e30 r13 : e000004043890000
> r14 : e0000040f98034f8 r15 : 00000040fa3e8000 r16 : a000000200248000
> r17 : a0007fffc7200000 r18 : a000000100a9c220 r19 : a000000100a9c220
> r20 : 00000040fffb8000 r21 : 0000000000000010 r22 : 0000000000000800
> r23 : 0000000000000007 r24 : 0000000000248000 r25 : 00000040fc020000
> r26 : a000000202248000 r27 : 0000000000248010 r28 : 0000000000288000
> r29 : e0000040fc020000 r30 : 0000000000000000 r31 : 00000000000007ff
> 
> Call Trace:
>   [<a000000100010b50>] show_stack+0x50/0xa0
>                                  sp=e0000040438979c0 bsp=e0000040438912a8
>   [<a000000100011420>] show_regs+0x820/0x840
>                                  sp=e000004043897b90 bsp=e000004043891260
>   [<a000000100035990>] die+0x1d0/0x2e0
>                                  sp=e000004043897b90 bsp=e000004043891218
>   [<a000000100754de0>] ia64_do_page_fault+0x8e0/0xa00
>                                  sp=e000004043897bb0 bsp=e0000040438911b8
>   [<a00000010000b880>] ia64_leave_kernel+0x0/0x280
>                                  sp=e000004043897c60 bsp=e0000040438911b8
>   [<a0000001003148e0>] memset+0x240/0x420
>                                  sp=e000004043897e30 bsp=e0000040438911a0
>   [<a00000020022f540>] open_rnic_toe+0x140/0x620 [iw_cxgb3]
>                                  sp=e000004043897e30 bsp=e000004043891148
>   [<a000000200120800>] t3c_register_client+0x140/0x1e0 [cxgb3c]
>                                  sp=e000004043897e30 bsp=e000004043891118
>   [<a0000002000884a0>] iwch_init_module+0xc0/0x100 [iw_cxgb3]
>                                  sp=e000004043897e30 bsp=e000004043891100
>   [<a0000001000c1eb0>] sys_init_module+0x250/0x520
>                                  sp=e000004043897e30 bsp=e000004043891088
>   [<a00000010000b6e0>] ia64_ret_from_syscall+0x0/0x20
>                                  sp=e000004043897e30 bsp=e000004043891088
>   [<a000000000010640>] ia64_ivt+0xffffffff00010640/0x400
>                                  sp=e000004043898000 bsp=e000004043891088
>   BUG: modprobe/3348, lock held at task exit time!
>   [a000000200129440] {t3cdev_db_lock}
> .. held by:          modprobe: 3348 [e000004043890000, 116]
> ... acquired at:               t3c_register_client+0x30/0x1e0 [cxgb3c]
> 
> 




