[ofa-general] SDP kernel Oops.. (OFED-1.2)

Paul Nowoczynski pauln at psc.edu
Mon Jul 30 13:26:49 PDT 2007


Jim,
I just ran with 1.2 and hit the same bug.  I've included the debug msgs 
leading up to the oops (at the bottom).  I think the problem has to do 
with handling a connection request after a socket has been destroyed.  
The failed instance of sdp_connect_handler() doesn't appear to run 
sdp_init_qp() so I assume that it fails somewhere before that.

I wonder if event->param.conn.private_data is bogus?

Thanks for your help.
Paul


int sdp_connect_handler(struct sock *sk, struct rdma_cm_id *id,
                        struct rdma_cm_event *event)
{
        struct sockaddr_in *dst_addr;
        struct sock *child;
        const struct sdp_hh *h;
        int rc;

        sdp_dbg(sk, "%s %p -> %p\n", __func__, sdp_sk(sk)->id, id);

        h = event->param.conn.private_data;

        if (!h->max_adverts)
                return -EINVAL;

        child = sk_clone(sk, GFP_KERNEL);
        if (!child)
                return -ENOMEM;

        sdp_add_sock(sdp_sk(child));
        INIT_LIST_HEAD(&sdp_sk(child)->accept_queue);
        INIT_LIST_HEAD(&sdp_sk(child)->backlog_queue);
        INIT_DELAYED_WORK(&sdp_sk(child)->time_wait_work, 
sdp_time_wait_work);
        INIT_WORK(&sdp_sk(child)->destroy_work, sdp_destroy_work);

        dst_addr = (struct sockaddr_in *)&id->route.addr.dst_addr;
        inet_sk(child)->dport = dst_addr->sin_port;
        inet_sk(child)->daddr = dst_addr->sin_addr.s_addr;

        bh_unlock_sock(child);
        __sock_put(child);

        rc = sdp_init_qp(child, id);
...

################# Console Msgs ###########################################

oss08p login:
Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login: Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): 
sdp_cma_handler event 4 id 000001015ed2b600
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): 
RDMA_CM_EVENT_CONNECT_REQUEST
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_connect_handler 
000001015ed30c00 -> 000001015ed2b600
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_init_qp
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_init_qp done
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_connect_handler 
bufs 64 xmit_size_goal 32768
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_cma_handler event 4 
handled
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): event 4 done. status 0
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_cma_handler event 
9 id 000001015ed2b600
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): RDMA_CM_EVENT_ESTABLISHED
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_connected_handler
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_connected_handler 
child connection established
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_cma_handler event 
9 handled
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): event 9 done. status 0
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_accept: 
ib_req_notify_cq
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: status -22 
sk 000001014e790780 newsk 000001014e790040
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:08:22 oss08p kernel: sdp_sock(988:1023): sdp_setsockopt

Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login:
Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login:
Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login: Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): 
sdp_cma_handler event 4 id 000001015ee0c800
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): 
RDMA_CM_EVENT_CONNECT_REQUEST
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_connect_handler 
000001015ed30c00 -> 000001015ee0c800
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_init_qp
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_init_qp done
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_connect_handler 
bufs 64 xmit_size_goal 32768
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_cma_handler event 4 
handled
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): event 4 done. status 0
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_cma_handler 
event 9 id 000001015ee0c800
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): 
RDMA_CM_EVENT_ESTABLISHED
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_connected_handler
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_connected_handler 
child connection established
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_cma_handler 
event 9 handled
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): event 9 done. status 0
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_accept: 
ib_req_notify_cq
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: status -22 
sk 000001014e790780 newsk 0000010151c457c0
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept state 10 
expected 10 *err -22
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: error -11
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:0): sdp_accept: status -11 
sk 000001014e790780 newsk 0000000000000000
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_setsockopt
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_fin
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_close
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_close: entering 
time wait refcnt 2
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_close: last 
socket put 2
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_unhash
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_handle_wc: 
destroy in time wait state
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_cma_handler 
event 10 id 000001015ee0c800
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): 
RDMA_CM_EVENT_DISCONNECTED
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_destroy_work: 
refcnt 1
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_destruct
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_close_sk
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_disconnected_handler
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_cma_handler 
event 10 handled
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_reset_sk
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): event 10 done. 
status -104
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_close_sk done; 
releasing sock
Jul 30 16:09:40 oss08p kernel: sdp_sock(988:32768): sdp_destruct done

Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login:
Fedora Core release 3 (Heidelberg)
Kernel 2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4 on an x86_64

oss08p login:
Kernel BUG at sdp_cma:372
invalid operand: 0000 [1] SMP
CPU 1
Modules linked in: ksocklnd ptlrpc obdclass lvfs lnet libcfs ib_ipoib 
ib_srp ib_sdp rdma_cm ib_addr iw_cm ib_local_sa ib_cm iptable_filter i
p_tables e1000 ib_sa ib_uverbs ib_umad ib_mthca ib_mad ib_core md forcedeth
Pid: 2362, comm: ib_cm/1 Not tainted 
2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4
RIP: 0010:[<ffffffffa00abbbf>] 
<ffffffffa00abbbf>{:ib_sdp:sdp_connect_handler+207}
RSP: 0018:0000010155f21bc8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000010150e45800 RCX: 0000000000000000
RDX: 00000000ffffffff RSI: 0000010150e450a8 RDI: ffffffffa00b3640
RBP: ffffffffa00b3140 R08: 0000010155ef8ef8 R09: 0000010155ef8f08
R10: 00000000ffffffff R11: 0000000000000000 R12: 000001014e790780
R13: 0000010155f21cf8 R14: 0000000000000000 R15: 000001015681bfa4
FS:  0000002a9589db00(0000) GS:ffffffff805a3140(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000002a986adad8 CR3: 000000007ea38000 CR4: 00000000000006e0
Process ib_cm/1 (pid: 2362, threadinfo 0000010155f20000, task 
00000101565897e0)
Stack: 000001015ede2400 000001014e7909e0 000001014e790780 000001015ede2400
       0000010155f21cf8 0000000000000000 0000000000000000 ffffffffa00ac581
       000001015ede2400 000001015ede2458
Call Trace:<ffffffffa00ac581>{:ib_sdp:sdp_cma_handler+945} 
<ffffffffa00a0347>{:rdma_cm:cma_acquire_dev+359}
       <ffffffffa00a1488>{:rdma_cm:cma_req_handler+1000} 
<ffffffffa006859a>{:ib_cm:cm_process_work+26}
       <ffffffffa006901f>{:ib_cm:cm_req_handler+2463} 
<ffffffffa0069490>{:ib_cm:cm_work_handler+0}
       <ffffffffa00694d2>{:ib_cm:cm_work_handler+66} 
<ffffffff80133133>{__wake_up+67}
       <ffffffffa0069490>{:ib_cm:cm_work_handler+0} 
<ffffffff80149110>{worker_thread+496}
       <ffffffff80133070>{default_wake_function+0} 
<ffffffff801330c0>{__wake_up_common+64}
       <ffffffff80133070>{default_wake_function+0} 
<ffffffff8014d630>{keventd_create_kthread+0}
       <ffffffff80148f20>{worker_thread+0} 
<ffffffff8014d630>{keventd_create_kthread+0}
       <ffffffff8014d5e9>{kthread+217} <ffffffff8011144b>{child_rip+8}
       <ffffffff8014d630>{keventd_create_kthread+0} 
<ffffffff8014d510>{kthread+0}
       <ffffffff80111443>{child_rip+0}

Code: 0f 0b 01 f0 0a a0 ff ff ff ff 74 01 66 66 90 66 90 65 8b 04
RIP <ffffffffa00abbbf>{:ib_sdp:sdp_connect_handler+207} RSP 
<0000010155f21bc8>
 <0>Kernel panic Jul 30 16:10:11 -oss08p kernel: sdp_sock(988:0): 
sdp_cma_handler event 4 id 000001015ede2400
Jul 30 16:10:11 oss08p kernel: sdp_sock(988:0): 
RDMA_CM_EVENT_CONNECT_REQUEST
Jul 30 16:10:11 oss08p kernel: sdp_sock(988:0): sdp_connect_handler 
000001015ed30c00 -> 000001015ede 2400
Jul 30 16:10:11 oss08p kernel: ----------- [cut here ] --------- [please 
bite here ] ---------
Jul 30 16:10:12 oss08p kernel: Kernel BUG at sdp_cma:372
Jul 30 16:10:12 oss08p kernel: invalid operand: 0000 [1] SMP
Jul 30 16:10:12 oss08p kernel: CPU 1
Jul 30 16:n10:12 oss08p kernel: Modules linked in: ksocklnd ptlrpc 
obdclass lvfs lnet libcfs ib_ipoib ib_srp ib_sdp rdma_cm ib_addr iw_cm ib
_local_sa ib_cm iptable_filter ip_tables e1000 ib_sa ib_uverbs ib_umad 
ib_mthca ib_mad ib_core md forcedeth
Jul 30 16:10:12 osos08p kernel: Pid: 2362, comm: ib_cm/1 Not tainted 
2.6.9-42.0.8.EL_lustre.1.4.9.1customOBost_0.4
Jul 30 16:10:12 oss08p kernel: RIP: 0010:[<ffffffffa00abbbf>] 
<ffffffffa00abbbf>{:ib_sdp:sdp_connect_handler+207}
Jul 30 16:10:12 oss08p kernel: RSP: 0018:0000010155f21bc8  EtFLAGS: 00010246
Jul 30 16:10:12 oss08p kernel: RAX: 0000000000000000 RBX: 
0000010150e45800 RCX: 0000000000000000
Jul 30 16:10:12 oss08p kernel: RDX: 00000000ffffffff RSI: 
0000010150e450a8 RDI: ffffffffa00b3640
Jul 30 16:10:12 oss08p kernel: RBP: fffffff fa00b3140 R08: 
0000010155ef8ef8 R09: 0000010155ef8f08
Jul 30 16:10:12 oss08p kernel: R10: 00000000ffffffff R11: 
0000000000000000 R12: 000001014e790780
Jul 30 16:10:12 oss08p kernel: R13: 0000010155f21cf8 R14: 
0000000000000000 R15: 000001015681bfa4
Jul 30 16:10:12 oss08sp kernel: FS:  0000002a9589db00(0000) 
GS:ffffffff805a3140(0000) knlGS:0000000000000000
Jul 30 16:10:12 oss08p kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 
000000008005003b
Jul 30 16:10:12 oss08p kernel: CR2: 0000002a986adad8 CR3: 
000000007ea38000 CR4: 00000000000006e0
Jul y30 16:10:12 oss08p kernel: Process ib_cm/1 (pid: 2362, threadinfo 
0000010155f20000, task 00000101565897e0)
Jul 30 16:10:12 oss08p kernel: Stack: 000001015ede2400 000001014e7909e0 
000001014e790780 000001015ede2400
Jul 30 16:10:12 oss08p kernel:        00n00010155f21cf8 0000000000000000 
0000000000000000 ffffffffa00ac581
Jul 30 16:10:12 oss08p kernel:        000001015ede2400 000001015ede2458
Jul 30 16:10:12 oss08p kernel: Call 
Trace:<ffffffffa00ac581>{:ib_sdp:sdp_cma_handler+945} 
<ffffffffa00a0347>{:rdma_cm:cma_acquire_cdev+359}
Jul 30 16:10:12 oss08p kernel:        
<ffffffffa00a1488>{:rdma_cm:cma_req_handler+1000} 
<ffffffffa006859a>{:ib_cm:cm_process_work+26}
Jul 30 16:10:12 oss08p kernel:        
<ffffffffa006901f>{:ib_cm:cm_req_handler+2463} 
<ffffffffa0069490>{:ib_cim:cm_work_handler+0}
Jul 30 16:10:12 oss08p kernel:        
<ffffffffa00694d2>{:ib_cm:cm_work_handler+66} 
<ffffffff80133133>{__wake_up+67}
Jul 30 16:10:12 oss08p kernel:        
<ffffffffa0069490>{:ib_cm:cm_work_handler+0} 
<ffffffff80149110>{worker_thread+496}
Jul 30 n16:10:12 oss08p kernel:        
<ffffffff80133070>{default_wake_function+0} 
<ffffffff801330c0>{__wake_up_common+64}
Jul 30 16:10:12 oss08p kernel:        
<ffffffff80133070>{default_wake_function+0} 
<ffffffff8014d630>{keventd_create_kthread+0}
Jul 30 16:g10:12 oss08p kernel:        
<ffffffff80148f20>{worker_thread+0} 
<ffffffff8014d630>{keventd_create_kthread+0}
Jul 30 16:10:12 oss08p kernel:        <ffffffff8014d5e9>{kthread+217} 
<ffffffff8011144b>{child_rip+8}
Jul 30 16:10:12 oss08p kernel:        
<ffffffff8014d630>{:keventd_create_kthread+0} <ffffffff8014d510>{kthread+0}
Jul 30 16:10:12 oss08p kernel:        <ffffffff80111443>{child_rip+0}
Jul 30 16:10:12 oss08p kernel:
Jul 30 16:10:12 oss08p kernel: Code: 0f 0b 01 f0 0a a0 ff ff ff ff 74 01 
66 66 90 66 90 65 8b  04
Jul 30 16:10:12 oss08p kernel: RIP 
<ffffffffa00abbbf>{:ib_sdp:sdp_connect_handler+207} RSP <0000010155f21bc8>
Jul 30 16:10:12 oss08p kernel:  <0>Kernel panic - not syncing: Oops
Oops




More information about the general mailing list