[ofa-general] [Bug 662]
Steve Wise
swise at opengridcomputing.com
Tue Jun 26 08:02:30 PDT 2007
I think the bug is in rping_bind_client(). If address resolution fails
with an ADDR_ERROR event, rping_bind_client() wakes up and mistakenly
returns the variable 'ret', which is still zero. It should return
non-zero in this case.
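
A minimal sketch of that fix, as a simplified model rather than the
actual rping source. The names sketch_cb, sketch_state,
bind_client_buggy() and bind_client_fixed() are hypothetical, and the
state values are illustrative. The assumption is that the event handler
wakes the waiter without advancing the state when resolution fails, so
the caller has to report the failure itself instead of passing back the
stale 'ret'.

```c
#include <stdio.h>

/* Hypothetical stand-ins for rping's connection states (values are
 * illustrative, not copied from rping.c). */
enum sketch_state {
	SK_IDLE = 1,
	SK_ADDR_RESOLVED,
	SK_ROUTE_RESOLVED
};

/* Hypothetical control block: the event handler would update 'state'
 * before waking the thread that waits for resolution to finish. */
struct sketch_cb {
	enum sketch_state state;
};

/* Buggy shape: 'ret' still holds 0 from the call that started address
 * resolution, so a failed resolution is reported as success. */
static int bind_client_buggy(struct sketch_cb *cb)
{
	int ret = 0; /* starting the resolution succeeded */

	/* ... the waiter wakes up here after the event handler ran ... */
	if (cb->state != SK_ROUTE_RESOLVED) {
		fprintf(stderr, "waiting for addr/route resolution state %d\n",
			cb->state);
		return ret; /* bug: returns 0 */
	}
	return 0;
}

/* Fixed shape: return an explicit non-zero value when the expected
 * state was not reached. */
static int bind_client_fixed(struct sketch_cb *cb)
{
	/* ... the waiter wakes up here after the event handler ran ... */
	if (cb->state != SK_ROUTE_RESOLVED) {
		fprintf(stderr, "waiting for addr/route resolution state %d\n",
			cb->state);
		return -1; /* caller must not go on to set up the QP */
	}
	return 0;
}
```

With the non-zero return, a caller that checks the result would stop
before reaching rping_setup_qp(), where the backtrace below shows
ibv_alloc_pd() being called with a NULL cm_id->verbs.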
Steve.
Michael S. Tsirkin wrote:
>> Quoting Michael S. Tsirkin <mst at dev.mellanox.co.il>:
>> Subject: bug 667
>>
>> Sean, could you look at bug 667 please?
>> rping seems to be crashing after connect error.
>
> Here's a backtrace from the core dump.
>
> # rping -c -d -a 11.4.3.174
> ipaddr (11.4.3.174)
> libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
> This will severely limit memory registrations.
> created cm_id 0x505f10
> cma_event type 1 cma_id 0x505f10 (parent)
> cma event 1, error -110
> waiting for addr/route resolution state 1
> Segmentation fault (core dumped)
> # gdb `which rping`
> GNU gdb 6.4
> Copyright 2005 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux"...Using host libthread_db library "/lib64/libthread_db.so.1".
>
> (gdb) core core.29968
> Core was generated by `rping -c -d -a 11.4.3.174'.
> Program terminated with signal 11, Segmentation fault.
> Reading symbols from /usr/local/ofed/lib64/librdmacm.so.1...done.
> Loaded symbols for /usr/local/ofed/lib64/librdmacm.so.1
> Reading symbols from /usr/local/ofed/lib64/libibverbs.so.1...done.
> Loaded symbols for /usr/local/ofed/lib64/libibverbs.so.1
> Reading symbols from /lib64/libpthread.so.0...done.
> Loaded symbols for /lib64/libpthread.so.0
> Reading symbols from /lib64/libdl.so.2...done.
> Loaded symbols for /lib64/libdl.so.2
> Reading symbols from /lib64/libc.so.6...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Reading symbols from /usr/local/ofed/lib64/libcxgb3-rdmav2.so...done.
> Loaded symbols for /usr/local/ofed/lib64/libcxgb3-rdmav2.so
> Reading symbols from /usr/local/ofed/lib64/libmthca-rdmav2.so...done.
> Loaded symbols for /usr/local/ofed/lib64/libmthca-rdmav2.so
> #0 __ibv_alloc_pd (context=0x0) at src/verbs.c:143
> 143 pd = context->ops.alloc_pd(context);
> (gdb) where
> #0 __ibv_alloc_pd (context=0x0) at src/verbs.c:143
> #1 0x00000000004015e6 in rping_setup_qp (cb=0x505010, cm_id=0x505f10)
> at examples/rping.c:514
> #2 0x000000000040270b in main (argc=5, argv=0x7fffe0117238) at examples/rping.c:936
> (gdb) frame 1
> #1 0x00000000004015e6 in rping_setup_qp (cb=0x505010, cm_id=0x505f10)
> at examples/rping.c:514
> 514 cb->pd = ibv_alloc_pd(cm_id->verbs);
> (gdb) p cm_id->verbs
> $1 = (struct ibv_context *) 0x0
> (gdb) p (struct cma_id_private *)cm_id
> $2 = (struct cma_id_private *) 0x505f10
> (gdb) p *$2
> $3 = {id = {verbs = 0x0, channel = 0x505ef0, context = 0x505010, qp = 0x0, route = {
> addr = {src_addr = {sa_family = 0, sa_data = '\0' <repeats 13 times>},
> src_pad = '\0' <repeats 111 times>, dst_addr = {sa_family = 2,
> sa_data = "\000\000\v\004\003\256\000\000\000\000\000\000\000"},
> dst_pad = '\0' <repeats 111 times>, addr = {ibaddr = {sgid = {
> raw = '\0' <repeats 15 times>, global = {subnet_prefix = 0,
> interface_id = 0}}, dgid = {raw = '\0' <repeats 15 times>, global = {
> subnet_prefix = 0, interface_id = 0}}, pkey = 0}}}, path_rec = 0x0,
> num_paths = 0}, ps = RDMA_PS_TCP, port_num = 0 '\0'}, cma_dev = 0x0,
> events_completed = 0, connect_error = 0, cond = {__data = {__lock = 0, __futex = 0,
> __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0,
> __broadcast_seq = 0}, __size = '\0' <repeats 47 times>, __align = 0}, mut = {
> __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0,
> __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
> __size = '\0' <repeats 39 times>, __align = 0}, handle = 0, mc_list = 0x0}
> (gdb) where
> #0 __ibv_alloc_pd (context=0x0) at src/verbs.c:143
> #1 0x00000000004015e6 in rping_setup_qp (cb=0x505010, cm_id=0x505f10)
> at examples/rping.c:514
> #2 0x000000000040270b in main (argc=5, argv=0x7fffe0117238) at examples/rping.c:936
> (gdb)
>
>