[openib-general] *** glibc detected *** corrupted double-linked list error

wei huang huanwei at cse.ohio-state.edu
Thu Jan 5 19:21:36 PST 2006


Hi Roland,

Sorry, we were distracted by some other work, so I did not respond to your
email.

Yes, unfortunately I still see the problem.

I have the core dump, but I am not sure exactly what information would be
helpful to you.

Anyway, here is some of the output from the core dump:
==================================================================
#x0-gen2# /home/4/huangwei/new_cache/mvapich2-new-cache/test/mpi/rma> ../../../bin/mpiexec -gdb -n 2 ./test2
0-1:  (gdb) core core.4576
0-1:  Core was generated by `./test2'.
0-1:  Program terminated with signal 6, Aborted.
0-1:  Reading symbols from /usr/local/lib/libibverbs.so.1...done.
0-1:  Loaded symbols for /usr/local/lib/libibverbs.so.1
0-1:  Reading symbols from /lib64/tls/libpthread.so.0...done.
0-1:  Loaded symbols for /lib64/tls/libpthread.so.0
0-1:  Reading symbols from /lib64/tls/librt.so.1...done.
0-1:  Loaded symbols for /lib64/tls/librt.so.1
0-1:  Reading symbols from /lib64/tls/libc.so.6...done.
0-1:  Loaded symbols for /lib64/tls/libc.so.6
0-1:  Reading symbols from /usr/lib64/libsysfs.so.1...done.
0-1:  Loaded symbols for /usr/lib64/libsysfs.so.1
0-1:  Reading symbols from /lib64/libdl.so.2...done.
0-1:  Loaded symbols for /lib64/libdl.so.2
0-1:  Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
0-1:  Loaded symbols for /lib64/ld-linux-x86-64.so.2
0-1:  Reading symbols from /usr/local/lib/infiniband/mthca.so...done.
0-1:  Loaded symbols for /usr/local/lib/infiniband/mthca.so
0-1:  Reading symbols from /lib64/libnss_files.so.2...done.
0-1:  Loaded symbols for /lib64/libnss_files.so.2
0:  #0  0x0000003dd9a2e4dd in ?? ()
1:  #0  0x0000003dd9a2e4dd in raise () from /lib64/tls/libc.so.6
0-1:  (gdb) where
0:  #0  0x0000003dd9a2e4dd in ?? ()
1:  #0  0x0000003dd9a2e4dd in raise () from /lib64/tls/libc.so.6
0:  #1  0x0000003dd9a2fc8e in ?? ()
1:  #1  0x0000003dd9a2fc8e in abort () from /lib64/tls/libc.so.6
0:  #2  0x0000000000000020 in ?? ()
1:  #2  0x0000003dd9a62b41 in __libc_message () from /lib64/tls/libc.so.6
0:  #3  0x0000000000000000 in ?? ()
1:  #3  0x0000003dd9a67da1 in malloc_consolidate () from /lib64/tls/libc.so.6
0:  (gdb)
1:  #4  0x0000003dd9a684d6 in _int_free () from /lib64/tls/libc.so.6
1:  #5  0x0000003dd9a68a06 in free () from /lib64/tls/libc.so.6
1:  #6  0x00002aaaaace7765 in mthca_free_db_tab (db_tab=0x5b88f0)
1:      at src/memfree.c:201
1:  #7  0x00002aaaaace79b3 in mthca_free_context (ibctx=0x5b6b50)
1:      at src/mthca.c:206
1:  #8  0x00002aaaaaaafce0 in ibv_close_device (context=0x5b6b50)
1:      at src/device.c:151
1:  #9  0x00000000004424ec in MPIDI_CH3I_RMDA_finalize () at rdma_iba_init.c:897
1:  #10 0x000000000043e9c5 in MPIDI_CH3_Finalize () at ch3_finalize.c:43
1:  #11 0x0000000000422e10 in MPID_Finalize () at mpid_finalize.c:157
1:  #12 0x000000000040f536 in PMPI_Finalize () at finalize.c:145
1:  #13 0x00000000004035a2 in main (argc=1, argv=0x7fffffc86558) at test2.c:74

Could you please let me know what additional information would be more useful?

Thanks.

Regards,
Wei Huang

774 Dreese Lab, 2015 Neil Ave,
Dept. of Computer Science and Engineering
Ohio State University
OH 43210
Tel: (614)292-8501


On Wed, 4 Jan 2006, Roland Dreier wrote:

>     wei> Hi, We encountered the following error when we call
>     wei> ibv_close_device: *** glibc detected *** corrupted
>     wei> double-linked list: 0x0000000000a54e10 ***
>
> Any further information on this?  Are you still seeing the problem?
>
>  - R.
>



