[openib-general] opensm segfaults after CL_INSUFFICIENT_MEMORY

Hal Rosenstock halr at voltaire.com
Thu Feb 10 12:26:45 PST 2005


On Thu, 2005-02-10 at 14:44, Bernhard Fischer wrote:
> Hi,
> 
> I'm seeing the segfault below when i try to run opensm.
> 
> gen2 as of 09.02.2005, only thing i changed in order to try to
> circumvent those
> ib_mthca 0000:04:00.0: CQ overrun on CQN 00000082
> was setting IPOIB_NUM_WC to 1 as noted here:
> http://openib.org/pipermail/openib-general/2004-December/007147.html
> (which didn't help).

This is totally separate.

> Any ideas?
> (gdb) run
> Starting program: /usr/local/ib/bin/opensm 
> [Thread debugging using libthread_db enabled]
> [New Thread -1209392032 (LWP 7991)]
> __init: failed to create timer provider status (CL_INSUFFICIENT_MEMORY) 

First, malloc is failing when creating the timer provider.

> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1209392032 (LWP 7991)]
> 0xb7f0b597 in memset () from /lib/tls/libc.so.6
> (gdb) bt
> #0  0xb7f0b597 in memset () from /lib/tls/libc.so.6
> #1  0xb7fcd000 in ?? ()
> #2  0xb7fd2e4a in cl_memset (p_memory=Variable "p_memory" is not
> available.
> ) at cl_memory_osd.c:78
> #3  0xb7fd488e in __cl_sys_callback_construct () at cl_memory.h:553
> #4  0xb7fd48d0 in __cl_sys_callback_init () at cl_syscallback.c:113
> #5  0xb7fd09b6 in complib_init () at cl_complib.c:106
> #6  0xb7fd5dc7 in __do_global_ctors_aux () from
> /usr/local/ib/lib/libosmcomp.so.1
> #7  0xb7fcfed5 in _init () from /usr/local/ib/lib/libosmcomp.so.1
> #8  0xb7ff716c in call_init () from /lib/ld-linux.so.2
> #9  0xb7ff7252 in _dl_init_internal () from /lib/ld-linux.so.2
> #10 0xb7fea9c5 in _dl_start_user () from /lib/ld-linux.so.2

Initialization continues (not sure it should), and the attempt to clear
a static object then fails. Even if initialization had stopped at the
first failure, opensm would not run properly.
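
To illustrate what "not continuing" would look like, here is a minimal
sketch of an init sequence that stops at the first failed step. This is
not the complib code: every name below (demo_status_t,
demo_timer_prov_create, demo_sys_callback_init, demo_complib_init,
g_force_alloc_failure) is hypothetical, and the assumption that the
second step needs the first is only for the sake of the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes; NOT the real complib cl_status_t values. */
typedef enum { DEMO_SUCCESS = 0, DEMO_INSUFFICIENT_MEMORY } demo_status_t;

/* Hypothetical stand-in for a timer provider object. */
typedef struct { int running; } demo_timer_prov_t;

static demo_timer_prov_t *g_timer_prov = NULL;
static int g_syscallback_ready = 0;

/* Test hook (hypothetical): when nonzero, the allocation is treated as failed. */
static int g_force_alloc_failure = 0;

/* Step 1: allocate the timer provider and report failure to the caller. */
static demo_status_t demo_timer_prov_create(void)
{
    demo_timer_prov_t *p = g_force_alloc_failure ? NULL : malloc(sizeof *p);

    if (p == NULL)
        return DEMO_INSUFFICIENT_MEMORY;
    p->running = 1;
    g_timer_prov = p;
    return DEMO_SUCCESS;
}

/* Step 2: a later step that, in this sketch only, assumes step 1 succeeded. */
static demo_status_t demo_sys_callback_init(void)
{
    assert(g_timer_prov != NULL);
    g_syscallback_ready = 1;
    return DEMO_SUCCESS;
}

/* Run the steps in order and return at the first failure. */
demo_status_t demo_complib_init(void)
{
    demo_status_t status = demo_timer_prov_create();

    if (status != DEMO_SUCCESS) {
        fprintf(stderr, "demo_complib_init: timer provider failed (status %d)\n",
                (int)status);
        return status; /* do not fall through to the later steps */
    }
    return demo_sys_callback_init();
}

/* Undo whatever demo_complib_init managed to set up. */
void demo_complib_exit(void)
{
    g_syscallback_ready = 0;
    free(g_timer_prov);
    g_timer_prov = NULL;
}
```

A caller would check the value returned by demo_complib_init() and bail
out on anything other than DEMO_SUCCESS instead of carrying on with a
half-initialized library.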

The first problem is why the malloc fails for the timer provider. The
second problem is pretty strange, as the system callback object is local
and should not be at an address that causes a segmentation violation.

Is this built with the autotools version?

If so, could you revert to the old Makefiles, rebuild, and see if you
still have the same problem?

Thanks.

-- Hal

> Thank you,



