[openib-general] RE: [Sc05-ib] Opensm crash..
Jeff Sadowski
jeff at abbatech.com
Fri Nov 18 10:40:34 PST 2005
Hey Hal, maybe valgrind could be of some use in tracking down the memory scribbling?
-----Original Message-----
From: sc05-ib-bounces at lists.scl.ameslab.gov on behalf of Hal Rosenstock
Sent: Thu 11/17/2005 7:00 PM
To: troy at scl.ameslab.gov
Cc: sc05-ib at scl.ameslab.gov; openib-general at openib.org
Subject: Re: [Sc05-ib] Opensm crash..
On Wed, 2005-11-16 at 00:10, Troy Benjegerdes wrote:
> This was running with -maxsmps=32
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 98311 (LWP 31196)]
> 0xb7f71f29 in osm_log (p_log=0x0, verbosity=16 '\020',
> p_str=0x80787cd "%s: [\n") at osm_log.c:137
> 137 if (p_log->level & verbosity)
> (gdb) bt
> #0 0xb7f71f29 in osm_log (p_log=0x0, verbosity=16 '\020',
> p_str=0x80787cd "%s: [\n") at osm_log.c:137
> #1 0x0807755b in osm_vl15_poll (p_vl=0x8090ca4) at osm_vl15intf.c:410
> #2 0x0806b1fc in __osm_sm_mad_ctrl_update_wire_stats (p_ctrl=0x8090110)
> at osm_sm_mad_ctrl.c:228
> #3 0x0806b6e0 in __osm_sm_mad_ctrl_rcv_callback (p_madw=0xb4c29390,
> bind_context=0x8090110, p_req_madw=0x89cf1a8) at osm_sm_mad_ctrl.c:270
> #4 0xb7f3f821 in umad_receiver (p_ptr=0x80ccce8) at osm_vendor_ibumad.c:401
> #5 0xb7f6c617 in __cl_thread_wrapper (arg=0x0) at cl_thread.c:61
> #6 0x46d86ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #7 0x46d86e51 in pthread_start_thread_event () from
> /lib/i686/libpthread.so.0
> #8 0x46c16d3a in clone () from /lib/i686/libc.so.6
> (gdb) print p_log
> $1 = (osm_log_t * const) 0x0
> (gdb) up
> #1 0x0807755b in osm_vl15_poll (p_vl=0x8090ca4) at osm_vl15intf.c:410
> 410 OSM_LOG_ENTER( p_vl->p_log, osm_vl15_poll );
> (gdb) print p_vl
> $2 = (osm_vl15_t * const) 0x8090ca4
> (gdb) print p_vl->p_log
> $3 = (osm_log_t *) 0x0
> (gdb) print *p_vl
> $4 = {thread_state = OSM_THREAD_STATE_RUN, state = OSM_VL15_STATE_READY,
> max_wire_smps = 32, signal = {condvar = {__c_lock = {__status = 0,
> __spinlock = 0}, __c_waiting = 0x80a9940,
> __padding = '\0' <repeats 27 times>, __align = 0}, signaled = 0,
> manual_reset = 0, spinlock = {mutex = {__m_reserved = 0, __m_count = 0,
> __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0,
> __spinlock = 0}}, state = CL_INITIALIZED}, state =
> CL_INITIALIZED},
> poller = {osd = {id = 65541, state = CL_INITIALIZED},
> pfn_callback = 0x8076d6c <__osm_vl15_poller>, context = 0x8090ca4,
> name = '\0' <repeats 15 times>}, rfifo = {end = {p_next = 0xab44b760,
> p_prev = 0x8090c6c}, count = 135010200, state = 0}, ufifo = {end = {
> p_next = 0x100, p_prev = 0x0}, count = 148497440, state = 135010200},
> lock = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0,
> __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}, state = 0},
> p_vend = 0x0, p_log = 0x0, p_stats = 0x0, p_subn = 0x0, h_disp = 0x0,
> p_lock = 0x0}
> (gdb) up
> #2 0x0806b1fc in __osm_sm_mad_ctrl_update_wire_stats (p_ctrl=0x8090110)
> at osm_sm_mad_ctrl.c:228
> 228 osm_vl15_poll( p_ctrl->p_vl15 );
>
This looks like another memory scribbling issue. This time p_log was
cleared.
-- Hal
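
Until the code doing the scribbling is found, one way to make this kind of corruption fail loudly instead of crashing inside the logger is to bracket the pointer fields with marker values and check them before use. The following is only a minimal sketch of that idea. None of it is OpenSM code: demo_log_t, demo_vl15_t, DEMO_MAGIC and the demo_* functions are hypothetical stand-ins, and the real osm_vl15_t layout (see the gdb dump above) is larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Arbitrary marker value; hypothetical, not taken from OpenSM. */
#define DEMO_MAGIC 0x564c3135u

/* Hypothetical stand-ins; these are NOT the OpenSM types. */
typedef struct demo_log {
    unsigned level;
} demo_log_t;

typedef struct demo_vl15 {
    unsigned magic_head;   /* canary placed before the pointer field */
    demo_log_t *p_log;     /* the field that showed up as 0x0 in gdb */
    unsigned magic_tail;   /* canary placed after the pointer field */
} demo_vl15_t;

/* Set up the object and both canaries. */
static void demo_vl15_init(demo_vl15_t *p_vl, demo_log_t *p_log)
{
    assert(p_vl != NULL && p_log != NULL);
    p_vl->magic_head = DEMO_MAGIC;
    p_vl->p_log = p_log;
    p_vl->magic_tail = DEMO_MAGIC;
}

/* Returns 1 if the object looks intact, 0 if it appears scribbled. */
static int demo_vl15_is_sane(const demo_vl15_t *p_vl)
{
    return p_vl != NULL
        && p_vl->magic_head == DEMO_MAGIC
        && p_vl->magic_tail == DEMO_MAGIC
        && p_vl->p_log != NULL;
}

/* Returns 0 on success, -1 if the object fails the sanity check. */
static int demo_vl15_poll(const demo_vl15_t *p_vl)
{
    if (!demo_vl15_is_sane(p_vl)) {
        fprintf(stderr, "demo_vl15_poll: object %p looks corrupted\n",
                (const void *)p_vl);
        return -1;
    }
    /* The normal path would log entry and do the polling work here. */
    return 0;
}
```

A check like this does not say who overwrote the structure. It only moves the failure to the first place the damaged object is touched, which can narrow down the window in which the overwrite happened.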