[openib-general] RE: [Sc05-ib] Opensm crash..

Jeff Sadowski jeff at abbatech.com
Fri Nov 18 10:40:34 PST 2005


Hey Hal, maybe valgrind could help track down the memory scribbling?


-----Original Message-----
From: sc05-ib-bounces at lists.scl.ameslab.gov on behalf of Hal Rosenstock
Sent: Thu 11/17/2005 7:00 PM
To: troy at scl.ameslab.gov
Cc: sc05-ib at scl.ameslab.gov; openib-general at openib.org
Subject: Re: [Sc05-ib] Opensm crash..
 
On Wed, 2005-11-16 at 00:10, Troy Benjegerdes wrote:
> This was running with -maxsmps=32
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 98311 (LWP 31196)]
> 0xb7f71f29 in osm_log (p_log=0x0, verbosity=16 '\020',
>     p_str=0x80787cd "%s: [\n") at osm_log.c:137
> 137       if (p_log->level & verbosity)
> (gdb) bt
> #0  0xb7f71f29 in osm_log (p_log=0x0, verbosity=16 '\020',
>     p_str=0x80787cd "%s: [\n") at osm_log.c:137
> #1  0x0807755b in osm_vl15_poll (p_vl=0x8090ca4) at osm_vl15intf.c:410
> #2  0x0806b1fc in __osm_sm_mad_ctrl_update_wire_stats (p_ctrl=0x8090110)
>     at osm_sm_mad_ctrl.c:228
> #3  0x0806b6e0 in __osm_sm_mad_ctrl_rcv_callback (p_madw=0xb4c29390,
>     bind_context=0x8090110, p_req_madw=0x89cf1a8) at osm_sm_mad_ctrl.c:270
> #4  0xb7f3f821 in umad_receiver (p_ptr=0x80ccce8) at osm_vendor_ibumad.c:401
> #5  0xb7f6c617 in __cl_thread_wrapper (arg=0x0) at cl_thread.c:61
> #6  0x46d86ce1 in pthread_start_thread () from /lib/i686/libpthread.so.0
> #7  0x46d86e51 in pthread_start_thread_event () from /lib/i686/libpthread.so.0
> #8  0x46c16d3a in clone () from /lib/i686/libc.so.6
> (gdb) print p_log
> $1 = (osm_log_t * const) 0x0
> (gdb) up
> #1  0x0807755b in osm_vl15_poll (p_vl=0x8090ca4) at osm_vl15intf.c:410
> 410       OSM_LOG_ENTER( p_vl->p_log, osm_vl15_poll );
> (gdb) print p_vl
> $2 = (osm_vl15_t * const) 0x8090ca4
> (gdb) print p_vl->p_log
> $3 = (osm_log_t *) 0x0
> (gdb) print *p_vl
> $4 = {thread_state = OSM_THREAD_STATE_RUN, state = OSM_VL15_STATE_READY,
>   max_wire_smps = 32, signal = {condvar = {__c_lock = {__status = 0,
>         __spinlock = 0}, __c_waiting = 0x80a9940,
>       __padding = '\0' <repeats 27 times>, __align = 0}, signaled = 0,
>     manual_reset = 0, spinlock = {mutex = {__m_reserved = 0, __m_count = 0,
>         __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0,
>           __spinlock = 0}}, state = CL_INITIALIZED}, state = CL_INITIALIZED},
>   poller = {osd = {id = 65541, state = CL_INITIALIZED},
>     pfn_callback = 0x8076d6c <__osm_vl15_poller>, context = 0x8090ca4,
>     name = '\0' <repeats 15 times>}, rfifo = {end = {p_next = 0xab44b760,
>       p_prev = 0x8090c6c}, count = 135010200, state = 0}, ufifo = {end = {
>       p_next = 0x100, p_prev = 0x0}, count = 148497440, state = 135010200},
>   lock = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0,
>       __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}, state = 0},
>   p_vend = 0x0, p_log = 0x0, p_stats = 0x0, p_subn = 0x0, h_disp = 0x0,
>   p_lock = 0x0}
> (gdb) up
> #2  0x0806b1fc in __osm_sm_mad_ctrl_update_wire_stats (p_ctrl=0x8090110)
>     at osm_sm_mad_ctrl.c:228
> 228       osm_vl15_poll( p_ctrl->p_vl15 );
>

This looks like another memory scribbling issue. This time p_log was
cleared.

-- Hal
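
For readers unfamiliar with this failure mode, here is a minimal sketch of
how a stray write can clear a pointer such as p_log, and how a sanity check
turns the resulting SIGSEGV into a reportable error. Everything named demo_*
is hypothetical and is not the OpenSM code; the struct layout is invented for
illustration, and the oversized clear only simulates a scribble whose real
origin in this crash is unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for osm_log_t; not the OpenSM definition. */
typedef struct demo_log {
    int level;
} demo_log_t;

/* Hypothetical stand-in for osm_vl15_t: a buffer followed by pointers. */
typedef struct demo_vl15 {
    char        name[16];
    demo_log_t *p_log;
    void       *p_subn;
} demo_vl15_t;

/*
 * Simulated bug: zero `len` bytes starting at `name`. A `len` larger than
 * sizeof(name) runs past the buffer and clears the pointers behind it.
 * The assert keeps the write inside *p_vl so the demo itself stays well
 * defined. Assumes all-bits-zero is a null pointer, which holds on
 * mainstream platforms.
 */
static void demo_clear_name(demo_vl15_t *p_vl, size_t len)
{
    const size_t off = offsetof(demo_vl15_t, name);

    assert(len <= sizeof(*p_vl) - off);
    memset((unsigned char *)p_vl + off, 0, len);
}

/* Returns 1 if the pointer needed for logging is still set, else 0. */
static int demo_vl15_is_sane(const demo_vl15_t *p_vl)
{
    return p_vl != NULL && p_vl->p_log != NULL;
}

/* Log only when the object is sane; returns 0 on success, -1 otherwise. */
static int demo_poll(const demo_vl15_t *p_vl)
{
    if (!demo_vl15_is_sane(p_vl)) {
        fprintf(stderr, "demo_poll: p_log is NULL, object was overwritten\n");
        return -1;
    }
    if (p_vl->p_log->level & 16)
        printf("demo_poll: [\n");
    return 0;
}
```

Calling demo_clear_name() with sizeof(vl.name) leaves p_log intact, while
calling it with sizeof(vl) clears it, after which demo_poll() returns -1
instead of dereferencing a null pointer. A check like this only reports the
damage; finding the code that does the writing still needs something like
valgrind or a gdb watchpoint on the field.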






