[openib-general] nightly osm_sim report 2006-12-30:normal completion

Hal Rosenstock halr at voltaire.com
Sat Dec 30 14:25:00 PST 2006


On Sat, 2006-12-30 at 16:33, Eitan Zahavi wrote:
> Hal Rosenstock wrote:
> > Hi Eitan,
> >
> > On Sat, 2006-12-30 at 00:25, Eitan Zahavi wrote:
> >   
> >> OSM Simulation Regression Summary
> >> OpenSM rev = Fri_Dec_29_12:19:08_2006 2e0f81 
> >> ibutils rev = Wed_Dec_27_23:39:30_2006 60aebe 
> >> Total=405 Pass=330 Fail=75
> >>
> >> Pass:
> >> 45 Stability IS1-16.topo
> >> 45 Pkey IS1-16.topo
> >> 45 OsmStress IS1-16.topo
> >> 45 Multicast IS1-16.topo
> >> 45 LidMgr IS1-16.topo
> >> 15 Stability IS3-loop.topo
> >> 15 Stability IS3-128.topo
> >> 15 Pkey IS3-128.topo
> >> 15 OsmStress IS3-128.topo
> >> 15 Multicast IS3-loop.topo
> >> 15 Multicast IS3-128.topo
> >> 15 LidMgr IS3-128.topo
> >>
> >> Failures:
> >> 45 OsmTest IS1-16.topo
> >> 15 OsmTest IS3-loop.topo
> >> 15 OsmTest IS3-128.topo
> >>     
> >
> > Any idea on these osmtest failures ? I did add SA MFTRecord yesterday
> > and made a change to SA LFTRecord and SwitchInfoRecord the day before as
> > well as additional osmtests for MFTRecord and LFTRecord.
> >   
> Actually I get a core dump:

Thanks for providing this!

> #0  0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c, 
> block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
>     at osm_mcast_tbl.c:299
> 299         p_block[i] = (*p_tbl->p_mask_tbl)[mlid_start_ho + i][position];
> 
> (gdb) p i
> $1 = 2
> (gdb) p mlid_start_ho
> $2 = 6176
> (gdb) p position
> $3 = 0 '\0'
> (gdb) where
> #0  0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c, 
> block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
>     at osm_mcast_tbl.c:299
> #1  0x08073d29 in osm_switch_get_mft_block (p_sw=0x8f6eed8, 
> block_num=32961, position=0 '\0', p_block=0xb19e4d2c)
>     at ./../include/opensm/osm_switch.h:1074
> #2  0x08073b8c in __osm_mftr_rcv_new_mftr (p_rcv=0x80e9a6c, 
> p_sw=0x8f6eed8, p_list=0xb61c0370, lid=512, block=32961,
                                                    ^^^^^
The maximum block number is 511, so this out-of-range block value (32961)
is what caused the core dump. I just checked in a patch that should fix it.
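
For anyone following the numbers in the trace, here is a minimal sketch of what appears to happen and of the kind of range check that avoids it. This is not the checked-in patch. The helper names are hypothetical, and the block size of 32 entries, the 16-bit signed block parameter and the 16-bit start offset are assumptions, chosen because they reproduce the gdb values (block=32961 becoming block_num=-32575, and mlid_start_ho = 6176).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MFT_BLOCK_SIZE 32   /* assumed number of MLID entries per MFT block */
#define MFT_MAX_BLOCK  511  /* highest valid block number, per the note above */

/* Hypothetical guard: true only if the requested block can be looked up. */
static bool mft_block_in_range(uint32_t block)
{
    return block <= MFT_MAX_BLOCK;
}

/* What an unchecked 16-bit signed parameter does to the requested block.
 * Out-of-range conversion is implementation-defined in C; the usual
 * two's-complement wraparound (as GCC does) is assumed here. */
static int16_t narrow_block(uint16_t block)
{
    return (int16_t)block;
}

/* Start offset derived from the block number, truncated to 16 bits
 * (assumed to mirror how mlid_start_ho is computed). */
static uint16_t mlid_start_from_block(int16_t block_num)
{
    return (uint16_t)(block_num * MFT_BLOCK_SIZE);
}

/* Hypothetical checked variant: refuse the request instead of indexing
 * past the end of the table. */
static bool checked_mlid_start(uint32_t block, uint16_t *p_start)
{
    assert(p_start != NULL);
    if (!mft_block_in_range(block))
        return false;
    *p_start = (uint16_t)(block * MFT_BLOCK_SIZE);
    return true;
}
```

Under those assumptions, narrow_block(32961) gives -32575 and mlid_start_from_block(-32575) gives 6176, matching frame #0, while checked_mlid_start(32961, ...) rejects the request before any table access.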

-- Hal

>     position=0 '\0') at osm_sa_mft_record.c:181
> #3  0x08074273 in __osm_mftr_rcv_by_comp_mask (p_map_item=0x8f6eed8, 
> context=0xb61c0330) at osm_sa_mft_record.c:317
> #4  0x00cd9747 in cl_qmap_apply_func (p_map=0x80e8584, 
> pfn_func=0x8073f98 <__osm_mftr_rcv_by_comp_mask>, context=0xb61c0330)
>     at cl_map.c:287
> #5  0x08074653 in osm_mftr_rcv_process (p_rcv=0x80e9a6c, 
> p_madw=0x8f29f0c) at osm_sa_mft_record.c:390
> #6  0x08074ef2 in __osm_mftr_rcv_ctrl_disp_callback (context=0x80e9afc, 
> p_data=0x8f29f0c) at osm_sa_mft_record_ctrl.c:63
> #7  0x00cd3d4f in __cl_disp_worker (context=0x80e9d18) at 
> cl_dispatcher.c:102
> #8  0x00ce1297 in __cl_thread_pool_routine (context=0x80e9d5c) at 
> cl_threadpool.c:74
> #9  0x00ce0f61 in __cl_thread_wrapper (arg=0x8f1c690) at cl_thread.c:58
> #10 0x00361371 in start_thread () from /lib/tls/libpthread.so.0
> #11 0x001eaffe in clone () from /lib/tls/libc.so.6
> 
> 
> > Also, why are osmtest failures allowed for "normal completion" ?
> >
> > -- Hal