[openib-general] nightly osm_sim report 2006-12-30:normal completion

Hal Rosenstock halr at voltaire.com
Mon Jan 1 07:13:05 PST 2007


On Sat, 2006-12-30 at 17:25, Hal Rosenstock wrote:
> On Sat, 2006-12-30 at 16:33, Eitan Zahavi wrote:
> > Hal Rosenstock wrote:
> > > Hi Eitan,
> > >
> > > On Sat, 2006-12-30 at 00:25, Eitan Zahavi wrote:
> > >   
> > >> OSM Simulation Regression Summary
> > >> OpenSM rev = Fri_Dec_29_12:19:08_2006 2e0f81 
> > >> ibutils rev = Wed_Dec_27_23:39:30_2006 60aebe 
> > >> Total=405 Pass=330 Fail=75
> > >>
> > >> Pass:
> > >> 45 Stability IS1-16.topo
> > >> 45 Pkey IS1-16.topo
> > >> 45 OsmStress IS1-16.topo
> > >> 45 Multicast IS1-16.topo
> > >> 45 LidMgr IS1-16.topo
> > >> 15 Stability IS3-loop.topo
> > >> 15 Stability IS3-128.topo
> > >> 15 Pkey IS3-128.topo
> > >> 15 OsmStress IS3-128.topo
> > >> 15 Multicast IS3-loop.topo
> > >> 15 Multicast IS3-128.topo
> > >> 15 LidMgr IS3-128.topo
> > >>
> > >> Failures:
> > >> 45 OsmTest IS1-16.topo
> > >> 15 OsmTest IS3-loop.topo
> > >> 15 OsmTest IS3-128.topo
> > >>     
> > >
> > > Any idea what is causing these osmtest failures? I added SA MFTRecord
> > > yesterday, changed SA LFTRecord and SwitchInfoRecord the day before,
> > > and also added osmtests for MFTRecord and LFTRecord.
> > >   
> > Actually I get a core dump:
> 
> Thanks for providing this!
> 
> > #0  0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c, 
> > block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
> >     at osm_mcast_tbl.c:299
> > 299         p_block[i] = (*p_tbl->p_mask_tbl)[mlid_start_ho + i][position];
> > 
> > (gdb) p i
> > $1 = 2
> > (gdb) p mlid_start_ho
> > $2 = 6176
> > (gdb) p position
> > $3 = 0 '\0'
> > (gdb) where
> > #0  0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c, 
> > block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
> >     at osm_mcast_tbl.c:299
> > #1  0x08073d29 in osm_switch_get_mft_block (p_sw=0x8f6eed8, 
> > block_num=32961, position=0 '\0', p_block=0xb19e4d2c)
> >     at ./../include/opensm/osm_switch.h:1074
> > #2  0x08073b8c in __osm_mftr_rcv_new_mftr (p_rcv=0x80e9a6c, 
> > p_sw=0x8f6eed8, p_list=0xb61c0370, lid=512, block=32961,
>                                                     ^^^^^
> The max block number is 511, so block=32961 is out of range and is
> what caused the core dump. I just checked in a patch for this, which
> should work.

That patch didn't work.

Can you dump p_sw->mcast_tbl?
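
For reference, the numbers in the trace look consistent with a 16-bit
wrap: 32961 read as a signed 16-bit value is -32575 (frame #1 vs.
frame #0), and -32575 * 32 truncated to 16 bits is 6176, which is the
mlid_start_ho that gdb printed. Below is a minimal sketch of that
arithmetic together with one possible range check. The names
MFT_BLOCK_SIZE, MFT_MAX_BLOCK, wrap_to_int16, mlid_start_from_block and
mft_block_in_range are made up for illustration and are not the OpenSM
code; the 32 entries per block is an assumption, and 511 is the max
block number noted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define MFT_BLOCK_SIZE 32   /* assumed number of MLID entries per block */
#define MFT_MAX_BLOCK  511  /* max block number quoted in this thread */

/* Interpret a 16-bit value as two's-complement signed, without relying
 * on implementation-defined narrowing conversions. */
int32_t wrap_to_int16(uint16_t v)
{
    return (v >= 32768u) ? (int32_t)v - 65536 : (int32_t)v;
}

/* Starting MLID offset for a block, truncated to 16 bits. Conversion to
 * an unsigned type is well defined in C (reduction modulo 65536). */
uint16_t mlid_start_from_block(int32_t block_num)
{
    return (uint16_t)(block_num * MFT_BLOCK_SIZE);
}

/* One possible guard: reject any block number above the table limit
 * before it is narrowed or used as an index. */
int mft_block_in_range(uint32_t block_num)
{
    return block_num <= MFT_MAX_BLOCK;
}

/* Reproduce the values seen in the backtrace. */
void check_trace_values(void)
{
    assert(wrap_to_int16(32961) == -32575);        /* frame #0 block_num */
    assert(mlid_start_from_block(-32575) == 6176); /* gdb: mlid_start_ho */
    assert(!mft_block_in_range(32961));            /* would be rejected  */
}
```

If that reading is right, the check has to happen on the full-width
value from the SA request, before it is passed down as a 16-bit signed
block number.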

Thanks.

-- Hal

> 
> -- Hal
> 
> >     position=0 '\0') at osm_sa_mft_record.c:181
> > #3  0x08074273 in __osm_mftr_rcv_by_comp_mask (p_map_item=0x8f6eed8, 
> > context=0xb61c0330) at osm_sa_mft_record.c:317
> > #4  0x00cd9747 in cl_qmap_apply_func (p_map=0x80e8584, 
> > pfn_func=0x8073f98 <__osm_mftr_rcv_by_comp_mask>, context=0xb61c0330)
> >     at cl_map.c:287
> > #5  0x08074653 in osm_mftr_rcv_process (p_rcv=0x80e9a6c, 
> > p_madw=0x8f29f0c) at osm_sa_mft_record.c:390
> > #6  0x08074ef2 in __osm_mftr_rcv_ctrl_disp_callback (context=0x80e9afc, 
> > p_data=0x8f29f0c) at osm_sa_mft_record_ctrl.c:63
> > #7  0x00cd3d4f in __cl_disp_worker (context=0x80e9d18) at 
> > cl_dispatcher.c:102
> > #8  0x00ce1297 in __cl_thread_pool_routine (context=0x80e9d5c) at 
> > cl_threadpool.c:74
> > #9  0x00ce0f61 in __cl_thread_wrapper (arg=0x8f1c690) at cl_thread.c:58
> > #10 0x00361371 in start_thread () from /lib/tls/libpthread.so.0
> > #11 0x001eaffe in clone () from /lib/tls/libc.so.6
> > 
> > 
> > > > Also, why are osmtest failures allowed for "normal completion"?
> > >
> > > -- Hal
> > >
> > >
> > >
> > > _______________________________________________
> > > openib-general mailing list
> > > openib-general at openib.org
> > > http://openib.org/mailman/listinfo/openib-general
> > >
> > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > >   
> > 