[openib-general] nightly osm_sim report 2006-12-30:normal completion
Hal Rosenstock
halr at voltaire.com
Mon Jan 1 07:13:05 PST 2007
On Sat, 2006-12-30 at 17:25, Hal Rosenstock wrote:
> On Sat, 2006-12-30 at 16:33, Eitan Zahavi wrote:
> > Hal Rosenstock wrote:
> > > Hi Eitan,
> > >
> > > On Sat, 2006-12-30 at 00:25, Eitan Zahavi wrote:
> > >
> > >> OSM Simulation Regression Summary
> > >> OpenSM rev = Fri_Dec_29_12:19:08_2006 2e0f81
> > >> ibutils rev = Wed_Dec_27_23:39:30_2006 60aebe
> > >> Total=405 Pass=330 Fail=75
> > >>
> > >> Pass:
> > >> 45 Stability IS1-16.topo
> > >> 45 Pkey IS1-16.topo
> > >> 45 OsmStress IS1-16.topo
> > >> 45 Multicast IS1-16.topo
> > >> 45 LidMgr IS1-16.topo
> > >> 15 Stability IS3-loop.topo
> > >> 15 Stability IS3-128.topo
> > >> 15 Pkey IS3-128.topo
> > >> 15 OsmStress IS3-128.topo
> > >> 15 Multicast IS3-loop.topo
> > >> 15 Multicast IS3-128.topo
> > >> 15 LidMgr IS3-128.topo
> > >>
> > >> Failures:
> > >> 45 OsmTest IS1-16.topo
> > >> 15 OsmTest IS3-loop.topo
> > >> 15 OsmTest IS3-128.topo
> > >>
> > >
> > > Any idea what is causing these osmtest failures? I added SA MFTRecord
> > > yesterday, and the day before I changed SA LFTRecord and SwitchInfoRecord
> > > and added osmtests for MFTRecord and LFTRecord.
> > >
> > Actually I get a core dump:
>
> Thanks for providing this!
>
> > #0 0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c,
> > block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
> > at osm_mcast_tbl.c:299
> > 299 p_block[i] = (*p_tbl->p_mask_tbl)[mlid_start_ho + i][position];
> >
> > (gdb) p i
> > $1 = 2
> > (gdb) p mlid_start_ho
> > $2 = 6176
> > (gdb) p position
> > $3 = 0 '\0'
> > (gdb) where
> > #0 0x0805c265 in osm_mcast_tbl_get_block (p_tbl=0x8f6ef6c,
> > block_num=-32575, position=0 '\0', p_block=0xb19e4d2c)
> > at osm_mcast_tbl.c:299
> > #1 0x08073d29 in osm_switch_get_mft_block (p_sw=0x8f6eed8,
> > block_num=32961, position=0 '\0', p_block=0xb19e4d2c)
> > at ./../include/opensm/osm_switch.h:1074
> > #2 0x08073b8c in __osm_mftr_rcv_new_mftr (p_rcv=0x80e9a6c,
> > p_sw=0x8f6eed8, p_list=0xb61c0370, lid=512, block=32961,
> block=32961 is out of range: the max block number is 511, so this is
> what caused the core dump.
> I just checked in a patch for this which should work.
The patch didn't work.
Can you dump p_sw->mcast_tbl?
Thanks.
-- Hal
>
> -- Hal
>
> > position=0 '\0') at osm_sa_mft_record.c:181
> > #3 0x08074273 in __osm_mftr_rcv_by_comp_mask (p_map_item=0x8f6eed8,
> > context=0xb61c0330) at osm_sa_mft_record.c:317
> > #4 0x00cd9747 in cl_qmap_apply_func (p_map=0x80e8584,
> > pfn_func=0x8073f98 <__osm_mftr_rcv_by_comp_mask>, context=0xb61c0330)
> > at cl_map.c:287
> > #5 0x08074653 in osm_mftr_rcv_process (p_rcv=0x80e9a6c,
> > p_madw=0x8f29f0c) at osm_sa_mft_record.c:390
> > #6 0x08074ef2 in __osm_mftr_rcv_ctrl_disp_callback (context=0x80e9afc,
> > p_data=0x8f29f0c) at osm_sa_mft_record_ctrl.c:63
> > #7 0x00cd3d4f in __cl_disp_worker (context=0x80e9d18) at
> > cl_dispatcher.c:102
> > #8 0x00ce1297 in __cl_thread_pool_routine (context=0x80e9d5c) at
> > cl_threadpool.c:74
> > #9 0x00ce0f61 in __cl_thread_wrapper (arg=0x8f1c690) at cl_thread.c:58
> > #10 0x00361371 in start_thread () from /lib/tls/libpthread.so.0
> > #11 0x001eaffe in clone () from /lib/tls/libc.so.6
> >
> >
> > > Also, why are osmtest failures allowed for "normal completion"?
> > >
> > > -- Hal
> > >
> >
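
The numbers in the backtrace are consistent with 16-bit wraparound: block=32961 in frames #1 and #2 appears as block_num=-32575 in frame #0, and mlid_start_ho=6176 matches that value multiplied by the block size and truncated to 16 bits. Below is a minimal sketch of that arithmetic plus a range check. It assumes 32 MLID entries per block and a signed 16-bit block_num parameter. Neither is stated in this thread, the helper names are hypothetical, and this is not the patch that was checked in.

```c
#include <stdint.h>

#define MCAST_BLOCK_SIZE 32   /* assumption: MLID entries per MFT block */
#define MAX_BLOCK_NUM    511  /* from the thread: "max block number is 511" */

/* Hypothetical helper: nonzero if the requested block can be indexed. */
int mft_block_in_range(uint32_t block_num)
{
    return block_num <= MAX_BLOCK_NUM;
}

/* Reinterpret a 16-bit value as signed, as a signed 16-bit parameter
 * would see it (written out to avoid implementation-defined casts). */
int32_t as_int16(uint16_t v)
{
    return v >= 0x8000 ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* First MLID offset of a block after truncation to 16 bits.
 * Unsigned arithmetic wraps modulo 2^32, so the low 16 bits are exact. */
uint16_t wrapped_mlid_start(int32_t block_num)
{
    return (uint16_t)((uint32_t)block_num * MCAST_BLOCK_SIZE);
}
```

With these definitions, as_int16(32961) gives -32575 and wrapped_mlid_start(-32575) gives 6176, the values gdb printed. A caller that rejects any block for which mft_block_in_range() returns 0 would never reach the table lookup at osm_mcast_tbl.c:299 with this request.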