[openib-general] opensm segfault?
Hal Rosenstock
halr at voltaire.com
Wed May 17 03:49:56 PDT 2006
On Wed, 2006-05-17 at 02:10, Eitan Zahavi wrote:
> cl_memcpy should have some debug capabilities on top of memcpy.
I don't see any. Did I miss something?
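
For illustration only, here is a minimal sketch of what "debug capabilities
on top of memcpy" could look like. It is not the complib code: dbg_memcpy and
DBG_MEMCPY are hypothetical names, and the choice to refuse the copy and
return NULL is an assumption. It would turn a NULL source like the p_src=0x0
in the backtrace quoted below into a logged error rather than a segfault.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper, NOT the complib cl_memcpy. It reports a NULL
 * argument (with the caller's file and line) instead of handing it to
 * memcpy. Returns p_dest on success, NULL if the copy was refused. */
static void *dbg_memcpy(void *p_dest, const void *p_src, size_t count,
                        const char *file, int line)
{
    assert(file != NULL);

    if (count != 0 && (p_dest == NULL || p_src == NULL)) {
        fprintf(stderr, "%s:%d: memcpy refused: dest=%p src=%p count=%lu\n",
                file, line, p_dest, p_src, (unsigned long)count);
        return NULL;
    }
    if (count == 0)
        return p_dest;  /* nothing to copy */

    return memcpy(p_dest, p_src, count);
}

/* Convenience macro that records the call site. */
#define DBG_MEMCPY(d, s, n) dbg_memcpy((d), (s), (n), __FILE__, __LINE__)
```

A caller would still have to check the return value; the wrapper only makes
the bad call visible, it does not fix whatever left the pointer NULL.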
..
> cl memory management provides a means to track all memory allocations, etc.
Yes, there is extra memory tracking code for malloc and free. This is a
separable item in my mind right now.
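
To make the distinction concrete, allocation tracking of this general kind
can be sketched as a counter around malloc and free. This is an assumed,
simplified illustration with hypothetical names (trk_malloc, trk_free,
live_allocs), not the actual complib tracking code, and it is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of tracked allocations that have not been freed yet. */
static size_t live_allocs;

/* Hypothetical tracking allocator: counts each successful malloc. */
static void *trk_malloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        live_allocs++;
    return p;
}

/* Hypothetical tracking free: counts each release of a non-NULL pointer. */
static void trk_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_allocs > 0);  /* more frees than allocations is a bug */
    live_allocs--;
    free(p);
}
```

A nonzero live_allocs at shutdown would indicate a leak. Nothing here depends
on how memcpy is wrapped, which is why the two can be treated separately.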
-- Hal
> Eitan Zahavi
> Senior Engineering Director, Software Architect
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
>
> > -----Original Message-----
> > From: openib-general-bounces at openib.org [mailto:openib-general-bounces at openib.org] On Behalf Of Sasha Khapyorsky
> > Sent: Wednesday, May 17, 2006 2:11 AM
> > To: Troy Benjegerdes
> > Cc: openib-general at openib.org
> > Subject: Re: [openib-general] opensm segfault?
> >
> > Hi Troy,
> >
> > On 14:41 Tue 16 May , Troy Benjegerdes wrote:
> > > I got this after an indeterminate amount of time running opensm..
> >
> > Is this reproducible, or is it a completely random failure?
> >
> > > (gdb) bt
> > > #0 0x00002b90b0dbebf3 in cl_memcpy (p_dest=0x2aaaaac88850, p_src=0x0,
> > > count=64) at cl_memory_osd.c:87
> > > #1 0x0000000000415053 in osm_pkey_tbl_sync_new_blocks (
> > > p_pkey_tbl=0x2aaaaad99228) at osm_pkey.c:127
> > > #2 0x0000000000416687 in osm_pkey_mgr_process (p_osm=0x580e40)
> > > at osm_pkey_mgr.c:407
> > > #3 0x000000000043bb22 in osm_state_mgr_process (p_mgr=0x581ad8,
> > > signal=3)
> > > at osm_state_mgr.c:2243
> > > #4 0x000000000043c88f in __osm_state_mgr_ctrl_disp_callback (
> > > context=0x5819e8, p_data=0x3) at osm_state_mgr_ctrl.c:70
> > > #5 0x00002b90b0db9437 in __cl_disp_worker (context=0x5831f0)
> > > at cl_dispatcher.c:108
> > > #6 0x00002b90b0dc1ca3 in __cl_thread_pool_routine (context=0x583268)
> > > at cl_threadpool.c:78
> > > #7 0x00002b90b0dc1ae2 in __cl_thread_wrapper (arg=0x584750) at
> > > cl_thread.c:61
> > > #8 0x00002b90b0fe3b1c in start_thread () from /lib/libpthread.so.0
> > > #9 0x00002b90b12c8273 in clone () from /lib/libc.so.6
> > >
> > >
> > >
> > > And why the heck is "cl_memcpy" just a call to 'memcpy' anyway? This
> > > just seems like excessive unneeded abstraction.
> >
> > Absolutely agree with you.
> >
> > Sasha.
> >
> > > I'm running opensm from subversion rev 7091..
> > >
> > > May 10 16:27:53 145969 [0000] -> OpenSM Rev:openib-1.2.0 OpenIB svn
> > > 6251:7091M
> > >
> > > the only local changes are as follows:
> > >
> > > troy at opteron1:/usr/src/openib-src/userspace/management$ svn diff
> > > Index: osm/opensm/osm_port_info_rcv.c
> > > ===================================================================
> > > --- osm/opensm/osm_port_info_rcv.c (revision 7091)
> > > +++ osm/opensm/osm_port_info_rcv.c (working copy)
> > > @@ -469,9 +469,14 @@
> > > goto Exit;
> > > }
> > >
> > > +#if 0
> > > /* Check for IBM eHCA firmware defect in reporting partition
> > > * enforcement cap */
> > > if (cl_ntoh32(ib_node_info_get_vendor_id(&p_node->node_info)) ==
> > > IBM_VENDOR_ID)
> > > p_switch->switch_info.enforce_cap = 0;
> > > +#endif
> > > + /* Check for busted divergenet switch on ameslab network */
> > > + if (cl_ntoh64(p_node->node_info.node_guid) == 0x00084e0000000152)
> > > + p_switch->switch_info.enforce_cap = 0;
> > >
> > > /* Bail out if this is a switch with no partition enforcement
> > > * capability */
> > > if (cl_ntoh16(p_switch->switch_info.enforce_cap) == 0)
> > > _______________________________________________
> > > openib-general mailing list
> > > openib-general at openib.org
> > > http://openib.org/mailman/listinfo/openib-general
> > >
> > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general