[ewg] OpenSM from ofed-1.2 and ofed-1.3 clients
Hal Rosenstock
hrosenstock at xsigo.com
Tue Jun 3 10:13:07 PDT 2008
Steve,
One more thought below...
On Tue, 2008-06-03 at 09:49 -0700, Hal Rosenstock wrote:
> Steve,
>
> On Tue, 2008-06-03 at 11:19 -0500, Steve Wise wrote:
> > Hello opensm gurus:
> >
> > Sandia is seeing problems after migrating up to ofed-1.3. They are
> > still using an ofed-1.2 opensm but with ofed-1.3 clients, updated from
> > ofed-1.2.5.
>
> Was the OpenSM node changed in some way, or only the end nodes?
>
> > They are getting the errors below.
> >
> > Q: should this work? Or are there backwards compat issues?
>
> I haven't explicitly tried it, but I would think it should work.
>
> The errors below are timeouts on switch MFT (MulticastForwardingTable)
> sets, which are only indirectly related to the end nodes (the MC SA
> joins are what cause the MC routing, and hence those tables, to be set),
> so I don't see the relationship, but I might be missing something.
>
> -- Hal
>
> > Thanks,
> >
> > Steve.
> >
> >
> >
> >
> > log:
> > > May 23 08:29:22 408613 [45007960] -> __osm_sm_mad_ctrl_send_err_cb:
> > > ERR 3113: MAD completed in error (IB_TIMEOUT)
> > > May 23 08:29:22 408622 [45007960] -> __osm_sm_mad_ctrl_send_err_cb:
> > > ERR 3119: Set method failed
> > > May 23 08:29:22 408652 [45007960] -> SMP dump:
> > > base_ver................0x1
> > > mgmt_class..............0x81
> > > class_ver...............0x1
> > > method..................0x2 (SubnSet)
> > > D bit...................0x0
> > > status..................0x0
> > > hop_ptr.................0x0
> > > hop_count...............0x3
> > > trans_id................0x1694a4
> > > attr_id.................0x1B
> > > (MulticastForwardingTable)
> > > resv....................0x0
> > > attr_mod................0x10000000
> > > m_key...................0x0000000000000000
> > > dr_slid.................0xFFFF
> > > dr_dlid.................0xFFFF
> > >
> > > Initial path: 0,1,14,9
Could this switch SMA be "stuck"?
Could you try
    smpquery -D nodeinfo 0,1,14,9
and
    smpquery -D nodeinfo 0,1,14
from the SM node?
-- Hal
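
The same check can be extended to every hop of the directed route, so that the first hop that stops answering stands out. The following is a minimal sketch that only builds and prints the command lines; it does not run them. The `-D` flag and the `nodeinfo` operation are the ones used above, while the helper names `dr_path_prefixes` and `smpquery_commands` are made up for illustration.

```python
def dr_path_prefixes(path):
    """Return every prefix of a directed-route path, shortest first.

    Example: "0,1,14,9" -> ["0", "0,1", "0,1,14", "0,1,14,9"]
    """
    hops = [h.strip() for h in path.split(",") if h.strip()]
    return [",".join(hops[:i]) for i in range(1, len(hops) + 1)]


def smpquery_commands(path, op="nodeinfo"):
    """Build one 'smpquery -D <op> <prefix>' line per hop (hypothetical helper)."""
    return ["smpquery -D %s %s" % (op, prefix) for prefix in dr_path_prefixes(path)]


if __name__ == "__main__":
    # Print the commands for the path seen in the log; run them by hand
    # from the SM node and note where the replies stop.
    for cmd in smpquery_commands("0,1,14,9"):
        print(cmd)
```

If the query to 0,1,14 answers but the one to 0,1,14,9 times out, that would point at the last switch (or the link to it) rather than at the SM itself.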
> > > Return path: 0,0,0,0
> > > Reserved: [0][0][0][0][0][0][0]
> > >
> > > 00 40 00 40 00 00 00 40 00 00 00 00
> > > 00 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 00
> > >
> > > May 23 08:29:22 408689 [45007960] -> umad_receiver: ERR 5409: send
> > > completed with error (method=0x2 attr=0x1B trans_id=0x14001694a5) --
> > > dropping
> > > May 23 08:29:22 408699 [45007960] -> umad_receiver: ERR 5411: DR SMP
> > > Hop Ptr: 0x0
> > > May 23 08:29:22 408711 [45007960] -> Received SMP on a 3 hop path:
> > > Initial path = 0,0,0,0
> > > Return path = 0,0,0,0
> > > May 23 08:29:22 408721 [45007960] -> __osm_sm_mad_ctrl_send_err_cb:
> > > ERR 3113: MAD completed in error (IB_TIMEOUT)
> > > May 23 08:29:22 408729 [45007960] -> __osm_sm_mad_ctrl_send_err_cb:
> > > ERR 3119: Set method failed
> > > May 23 08:29:22 408759 [45007960] -> SMP dump:
> > > base_ver................0x1
> > > mgmt_class..............0x81
> > > class_ver...............0x1
> > > method..................0x2 (SubnSet)
> > > D bit...................0x0
> > > status..................0x0
> > > hop_ptr.................0x0
> > > hop_count...............0x3
> > > trans_id................0x1694a5
> > > attr_id.................0x1B
> > > (MulticastForwardingTable)
> > > resv....................0x0
> > > attr_mod................0x1
> > > m_key...................0x0000000000000000
> > > dr_slid.................0xFFFF
> > > dr_dlid.................0xFFFF
> > >
> > > Initial path: 0,1,14,9
> > > Return path: 0,0,0,0
> > > Reserved: [0][0][0][0][0][0][0]
> > >
> > > 00 00 00 00 00 00 00 20 00 00 00 00
> > > 00 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 04 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 00
> > >
> > > 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 10
> > >
> > > May 23 08:29:22 412432 [42803960] -> Errors during initialization
> > > May 23 08:29:22 412508 [42803960] -> __osm_state_mgr_init_errors_msg:
> >
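
A note on the two SMP dumps above: the attr_mod of a MulticastForwardingTable Set says which slice of the switch's table was being written. Below is a minimal sketch of decoding it. It assumes the layout used in OpenSM's ib_types.h (block number in the low 9 bits, port-mask position in the top 4 bits, 32 MLID entries per block, 16 ports per position, multicast LIDs starting at 0xC000); those constants should be checked against the IBA spec before relying on them. The function name `decode_mft_attr_mod` is made up for illustration.

```python
MLID_BASE = 0xC000        # first multicast LID
ENTRIES_PER_BLOCK = 32    # MLID entries per MFT block (assumed layout)
PORTS_PER_POSITION = 16   # port-mask bits per entry (assumed layout)


def decode_mft_attr_mod(attr_mod):
    """Split an MFT attribute modifier into block, position, MLID and port ranges."""
    block = attr_mod & 0x1FF            # low 9 bits: block number
    position = (attr_mod >> 28) & 0xF   # top 4 bits: port-mask position
    first_mlid = MLID_BASE + block * ENTRIES_PER_BLOCK
    first_port = position * PORTS_PER_POSITION
    return {
        "block": block,
        "position": position,
        "mlids": (first_mlid, first_mlid + ENTRIES_PER_BLOCK - 1),
        "ports": (first_port, first_port + PORTS_PER_POSITION - 1),
    }


if __name__ == "__main__":
    # The two attr_mod values that appear in the log.
    for value in (0x10000000, 0x1):
        d = decode_mft_attr_mod(value)
        print("attr_mod 0x%08X: block %d, position %d, MLIDs 0x%04X-0x%04X, ports %d-%d"
              % (value, d["block"], d["position"],
                 d["mlids"][0], d["mlids"][1], d["ports"][0], d["ports"][1]))
```

Under those assumptions, attr_mod 0x10000000 is block 0 for ports 16-31, and attr_mod 0x1 is block 1 for ports 0-15, so both timeouts are MFT writes to the same switch at path 0,1,14,9.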
> > _______________________________________________
> > ewg mailing list
> > ewg at lists.openfabrics.org
> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg