[Users] odd subset manager behavior

Weiny, Ira ira.weiny at intel.com
Mon May 13 12:34:31 PDT 2013


> -----Original Message-----
> From: users-bounces at lists.openfabrics.org [mailto:users-
> Subject: Re: [Users] odd subset manager behavior
> 
> Here are the full logs from that execution of the subnet manager. It looks
> like it was functioning properly, at least intermittently, right? (Since the
> intervals between errors are irregular and the fabric continued to function,
> I'm assuming the SM was more or less working.)
> 
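
To put a number on "irregular", here is a rough sketch that measures the
gaps between consecutive "Errors during initialization" entries. It is only
an illustration: the sample entries are copied from the log quoted further
down and split one per line, the line layout is inferred from those entries,
the year is supplied by hand because the log does not carry one, and
`error_gaps` is a name made up for this example.

```python
import re
from datetime import datetime

# A few entries copied from the log quoted below, one per line.
SAMPLE_LOG = """\
May 10 14:14:21 604958 [4B0C9700] 0x80 -> SUBNET UP
May 10 18:19:41 103790 [4B0C9700] 0x80 -> Errors during initialization
May 10 23:19:18 147739 [4B0C9700] 0x80 -> Errors during initialization
May 11 00:09:29 119760 [4B0C9700] 0x80 -> Errors during initialization
May 11 01:39:44 171770 [4B0C9700] 0x80 -> Errors during initialization
May 11 01:40:35 187802 [4B0C9700] 0x80 -> Errors during initialization
"""

# Layout assumed from those entries:
#   "Mon DD HH:MM:SS usec [thread] 0xNN -> message"
LINE_RE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \d+ \[\w+\] 0x[0-9A-Fa-f]+ -> (.*)$"
)


def error_gaps(log_text, year=2013, needle="Errors during initialization"):
    """Return gaps in whole seconds between consecutive matching entries."""
    stamps = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and m.group(2) == needle:
            # The log has no year, so the caller supplies one.
            stamps.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for gap in error_gaps(SAMPLE_LOG):
        print(f"{gap // 3600}h {gap % 3600 // 60:02d}m {gap % 60:02d}s")
```

On these five error entries the gaps run from 51 seconds to just under five
hours, which matches the "irregular" reading. Quote markers ("> ") would
need to be stripped before feeding in text copied from this mail.
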
> Ira, what level of debug should I be setting? I'm running opensm with -D
> 0x40.

I would add ERROR and INFO at least, i.e. OR these two bits into that value:

#define OSM_LOG_ERROR	0x01
#define OSM_LOG_INFO	0x02
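
As a quick sanity check on the arithmetic, a minimal sketch of combining
those two bits with the 0x40 already being passed. Python is used only for
the calculation; the two constants mirror the defines above, while
`CURRENT_FLAGS` and `with_error_and_info` are names made up for this
example, and it assumes -D takes the OR of the per-level bits.

```python
# Values copied from the two #define lines above.
OSM_LOG_ERROR = 0x01
OSM_LOG_INFO = 0x02

# The value currently passed to `opensm -D` (from the question above).
CURRENT_FLAGS = 0x40


def with_error_and_info(flags: int) -> int:
    """Return flags with the ERROR and INFO bits also set (hypothetical helper)."""
    return flags | OSM_LOG_ERROR | OSM_LOG_INFO


if __name__ == "__main__":
    # 0x40 | 0x01 | 0x02 == 0x43
    print(f"opensm -D {with_error_and_info(CURRENT_FLAGS):#04x}")
```

So under that assumption the existing 0x40 would become 0x43.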

Ira

> thanks
>  -nld
> 
> 
> May 10 14:14:19 856909 [512F9700] 0x80 -> OpenSM 3.3.15
> May 10 14:14:19 872752 [512F9700] 0x80 -> Entering DISCOVERING state
> May 10 14:14:20 844905 [4B0C9700] 0x80 -> Entering MASTER state
> May 10 14:14:21 604958 [4B0C9700] 0x80 -> SUBNET UP
> May 10 18:19:41 103790 [4B0C9700] 0x80 -> Errors during initialization
> May 10 23:19:18 147739 [4B0C9700] 0x80 -> Errors during initialization
> May 11 00:09:29 119760 [4B0C9700] 0x80 -> Errors during initialization
> May 11 01:39:44 171770 [4B0C9700] 0x80 -> Errors during initialization
> May 11 01:40:35 187802 [4B0C9700] 0x80 -> Errors during initialization
> May 11 04:20:43 199806 [4B0C9700] 0x80 -> Errors during initialization
> May 11 06:17:38 399745 [4B0C9700] 0x80 -> Errors during initialization
> May 11 07:20:36 183738 [4B0C9700] 0x80 -> Errors during initialization
> May 11 07:42:16 207784 [4B0C9700] 0x80 -> Errors during initialization
> May 11 08:07:02 199763 [4B0C9700] 0x80 -> Errors during initialization
> May 11 12:00:18 243758 [4B0C9700] 0x80 -> Errors during initialization
> May 11 14:05:18 295779 [4B0C9700] 0x80 -> Errors during initialization
> May 11 15:38:14 299780 [4B0C9700] 0x80 -> Errors during initialization
> May 11 16:03:30 503760 [4B0C9700] 0x80 -> Errors during initialization
> May 11 16:16:01 295739 [4B0C9700] 0x80 -> Errors during initialization
> May 11 17:37:38 303782 [4B0C9700] 0x80 -> Errors during initialization
> May 11 17:38:42 327748 [4B0C9700] 0x80 -> Errors during initialization
> May 11 22:01:49 339753 [4B0C9700] 0x80 -> Errors during initialization
> May 11 22:02:19 379695 [4B0C9700] 0x80 -> Errors during initialization
> May 12 03:04:48 403771 [4B0C9700] 0x80 -> Errors during initialization
> May 12 03:31:22 403752 [4B0C9700] 0x80 -> Errors during initialization
> May 12 03:44:41 415740 [4B0C9700] 0x80 -> Errors during initialization
> May 12 10:35:59 475849 [4B0C9700] 0x80 -> Errors during initialization
> May 12 10:41:39 467814 [4B0C9700] 0x80 -> Errors during initialization
> May 12 11:27:02 471770 [4B0C9700] 0x80 -> Errors during initialization
> May 12 13:34:57 467829 [4B0C9700] 0x80 -> Errors during initialization
> May 12 14:33:43 487784 [4B0C9700] 0x80 -> Errors during initialization
> May 12 14:35:56 507728 [4B0C9700] 0x80 -> Errors during initialization
> May 12 14:39:56 499739 [4B0C9700] 0x80 -> Errors during initialization
> May 12 15:37:38 687764 [4B0C9700] 0x80 -> Errors during initialization
> May 12 18:06:41 531744 [4B0C9700] 0x80 -> Errors during initialization
> May 12 18:32:50 551773 [4B0C9700] 0x80 -> Errors during initialization
> May 12 18:54:21 511818 [4B0C9700] 0x80 -> Errors during initialization
> May 12 19:16:14 527799 [4B0C9700] 0x80 -> Errors during initialization
> May 13 01:34:58 583765 [4B0C9700] 0x80 -> Errors during initialization
> May 13 02:25:13 615760 [4B0C9700] 0x80 -> Errors during initialization
> May 13 05:16:22 611734 [4B0C9700] 0x80 -> Errors during initialization
> May 13 05:22:09 603862 [4B0C9700] 0x80 -> Errors during initialization
> May 13 05:56:45 851842 [4B0C9700] 0x80 -> Errors during initialization
> May 13 06:15:47 851775 [4B0C9700] 0x80 -> Errors during initialization
> May 13 13:30:24 290706 [512F9700] 0x80 -> Exiting SM
> 
> On Mon, May 13, 2013 at 2:02 PM, Hal Rosenstock
> <hal.rosenstock at gmail.com> wrote:
> >
> >
> > On Mon, May 13, 2013 at 2:50 PM, Narayan Desai
> > <narayan.desai at gmail.com>
> > wrote:
> >>
> >> Our subnet manager started producing some weird error messages:
> >> May 13 05:16:22 611734 [4B0C9700] 0x80 -> Errors during initialization
> >> May 13 05:22:09 603862 [4B0C9700] 0x80 -> Errors during initialization
> >> May 13 05:56:45 851842 [4B0C9700] 0x80 -> Errors during initialization
> >> May 13 06:15:47 851775 [4B0C9700] 0x80 -> Errors during initialization
> >>
> >> The subnet manager was actually up and running prior to this, and had
> >> successfully configured the network:
> >> May 10 14:14:19 856909 [512F9700] 0x80 -> OpenSM 3.3.15
> >> May 10 14:14:19 872752 [512F9700] 0x80 -> Entering DISCOVERING state
> >> May 10 14:14:20 844905 [4B0C9700] 0x80 -> Entering MASTER state
> >> May 10 14:14:21 604958 [4B0C9700] 0x80 -> SUBNET UP
> >> May 10 18:19:41 103790 [4B0C9700] 0x80 -> Errors during initialization
> >> May 10 23:19:18 147739 [4B0C9700] 0x80 -> Errors during initialization
> >> May 11 00:09:29 119760 [4B0C9700] 0x80 -> Errors during initialization
> >>
> >> Any clue what is causing this?
> >
> >
> > It means some critical set to configure the subnet failed.
> >
> > Are there error messages in the opensm log?
> >
> > -- Hal
> >
> >>
> >> thanks.
> >>  -nld
> >> _______________________________________________
> >> Users mailing list
> >> Users at lists.openfabrics.org
> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/users
> >
> >


