[ofw] RE: HPC head-node slow down when OpenSM is started on the head-node (RC4, svn.1691).

Tzachi Dar tzachid at mellanox.co.il
Mon Oct 27 15:28:29 PDT 2008


Hi,

This is a bug that can arise for a few reasons (see below), and we have
been able to reproduce it here, although we are still seeing different
issues. If you can find out how to reproduce it without another SM,
that would be great.

As to your problem: generally speaking, the code is stuck at
ipoib_port_up. There is an assumption that ipoib_port_up is called
after ipoib_port_down; it is 99% likely that you have found a flow in
which this is not the case. On port down we close the QPs, so that all
packets that have been posted for receive are freed.
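
For reference, here is a minimal sketch of that assumption. The handle
and field names (ib_mgr.h_qp and the depth counters) follow the ipoib
sources, but the function itself is illustrative, not the actual code:

	/* Sketch of what ipoib_port_down is assumed to have done before
	 * the next ipoib_port_up runs: moving the QP to the error state
	 * flushes every outstanding WR back through the CQ callbacks,
	 * which drain the depth counters to zero. */
	static void
	__port_down_sketch(
		IN		ipoib_port_t* const		p_port )
	{
		ib_qp_mod_t		qp_mod;

		cl_memclr( &qp_mod, sizeof(ib_qp_mod_t) );
		qp_mod.req_state = IB_QPS_ERROR;

		/* Flush all posted send/receive WRs. */
		ib_modify_qp( p_port->ib_mgr.h_qp, &qp_mod );

		/* Wait for all work requests to get flushed. */
		while( p_port->recv_mgr.depth || p_port->send_mgr.depth )
			cl_thread_suspend( 0 );
	}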

Taking one step up, it seems that we have a problem in the plug-and-play
mechanism (IBAL). In general, you are being called from the function
__ipoib_pnp_cb. This function is quite complicated, as it takes into
account both the current state of ipoib and the different events.
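
Roughly, the dispatch looks like the sketch below (illustrative only,
not the actual function), which is why an unexpected (event, state)
pairing can reach ipoib_port_up without a matching ipoib_port_down in
between:

	/* Illustrative dispatch sketch; the real __ipoib_pnp_cb also
	 * consults the adapter state before acting on each event. */
	switch( p_pnp_rec->pnp_event )
	{
	case IB_PNP_PORT_ACTIVE:
		/* Assumes the QPs were already flushed by a prior
		 * port down. */
		ipoib_port_up( p_port, (ib_pnp_port_rec_t*)p_pnp_rec );
		break;

	case IB_PNP_PORT_DOWN:
		/* Closes the QPs so all posted receives get flushed. */
		ipoib_port_down( p_port );
		break;

	default:
		/* Remaining events are handled according to the
		 * current adapter state. */
		break;
	}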

In order for me to have more information, please send me a log with
prints at the following places.
Please add the following print at the start of __ipoib_pnp_cb
(as early as possible):
	IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_PNP,
		("p_pnp_rec->pnp_event = 0x%x (%s) object state %s\n",
		p_pnp_rec->pnp_event,
		ib_get_pnp_event_str( p_pnp_rec->pnp_event ),
		ib_get_pnp_event_str( p_adapter->state )) );

On exit from this function, please print the same line again (I want
to see the state changes).
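
For example (the placement here is illustrative; the real function has
several exit paths, and each one should get the print, and how
p_adapter is obtained is also just a sketch):

	static ib_api_status_t
	__ipoib_pnp_cb(
		IN		ib_pnp_rec_t			*p_pnp_rec )
	{
		ipoib_adapter_t		*p_adapter;
		ib_api_status_t		status = IB_SUCCESS;

		p_adapter = (ipoib_adapter_t*)p_pnp_rec->pnp_context;

		/* Entry: event and state before handling. */
		IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_PNP,
			("p_pnp_rec->pnp_event = 0x%x (%s) object state %s\n",
			p_pnp_rec->pnp_event,
			ib_get_pnp_event_str( p_pnp_rec->pnp_event ),
			ib_get_pnp_event_str( p_adapter->state )) );

		/* ... existing event handling, which may change the
		 * adapter state ... */

		/* Exit: the same line, so state changes are visible. */
		IPOIB_PRINT( TRACE_LEVEL_ERROR, IPOIB_DBG_PNP,
			("p_pnp_rec->pnp_event = 0x%x (%s) object state %s\n",
			p_pnp_rec->pnp_event,
			ib_get_pnp_event_str( p_pnp_rec->pnp_event ),
			ib_get_pnp_event_str( p_adapter->state )) );

		return status;
	}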

Please also add prints in ipoib_port_up and ipoib_port_down.

Then send us the log. I hope that I'll be able to figure out what is
going on there.

And again, a simple repro would help (probably even more).

Thanks
Tzachi



> -----Original Message-----
> From: Smith, Stan [mailto:stan.smith at intel.com] 
> Sent: Monday, October 27, 2008 10:14 PM
> To: Smith, Stan; Tzachi Dar; Leonid Keller; Fab Tillier
> Cc: Ishai Rabinovitz; ofw at lists.openfabrics.org
> Subject: RE: HPC head-node slow down when OpenSM is started 
> on the head-node (RC4,svn.1691).
> 
> Hello,
>  After further debug operations with source mods, I see the
> offending call to ipoib_port_up() with a valid p_port pointer.
> Where I see the failure is in ipoib_port.c @ ~line 1584:
> 
>         /* Wait for all work requests to get flushed. */
>         while( p_port->recv_mgr.depth || p_port->send_mgr.depth )
>                 cl_thread_suspend( 0 );
> 
> recv_mgr.depth == 512, send_mgr.depth == 0
> 
> Thoughts on why recv_mgr.depth would be so high?
> What is preventing the recv mgr from processing work requests?
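> 
> (A rough model of the accounting, assuming the field names from
> ipoib_port.h; this is illustrative, not the actual source. The depth
> is incremented when a receive WR is posted and decremented in the
> receive completion callback, so a depth stuck at the full ring size
> of 512 means no completions, not even flushed ones, ever came back.)
> 
>	/* Rough model of the receive depth accounting. */
>	static void
>	__post_recv_sketch(
>		IN		ipoib_port_t* const		p_port )
>	{
>		/* Posting a receive WR bumps the depth... */
>		cl_atomic_inc( &p_port->recv_mgr.depth );
>	}
> 
>	static void
>	__recv_cb_sketch(
>		IN		ipoib_port_t* const		p_port )
>	{
>		/* ...and each completion, including the flushed ones
>		 * generated once the QP enters the error state,
>		 * decrements it.  If this callback never runs, the
>		 * wait loop above spins forever. */
>		cl_atomic_dec( &p_port->recv_mgr.depth );
>	}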
> 
> 
> Bus_pnp.c
>   ExAcquireFastMutexUnsafe() --> ExAcquireFastMutex() + ExRelease....
> 
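> (Sketch of the pattern change; fdo_mutex is a placeholder name, not
> the actual variable in Bus_pnp.c:)
> 
>	FAST_MUTEX	fdo_mutex;
> 
>	/* Before: the Unsafe variants assume the caller is already
>	 * inside a critical region / at APC_LEVEL. */
>	ExAcquireFastMutexUnsafe( &fdo_mutex );
>	/* ...protected work... */
>	ExReleaseFastMutexUnsafe( &fdo_mutex );
> 
>	/* After: the plain variants raise to APC_LEVEL themselves, so
>	 * they are also safe from PASSIVE_LEVEL callers. */
>	ExAcquireFastMutex( &fdo_mutex );
>	/* ...protected work... */
>	ExReleaseFastMutex( &fdo_mutex );
> 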
> Ipoib_port.c: add_local locking added
> 
> --- ipoib_port.c        2008-10-27 10:18:44.882358400 -0700
> +++ ipoib_port.c.new    2008-10-27 13:06:11.021042300 -0700
> @@ -5303,9 +5303,7 @@
>         }
> 
>         /* __endpt_mgr_insert expects *one* reference to be held. */
> -       cl_atomic_inc( &p_port->endpt_rdr );
> -       status = __endpt_mgr_insert( p_port, p_port->p_adapter->params.conf_mac, p_endpt );
> -       cl_atomic_dec( &p_port->endpt_rdr );
> +       status = __endpt_mgr_insert_locked( p_port, p_port->p_adapter->params.conf_mac, p_endpt );
>         if( status != IB_SUCCESS )
>         {
>                 IPOIB_PRINT_EXIT( TRACE_LEVEL_ERROR, IPOIB_DBG_ERROR,
> 
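> A hypothetical shape for the helper (the actual implementation is
> not shown in this mail; whether it takes the port lock, the reader
> reference, or both is an assumption):
> 
>	/* Hypothetical: take the port lock and the endpoint reader
>	 * reference inside the helper so callers cannot forget them. */
>	static ib_api_status_t
>	__endpt_mgr_insert_locked(
>		IN		ipoib_port_t* const		p_port,
>		IN		const mac_addr_t		mac,
>		IN		ipoib_endpt_t* const	p_endpt )
>	{
>		ib_api_status_t		status;
> 
>		cl_obj_lock( &p_port->obj );
>		cl_atomic_inc( &p_port->endpt_rdr );
>		status = __endpt_mgr_insert( p_port, mac, p_endpt );
>		cl_atomic_dec( &p_port->endpt_rdr );
>		cl_obj_unlock( &p_port->obj );
>		return status;
>	}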
> 
> Stan.
> 
> Smith, Stan wrote:
> > Hello All,
> >   Below are the results of snooping around with the debugger
> > connected to the head-node, which is operating in the OpenSM
> > induced slow-down mode.
> >
> > An interesting item is the call to ipoib_port_up() with
> > p_port == 0, which looks to be a problem; clobbered stack?
> > The captured windbg story is attached.
> >
> > Possibly a result of not holding the port lock from __bcast_cb()?
> >
> > Please advise on further debug.
> >
> > Stan.
> >
> > nt!DbgBreakPointWithStatus
> > nt!wctomb+0x4cbf
> > nt!KeUpdateSystemTime+0x21f (TrapFrame @ fffffa60`022e9840)
> > nt!KeReleaseInStackQueuedSpinLock+0x2d
> > nt!KeDelayExecutionThread+0x72c
> > ipoib!ipoib_port_up(struct _ipoib_port * p_port = 0x00000000`00000000, struct _ib_pnp_port_rec * p_pnp_rec = 0xfffffa60`024be780)+0x79 [f:\openib-windows-svn\wof2-0\rc4\trunk\ulp\ipoib\kernel\ipoib_port.c @ 5186]
> > ipoib!__ipoib_pnp_cb(struct _ib_pnp_rec * p_pnp_rec = 0xfffffa60`024be780)+0x20d [f:\openib-windows-svn\wof2-0\rc4\trunk\ulp\ipoib\kernel\ipoib_adapter.c @ 678]
> > ibbus!__pnp_notify_user(struct _al_pnp * p_reg = 0xfffffa80`05262d90, struct _al_pnp_context * p_context = 0xfffffa60`024be110, struct _al_pnp_ca_event * p_event_rec = 0xfffffa80`08b65108)+0x17b [f:\openib-windows-svn\wof2-0\rc4\trunk\core\al\kernel\al_pnp.c @ 557]
> > ibbus!__pnp_process_port_forward(struct _al_pnp_ca_event * p_event_rec = 0x00000000`00000000)+0xa6 [f:\openib-windows-svn\wof2-0\rc4\trunk\core\al\kernel\al_pnp.c @ 1279]
> > ibbus!__pnp_check_ports(struct _al_ci_ca * p_ci_ca = 0xfffffa80`04bcc8c0, struct _ib_ca_attr * p_old_ca_attr = 0x00000000`00000001)+0x14d [f:\openib-windows-svn\wof2-0\rc4\trunk\core\al\kernel\al_pnp.c @ 1371]
> > ibbus!__pnp_check_events(struct _cl_async_proc_item * p_item = 0xfffffa80`04bc3e98)+0x171 [f:\openib-windows-svn\wof2-0\rc4\trunk\core\al\kernel\al_pnp.c @ 1566]
> > ibbus!__cl_async_proc_worker(void * context = 0xfffffa80`04bc3d60)+0x61 [f:\openib-windows-svn\wof2-0\rc4\trunk\core\complib\cl_async_proc.c @ 153]
> > ibbus!__cl_thread_pool_routine(void * context = 0xfffffa80`04bc3860)+0x41 [f:\openib-windows-svn\wof2-0\rc4\trunk\core\complib\cl_threadpool.c @ 66]
> > ibbus!__thread_callback(struct _cl_thread * p_thread = 0x00380031`00430032)+0x28 [f:\openib-windows-svn\wof2-0\rc4\trunk\core\complib\kernel\cl_thread.c @ 49]
> > nt!ProbeForRead+0xbd3
> > nt!_misaligned_access+0x4f6
> >
> >
> >
> >
> > Smith, Stan wrote:
> >> Gentlemen,
> >>   The HPC head-node slow down is back with a vengeance..... instead
> >> of only 24% of available processor cycles we are now up to 31%.
> >> Needless to say, the system is unusable. Along the path of
> >> attaching a debugger I've learned the slow-down is only triggered
> >> by OpenSM, not sustained by it: I was able to shut down OpenSM and
> >> the slow-down persisted.
> >>
> >> The story...
> >>
> >> A functional 15 node HPC system without OpenSM on the head-node;
> >> the SM is supplied by a SilverStorm switch with an embedded SM.
> >> On the head-node, run Server Manager, changing OpenSM startup from
> >> 'Disabled' to 'Manual'. Disconnect the SilverStorm IB switch, as it
> >> is daisy-chained to the Mellanox switch which connects all HPC
> >> nodes; at this point no SM is running on the fabric. From the
> >> head-node Server Manager, 'Start' OpenSM.
> >> Wait 20 seconds or so, pop open the Task Manager Performance view;
> >> notice the large % of CPU utilization.
> >> Once the system starts running slow, from the head-node Server
> >> Manager, 'Stop' OpenSM. CPU utilization is still high.
> >> Reconnect the SilverStorm switch + SM.
> >> CPU utilization is still high?
> >> Going to the head-node debugger, breaking in and showing processes
> >> and threads revealed little useful info.
> >> Debugger command suggestions?
> >> Will try a checked version of ipoib.sys tomorrow.
> >>
> >> Stan.
> >>
> >> BTW, I did see a shutdown BSOD with a minidump that showed
> >> ipoib!__cl_asynch_processor( 0 ) being the faulting call.
> >> Dereferencing the *context is what caused the BSOD.
> >>
> >>
> >> Smith, Stan wrote:
> >>> Hello,
> >>>
> >>> The good news is OpenSM is working nicely on all WinOS flavors.
> >>> The not so good news is OpenSM on the head-node of HPC consumes
> >>> 25% of the system; win2k8 works fine running OpenSM.
> >>>
> >>> On our 15 node HPC cluster, if pre_RC4 OpenSM is started during a
> >>> WinOF install, or if OpenSM is started on the head-node after the
> >>> WinOF install (OpenSM not started during the install), then
> >>> right-clicking the task-bar network icon and selecting Network
> >>> and Sharing Center fails to reach the Network and Sharing manager.
> >>> The best we see is the NSM GUI window popping open and remaining
> >>> blank (white). The rest of the system is functional, in that
> >>> command windows are OK and the start menu is OK, but you are
> >>> certain to hang a window if you access the network via a GUI
> >>> interface. A command window can set the IPoIB IPv4 address via
> >>> net set address, and ipconfig works?
> >>> <Ctrl-Alt-Del> -> Resource Manager shows about 25% of the system
> >>> (4 cores) is running the NT kernel, followed by network services.
> >>> I'm guessing massive amounts of system calls from a driver?
> >>>
> >>> We first started noticing similar behavior with RC2. Starting
> >>> OpenSM during an install always failed (caused a system
> >>> slow-down), although if you started OpenSM after the install, the
> >>> head-node was OK.
> >>> RC3 behaved likewise.
> >>> With pre_RC4 (svn.1661) the head-node now slows down when OpenSM
> >>> is started after the install or if OpenSM is started during the
> >>> WinOF install.
> >>>
> >>> Again, all other WinOS flavors work fine with OpenSM started
> >>> during the install or afterwards. HPC works fine with the
> >>> SilverStorm embedded-SM switch. I strongly suspect the HPC
> >>> head-node would work fine if OpenSM were run from another
> >>> Windows/Linux system.
> >>>
> >>> Thoughts or suggestions on further diagnosis as to why running
> >>> OpenSM causes the HPC head-node such a slow down? Part of the
> >>> story may have something to do with the number of HPC compute
> >>> nodes.
> >>>
> >>> Any chance you could run OpenSM on your HPC head node to see if
> >>> you see similar behavior?
> >>>
> >>> Thanks,
> >>>
> >>> Stan.
> 
> 


