[openib-general] RE: [PATCH] OpenSM: Extend default transaction timeout from 100 msec to 1 second

Hal Rosenstock halr at voltaire.com
Wed Jan 4 05:25:07 PST 2006


Hi Eitan,

On Tue, 2005-12-20 at 16:27, Eitan Zahavi wrote:
> Hi Hal,
> 
> The effect is basically a slowdown in case of non responding or lost
> packets.
> With 1sec timeout - up to 4sec per lost transaction are added to the SM
> bringup time.
> 
> In many clusters I have seen, 100msec was enough - but I guess you
> have actually seen such failures.

I see that the timeout is set to 200 msec (and maxsmps 0) in the
Mellanox OpenSM configuration. Do you have a problem with increasing the
default from 100 to 200 msec (and also changing default maxsmps to 0)?
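
To put rough numbers on the tradeoff, here is a minimal sketch of the
worst-case delay one lost transaction adds to bringup. The helper name is
hypothetical and not part of OpenSM. The attempt count of 4 is an
assumption, inferred only from the "up to 4sec per lost transaction"
figure quoted below for a 1 sec timeout.

```c
/* Hypothetical helper, not part of OpenSM: worst-case delay added by one
 * lost transaction, assuming every attempt waits out the full timeout
 * before the transaction is given up. */
static unsigned worst_case_delay_msec(unsigned timeout_msec, unsigned attempts)
{
    return timeout_msec * attempts;
}

/* Assumed attempt count, inferred from the 1 sec -> 4 sec figure in the
 * quoted mail; the real value may differ. */
#define ASSUMED_ATTEMPTS 4
```

Under that assumption, worst_case_delay_msec(100, ASSUMED_ATTEMPTS) is
400 msec, 200 msec gives 800 msec, and 1000 msec gives the 4000 msec
mentioned in the quoted mail.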

-- Hal

> Eitan Zahavi
> Design Technology Director
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
> 
> 
> > -----Original Message-----
> > From: Hal Rosenstock [mailto:halr at voltaire.com]
> > Sent: Tuesday, December 20, 2005 3:38 PM
> > To: Yael Kalka; Eitan Zahavi
> > Cc: openib-general at openib.org
> > Subject: [PATCH] OpenSM: Extend default transaction timeout from 100 msec to 1 second
> > 
> > OpenSM: Extend default transaction timeout from 100 msec to 1 second.
> > 
> > With the advent of long distance IB and software SMAs, 100 msec is no
> > longer adequate as a default transaction timeout. Increase this to 1
> > second so that the default is sufficient in most common cases.
> > 
> > Signed-off-by: Hal Rosenstock <halr at voltaire.com>
> > 
> > Index: include/opensm/osm_base.h
> > ===================================================================
> > --- include/opensm/osm_base.h	(revision 4549)
> > +++ include/opensm/osm_base.h	(working copy)
> > @@ -246,7 +246,7 @@ BEGIN_C_DECLS
> >  *
> >  * SYNOPSIS
> >  */
> > -#define OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC 100
> > +#define OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC 1000
> >  /***********/
> > 
> >  /****d* OpenSM: Base/OSM_DEFAULT_SUBNET_TIMEOUT
> > Index: opensm/main.c
> > ===================================================================
> > --- opensm/main.c	(revision 4549)
> > +++ opensm/main.c	(working copy)
> > @@ -153,7 +153,7 @@ show_usage(void)
> >            "          used for transaction timeouts.\n"
> >            "          Specifying -t 0 disables timeouts.\n"
> >            "          Without -t, OpenSM defaults to a timeout value of\n"
> > -          "          100 milliseconds.\n\n" );
> > +          "          1 second (1000 milliseconds).\n\n" );
> >    printf( "-maxsmps <number>\n"
> >            "          This option specifies the number of VL15 SMP MADs\n"
> >            "          allowed on the wire at any one time.\n"
> 



