[openib-general] RE: [PATCH] OpenSM: Extend default transaction timeout from 100msec to 1 second

Eitan Zahavi eitan at mellanox.co.il
Wed Jan 4 06:04:31 PST 2006


Hi Hal

Regarding the timeout, 200 msec is fine with me.
Regarding maxsmps, I think it is better to have 1 SMP on the wire at a time.

Eitan Zahavi
Design Technology Director
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


> -----Original Message-----
> From: Hal Rosenstock [mailto:halr at voltaire.com]
> Sent: Wednesday, January 04, 2006 3:25 PM
> To: Eitan Zahavi
> Cc: Yael Kalka; openib-general at openib.org
> Subject: RE: [PATCH] OpenSM: Extend default transaction timeout from 100msec to 1 second
> 
> Hi Eitan,
> 
> On Tue, 2005-12-20 at 16:27, Eitan Zahavi wrote:
> > Hi Hal,
> >
> > The effect is basically a slowdown when packets are lost or go
> > unanswered.
> > With a 1 sec timeout, up to 4 sec per lost transaction is added to the
> > SM bringup time.
> >
> > In many clusters I have seen, 100 msec was enough, but I guess you
> > have actually seen such failures.
> 
> I see that the timeout is set to 200 msec (and maxsmps 0) in the
> Mellanox OpenSM configuration. Do you have a problem with increasing the
> default from 100 to 200 msec (and also changing default maxsmps to 0)?
> 
> -- Hal
> 
> > Eitan Zahavi
> > Design Technology Director
> > Mellanox Technologies LTD
> > Tel:+972-4-9097208
> > Fax:+972-4-9593245
> > P.O. Box 586 Yokneam 20692 ISRAEL
> >
> >
> > > -----Original Message-----
> > > From: Hal Rosenstock [mailto:halr at voltaire.com]
> > > Sent: Tuesday, December 20, 2005 3:38 PM
> > > To: Yael Kalka; Eitan Zahavi
> > > Cc: openib-general at openib.org
> > > Subject: [PATCH] OpenSM: Extend default transaction timeout from 100 msec to 1 second
> > >
> > > OpenSM: Extend default transaction timeout from 100 msec to 1 second.
> > >
> > > With the advent of long distance IB and software SMAs, 100 msec is no
> > > longer adequate as a default transaction timeout. Increase this to 1
> > > second so that the default is sufficient in most common cases.
> > >
> > > Signed-off-by: Hal Rosenstock <halr at voltaire.com>
> > >
> > > Index: include/opensm/osm_base.h
> > > ===================================================================
> > > --- include/opensm/osm_base.h	(revision 4549)
> > > +++ include/opensm/osm_base.h	(working copy)
> > > @@ -246,7 +246,7 @@ BEGIN_C_DECLS
> > >  *
> > >  * SYNOPSIS
> > >  */
> > > -#define OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC 100
> > > +#define OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC 1000
> > >  /***********/
> > >
> > >  /****d* OpenSM: Base/OSM_DEFAULT_SUBNET_TIMEOUT
> > > Index: opensm/main.c
> > > ===================================================================
> > > --- opensm/main.c	(revision 4549)
> > > +++ opensm/main.c	(working copy)
> > > @@ -153,7 +153,7 @@ show_usage(void)
> > >            "          used for transaction timeouts.\n"
> > >            "          Specifying -t 0 disables timeouts.\n"
> > >            "          Without -t, OpenSM defaults to a timeout value of\n"
> > > -          "          100 milliseconds.\n\n" );
> > > +          "          1 second (1000 milliseconds).\n\n" );
> > >    printf( "-maxsmps <number>\n"
> > >            "          This option specifies the number of VL15 SMP MADs\n"
> > >            "          allowed on the wire at any one time.\n"
> >
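
The "up to 4sec per lost transaction" figure quoted in the thread can be reproduced with a little arithmetic. The sketch below is only an illustration, not OpenSM code: the helper worst_case_delay_msec and the constant ASSUMED_ATTEMPTS_PER_TRANSACTION are hypothetical names, and the attempt count of 4 is an assumption inferred from the 1 sec / 4 sec numbers in the thread rather than taken from the OpenSM source. Only OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC comes from the patch.

```c
/* Value proposed by the patch for include/opensm/osm_base.h. */
#define OSM_DEFAULT_TRANS_TIMEOUT_MILLISEC 1000

/*
 * ASSUMPTION (not from the OpenSM source): a lost transaction is attempted
 * this many times before the SM gives up. 4 is chosen only because it
 * matches the "1sec timeout -> up to 4sec" figure quoted in the thread.
 */
#define ASSUMED_ATTEMPTS_PER_TRANSACTION 4

/*
 * Hypothetical helper: worst-case time, in milliseconds, that one lost
 * transaction adds to SM bringup when every attempt runs to its timeout.
 */
unsigned long worst_case_delay_msec(unsigned long timeout_msec,
                                    unsigned int attempts)
{
    return timeout_msec * attempts;
}
```

Under that assumption, the three timeouts discussed in the thread give 400 msec (100 msec timeout), 800 msec (200 msec timeout), and 4000 msec (1000 msec timeout) of added delay per lost transaction.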


