[ofa-general] RE: [PATCH] osm: default leaf vl_stall and head_of_queue_lifecounters

Amit Krig amitk at mellanox.co.il
Tue Aug 7 00:24:09 PDT 2007


Hi Sasha

In the current switch configuration, a single drop (by HOQ) causes the
line to be flushed for 128 milliseconds. I think this penalty for a
~8 millisecond hiccup (the HOQ can count ~50% less than configured) is
not a good default.

Amit
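[Worked arithmetic for the figures above. This is a minimal sketch, not
code from OpenSM. It assumes the HoQLife encoding of 4.096 usec * 2^value,
and the 8*(head of queue lifetime) stall duration quoted in the
osm_base.h comment in the patch below. The function names are
hypothetical.]

```c
#include <stdint.h>

/* Assumption: HoQLife value v encodes 4.096 usec * 2^v (4096 ns << v). */
uint64_t leaf_hoq_ns(unsigned hoq_life)
{
	return (uint64_t)4096 << hoq_life;
}

/* Stall duration per the osm_base.h comment: 8 * (head of queue lifetime). */
uint64_t leaf_stall_ns(unsigned hoq_life)
{
	return 8 * leaf_hoq_ns(hoq_life);
}

/* Shortest hiccup that can trigger a drop, if the HOQ timer may expire
 * at ~50% of the configured lifetime (as stated in the message above). */
uint64_t min_hiccup_ns(unsigned hoq_life)
{
	return leaf_hoq_ns(hoq_life) / 2;
}
```

With the current leaf default of 0xC, leaf_stall_ns(0xC) is 134217728 ns
(about 134 ms, the "128 milliseconds" above when rounding the lifetime
to 16 ms) and min_hiccup_ns(0xC) is 8388608 ns (about 8 ms).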



-----Original Message-----
From: Sasha Khapyorsky [mailto:sashak at voltaire.com] 
Sent: Tuesday, August 07, 2007 3:13 AM
To: kliteyn at dev.mellanox.co.il
Cc: OpenIB; Amit Krig
Subject: Re: [PATCH] osm: default leaf vl_stall and head_of_queue_lifecounters

Hi Yevgeny,

On Tue, 2007-08-07 at 00:05 +0300, Yevgeny Kliteynik wrote:
> Hi Sasha
> 
> This patch changes the OSM_DEFAULT_LEAF_VL_STALL_COUNT and 
> OSM_DEFAULT_LEAF_HEAD_OF_QUEUE_LIFE (matching them to the default 
> counters on the switch-to-switch ports),

But why should this be equal to the switch-to-switch port values?

BTW HoQLife=0x12 (18) is >1 sec. Isn't it _huge_ for HoQLife?
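[For reference, a minimal sketch of how a HoQLife value maps to a time.
It assumes the encoding 4.096 usec * 2^value, with values above 19
meaning no limit; the function name is hypothetical and this is not
code from OpenSM.]

```c
#include <stdint.h>

/*
 * Decode a HoQLife value into nanoseconds.
 * Assumption: lifetime = 4.096 usec * 2^value, and values above 19
 * disable the limit (reported here as 0).
 */
uint64_t hoq_life_to_ns(uint8_t hoq_life)
{
	if (hoq_life > 19)
		return 0; /* no limit */
	return (uint64_t)4096 << hoq_life; /* 4.096 usec == 4096 ns */
}
```

Under that assumption, hoq_life_to_ns(0x12) gives 1073741824 ns (about
1.07 sec), and the previous leaf default 0xC gives 16777216 ns (about
16.8 msec).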

>  in order
> to deal with the casual PCI Express "hiccups".

Could you provide more details about the problem?


Sasha

> Please apply it to ofed_1_2 and master.
> 
> --
> Yevgeny
> 
> Signed-off-by: Yevgeny Kliteynik <kliteyn at dev.mellanox.co.il>
> ---
>  osm/include/opensm/osm_base.h |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/osm/include/opensm/osm_base.h b/osm/include/opensm/osm_base.h
> index b38b511..0cb4651 100644
> --- a/osm/include/opensm/osm_base.h
> +++ b/osm/include/opensm/osm_base.h
> @@ -311,11 +311,11 @@ BEGIN_C_DECLS
>  * DESCRIPTION
>  *	Sets the time a packet can live in the head of the VL Queue
>  *  of a port that drives a CA port.
> -*  We use here the value of ~130usec
> +*  We use here the value of ~1sec
>  *
>  * SYNOPSIS
>  */
> -#define OSM_DEFAULT_LEAF_HEAD_OF_QUEUE_LIFE 0xC
> +#define OSM_DEFAULT_LEAF_HEAD_OF_QUEUE_LIFE 0x12
>  /***********/
>  
>  /****d* OpenSM: Base/OSM_DEFAULT_VL_STALL_COUNT
> @@ -341,11 +341,10 @@ BEGIN_C_DECLS
>  *  puts the VL into stalled state. In stalled state, the port is supposed
>  *  to drop everything for 8*(head of queue lifetime). This value is for
>  *  switch ports driving a CA port.
> -*  We use the value of 1 here - so any drop due to HOQ means stalling the VL
>  *
>  * SYNOPSIS
>  */
> -#define OSM_DEFAULT_LEAF_VL_STALL_COUNT 0x1
> +#define OSM_DEFAULT_LEAF_VL_STALL_COUNT 0x7
>  /***********/
>  
>  /****d* OpenSM: Base/OSM_DEFAULT_TRAP_SUPRESSION_TIMEOUT


