[ofa-general] Re: [PATCH] infiniband-diags/src/ibnetdiscover.c: Fix the PortInfo data on the port we discover a switch on.

Hal Rosenstock hrosenstock at xsigo.com
Fri Jun 6 04:13:41 PDT 2008


Hi Ira,

On Thu, 2008-06-05 at 19:12 -0700, Ira Weiny wrote:
> I noticed over the last couple days that ibnetdiscover would print the
> incorrect speed on the port I would run ibnetdiscover from.  For example:
> 
> Switch  24 "S-0008f10400411f56"         # "SW1 wopr ISR9024D (MLX4 FW)" base port 0 lid 11 lmc 0
> ...
> [13]    "H-0002c90200219e64"[1](2c90200219e65)          # "wopri" lid 32 4xSDR
> 
>                                                                            ^^^
>                (Note from the switch side of things it thinks the speed is SDR.)
> ...
> Ca      2 "H-0002c90200219e64"          # "wopri"
> [1](2c90200219e65)      "S-0008f10400411f56"[13]                # lid 32 lmc 0 "SW1 wopr ISR9024D (MLX4 FW)" lid 11 4xDDR
> 
>                                                                                                                      ^^^
>                                                                                              (but here DDR.)

Yes, that is clearly wrong. Good find.

> It turns out that when you first discover a switch the port object created gets
> the PortInfo of Port "0".

Right; this trick works for peer CA and router ports but not switch
ports where the actual port number is needed.

> This patch queries again for the PortInfo of the port we "came in on".
> 
> Ira
> 
> 
> From 5bc66af276c7baabd4d66be9df0379271cb625b4 Mon Sep 17 00:00:00 2001
> From: Ira K. Weiny <weiny2 at llnl.gov>
> Date: Thu, 5 Jun 2008 19:03:44 -0700
> Subject: [PATCH] infiniband-diags/src/ibnetdiscover.c: Fix the PortInfo data on the port we discover a switch on.
> 
>    Previously the portinfo data for that port object would be the PortInfo of
>    port "0" of the switch.  Specifically this would cause the speed of the link
>    to be printed incorrectly.

Did you try back to back CAs too ?

Nit: this looks like 2 patches to me:
1. Factor out redundant code by adding decode_port_info routine
2. Fix link speed of local CA in switch display

> Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
> ---
>  infiniband-diags/src/ibnetdiscover.c |   33 +++++++++++++++++++--------------
>  1 files changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c
> index b8d4e92..20da1ea 100644
> --- a/infiniband-diags/src/ibnetdiscover.c
> +++ b/infiniband-diags/src/ibnetdiscover.c
> @@ -129,6 +129,18 @@ node_type_str2(Node *node)
>  	return "??";
>  }
>  
> +void
> +decode_port_info(void *pi, Port *port)
> +{
> +	mad_decode_field(pi, IB_PORT_LID_F, &port->lid);
> +	mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc);
> +	mad_decode_field(pi, IB_PORT_STATE_F, &port->state);
> +	mad_decode_field(pi, IB_PORT_PHYS_STATE_F, &port->physstate);
> +	mad_decode_field(pi, IB_PORT_LINK_WIDTH_ACTIVE_F, &port->linkwidth);
> +	mad_decode_field(pi, IB_PORT_LINK_SPEED_ACTIVE_F, &port->linkspeed);
> +}
> +
> +
>  int
>  get_port(Port *port, int portnum, ib_portid_t *portid)
>  {
> @@ -139,13 +151,7 @@ get_port(Port *port, int portnum, ib_portid_t *portid)
>  
>  	if (!smp_query(pi, portid, IB_ATTR_PORT_INFO, portnum, timeout))
>  		return -1;
> -
> -	mad_decode_field(pi, IB_PORT_LID_F, &port->lid);
> -	mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc);
> -	mad_decode_field(pi, IB_PORT_STATE_F, &port->state);
> -	mad_decode_field(pi, IB_PORT_PHYS_STATE_F, &port->physstate);
> -	mad_decode_field(pi, IB_PORT_LINK_WIDTH_ACTIVE_F, &port->linkwidth);
> -	mad_decode_field(pi, IB_PORT_LINK_SPEED_ACTIVE_F, &port->linkspeed);
> +	decode_port_info(pi, port);
>  
>  	DEBUG("portid %s portnum %d: lid %d state %d physstate %d %s %s",
>  		portid2str(portid), portnum, port->lid, port->state, port->physstate, get_linkwidth_str(port->linkwidth), get_linkspeed_str(port->linkspeed));
> @@ -181,13 +187,7 @@ get_node(Node *node, Port *port, ib_portid_t *portid)
>  
>  	if (!smp_query(pi, portid, IB_ATTR_PORT_INFO, 0, timeout))
>  		return -1;
> -
> -	mad_decode_field(pi, IB_PORT_LID_F, &port->lid);
> -	mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc);
> -	mad_decode_field(pi, IB_PORT_STATE_F, &port->state);
> -	mad_decode_field(pi, IB_PORT_PHYS_STATE_F, &port->physstate);
> -	mad_decode_field(pi, IB_PORT_LINK_WIDTH_ACTIVE_F, &port->linkwidth);
> -	mad_decode_field(pi, IB_PORT_LINK_SPEED_ACTIVE_F, &port->linkspeed);
> +	decode_port_info(pi, port);
>  
>  	if (node->type != SWITCH_NODE)
>  		return 0;
> @@ -195,6 +195,11 @@ get_node(Node *node, Port *port, ib_portid_t *portid)
>  	node->smalid = port->lid;
>  	node->smalmc = port->lmc;
>  
> +	/* after we have the sma information find out the real PortInfo for this port */
> +	if (!smp_query(pi, portid, IB_ATTR_PORT_INFO, node->localport, timeout))
> +	        return -1;
> +	decode_port_info(pi, port);
> +

Seems like this may only be needed when peer is switch but maybe it's
not worth saving those queries (I didn't look to see if the peer is
known here).

-- Hal

>          if (!smp_query(si, portid, IB_ATTR_SWITCH_INFO, 0, timeout))
>                  node->smaenhsp0 = 0;	/* assume base SP0 */
>  	else




More information about the general mailing list