[ofa-general] Re: [PATCH 0/3] ib/ipoib: Enable IPoIB-UD 4K MTU support

Or Gerlitz ogerlitz at voltaire.com
Wed Jan 30 23:32:28 PST 2008


Shirley Ma wrote:
> The node IPoIB link MTU size is the minimum value of admin configurable
> MTU through ifconfig and IPoIB default broadcast group MTU size. When
> Subnet Manager enables default broadcast group during start up, this
> subnet IPoIB link MTU will be the value of default broadcast group MTU
> size. For any node IB MTU smaller than this value, the node can't join
> this IPoIB subnet. For any node IB MTU is greater than this value, the
> node will join this IPoIB subnet and this value will be set as its IPOIB
> link MTU. If Subnet Manager disables default broadcast group during
> start up, the first bring up node in this subnet will create the default
> IPoIB broadcast group based on the negotiation with the Subnet Manager,
> the default is currently set as 2K according to IPoIB RFC.

Hi Shirley,

Just to make sure, can you confirm that this patch set is not dependent 
on the below patch which is part of ofed but was never submitted to the 
upstream ipoib driver for inclusion?

Also, can you share with what SM have you checked this, did you had to 
patch or run it with non-default param, more, what was the 
configuration, specifically what switch was used and any instrumentation 
you have made to the switch FW, thanks.

Or

> IB/ipoib: user appropriate mtu selector for path queries
> 
> IPoIB must set mtu selector in path record query according to dev->mtu:
> if we wildcard it, SM can select a path with lower MTU.
> This breaks IPoIB on networks with SM Tavor quirk activates.
> 
> We can always require this, since IPoIB spec includes the following statement:
>     The value (for IB MTU) assigned to the broadcast-GID must not
>     be greater than any physical link MTU spanned by the IPoIB
>     subnet.
> 
> Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>
> 
> ---
> 
> Note the following uses IB_SA_GT so it should be applied on top of SA
> enum rename.
> 
> Index: ofed_1_1/drivers/infiniband/ulp/ipoib/ipoib_main.c
> ===================================================================
> --- ofed_1_1.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ ofed_1_1/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -182,6 +182,8 @@ static int ipoib_change_mtu(struct net_d
>  
>  	dev->mtu = min(priv->mcast_mtu, priv->admin_mtu);
>  
> +	queue_work(ipoib_workqueue, &priv->flush_task);
> +
>  	return 0;
>  }
>  
> @@ -452,15 +454,39 @@ static int path_rec_start(struct net_dev
>  			  struct ipoib_path *path)
>  {
>  	struct ipoib_dev_priv *priv = netdev_priv(dev);
> +	ib_sa_comp_mask comp_mask = IB_SA_PATH_REC_MTU_SELECTOR | IB_SA_PATH_REC_MTU;
> +
> +	path->pathrec.mtu_selector = IB_SA_GT;
>  
> -	ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT "\n",
> -		  IPOIB_GID_ARG(path->pathrec.dgid));
> +	switch (roundup_pow_of_two(dev->mtu + IPOIB_ENCAP_LEN)) {
> +	case 512:
> +		path->pathrec.mtu = IB_MTU_256;
> +		break;
> +	case 1024:
> +		path->pathrec.mtu = IB_MTU_512;
> +		break;
> +	case 2048:
> +		path->pathrec.mtu = IB_MTU_1024;
> +		break;
> +	case 4096:
> +		path->pathrec.mtu = IB_MTU_2048;
> +		break;
> +	default:
> +		/* Wildcard everything */
> +		comp_mask = 0;
> +		path->pathrec.mtu = 0;
> +		path->pathrec.mtu_selector = 0;
> +	}
> +
> +	ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT " MTU > %d\n",
> +		  IPOIB_GID_ARG(path->pathrec.dgid),
> +		  comp_mask ? ib_mtu_enum_to_int(path->pathrec.mtu) : 0);
>  
>  	init_completion(&path->done);
>  
>  	path->query_id =
>  		ib_sa_path_rec_get(&ipoib_sa_client, priv->ca, priv->port,
> -				   &path->pathrec,
> +				   &path->pathrec, comp_mask    |
>  				   IB_SA_PATH_REC_DGID		|
>  				   IB_SA_PATH_REC_SGID		|
>  				   IB_SA_PATH_REC_NUMB_PATH	|




More information about the general mailing list