[ofw] [IPoIB_NDIS6_CM] [Patch 3/3] Improper IPoIB behavior during simultaneous sends on both sides

Alex Naslednikov xalex at mellanox.co.il
Thu Nov 11 03:51:53 PST 2010


Fab,
Thank you for the comments.
Please, SB

-----Original Message-----
From: Fab Tillier [mailto:ftillier at microsoft.com] 
Sent: Wednesday, November 10, 2010 7:20 PM
To: Alex Naslednikov; Hefty, Sean; ofw at lists.openfabrics.org
Subject: RE: [ofw] [IPoIB_NDIS6_CM] [Patch 3/3] Improper IPoIB behavior during simultaneous sends on both sides

Alex Naslednikov wrote on Wed, 10 Nov 2010 at 05:40:48

> Please, find the updated patch
> In addition to the community comments, it contains also additional
> verification code
> and new variable on the port structure to store maximum number of SG
> elements
> 
> Index: ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp
> ===================================================================
> --- ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(revision 2986)
> +++ ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp	(working copy)
> @@ -1110,10 +1110,23 @@
>  	qp_create.rq_sge = 2;	/* To support buffers spanning pages. */
>  	qp_create.h_rq_cq = p_port->ib_mgr.h_recv_cq;
>  	qp_create.sq_depth = p_port->p_adapter->params.sq_depth;
> -
> +
> +	// p_ca_attrs->max_sges contains the  maximum number of SG
> elements
> +	// available by HW. 3 of the them are reserved for an internal
> use

Internal use for UD transfer?  It would be helpful to document what they are used for.
[XaleX] Maybe it would be wise to refer to spec document, but it should be done as a separate patch.
If this comment is not acceptable, I can remove it

> +	// Thus, the maximum of SGE for UD QP is limited by (p_ca_attrs-
> >max_sges - 3)
>  #define UD_QP_USED_SGE 3
> -	qp_create.sq_sge = MAX_SEND_SGE < p_port->p_ca_attrs->max_sges ?
> -			MAX_SEND_SGE  : ( p_port->p_ca_attrs->max_sges -
> UD_QP_USED_SGE );
> +	if ( p_port->p_ca_attrs->max_sges > MAX_SEND_SGE )

Why not check against MAX_SEND_SGE + UD_QP_USED_SGE?  Or is MAX_SEND_SGE inclusive of the reserved SGE entries?
[XaleX] The second is correct

> +	{
> +		p_port->max_sq_sge_supported = MAX_SEND_SGE -
> UD_QP_USED_SGE;
> +	}
> +	else
> +	{
> +		p_port->max_sq_sge_supported = p_port->p_ca_attrs->max_sges
> - UD_QP_USED_SGE;

If you make the change above, you will be able to remove the '- UD_QP_USED_SGE' here.
[XaleX] I didn't

> +	}
> +
> +	p_port->p_ca_attrs->max_sges -= UD_QP_USED_SGE;
> +	qp_create.sq_sge = p_port->max_sq_sge_supported;
> +
>  	if ( !p_port->p_ca_attrs->ipoib_csum )
>  	{
>  		/* checksum is not supported by device
> @@ -1162,6 +1175,11 @@
>  			p_port->p_adapter->p_ifc->get_err_str( status )) );
>  		return status;
>  	}
> +	if ( qp_attr.sq_sge < qp_create.sq_sge)

The HCA driver is not allowed to reduce the number of SGEs requested - it must fail the request if the number of SGEs requested exceeds its capabilities.  You can replace the if with ASSERT( qp_attr.sq_sge >= qp_create.sq_sge ) and that should be good enough.  This is not the place to validate that the HCA driver behaves properly.
[XaleX] This is correct, so I would add the following: 
ASSERT ( qp_attr.sq_sge >= qp_create.sq_sge );

> +	{
> +		ASSERT ( FALSE );

Any assertions should have meaningful expressions.  Seeing in the debugger "Assertion failed: FALSE" is not helpful.

> +		p_port->max_sq_sge_supported = qp_attr.sq_sge;
> +	}
>  	p_port->ib_mgr.qpn = qp_attr.num;
> 
>  	/* Register all of physical memory */
> @@ -3799,7 +3817,7 @@
>  	}
> 
>  	/* Remember that one of the DS entries is reserved for the IPoIB
> header. */
> -	if( num_pages >= MAX_SEND_SGE )
> +	if( num_pages >=  p_port->max_sq_sge_supported )
>  	{
>  		IPOIB_PRINT(TRACE_LEVEL_INFORMATION, IPOIB_DBG_SEND,
>  			("Too many buffers to fit in WR ds_array.  Copying
> data.\n") );
> @@ -4270,13 +4288,14 @@
>  	   Copy only N+1-MAX_SEND_SGE, where N is a lenght of SG List
>  	   Remember that one of the DS entries is reserved for the IPoIB
> header.
>  	*/
> -	if( ( p_sgl->NumberOfElements >= MAX_SEND_SGE ||
> +	if( ( p_sgl->NumberOfElements >=  s_buf->p_port->max_sq_sge_supported ||
>  		p_sgl->Elements[0].Length < sizeof(eth_hdr_t)) )
>  	{
>  		IPOIB_PRINT( TRACE_LEVEL_WARNING, IPOIB_DBG_SEND,
>  			("Too many buffers %d to fit in WR ds_array[%d] \
>  			 Or buffer[0] length %d < Eth header. Copying
> data.\n",
> -			p_sgl->NumberOfElements, MAX_SEND_SGE, p_sgl-
> >Elements[0].Length ) );
> +			p_sgl->NumberOfElements,  s_buf->p_port-
> >max_sq_sge_supported ,
> +			p_sgl->Elements[0].Length ) );
> 
>  		if( !s_buf->p_port->p_adapter->params.cm_enabled )
>  		{
> Index: ulp/ipoib_NDIS6_CM/kernel/ipoib_port.h
> ===================================================================
> --- ulp/ipoib_NDIS6_CM/kernel/ipoib_port.h	(revision 2982)
> +++ ulp/ipoib_NDIS6_CM/kernel/ipoib_port.h	(working copy)
> @@ -590,6 +590,7 @@
> 
>  	struct _ipoib_adapter	*p_adapter;
>  	uint8_t					port_num;
> +	uint32_t				max_sq_sge_supported;
> 
>  	KEVENT					sa_event;
> 
> @@ -653,6 +654,9 @@
>  *	port_num
>  *		Port number of this adapter.
>  *
> +*	max_sq_sge_supported
> +*		Maximum number of send QP SG elements supported by HW

This isn't quite right - the HW could support more than this, as the maximum here is MAX_SEND_SGE, no?
It's the maximum number of SGEs that will be used on this instance.
[XaleX] Please, note that MAX_SEND_SGE is an upper limit but not a lower limit, cause if query_device_attr will return the lower number, max_sq_sge_supported will be less than MAX_SEND_SGE.
Thus, the description may change to:
"The actual number of SGEs that will be used for UD QP"


> +*
>  *	ib_mgr
>  *		IB resource manager.
>  *
> 
> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Tuesday, November 09, 2010 8:33 PM
> To: Fab Tillier; Alex Naslednikov; ofw at lists.openfabrics.org
> Subject: RE: [ofw] [IPoIB_NDIS6_CM] [Patch 3/3] Improper IPoIB behavior
> during simultaneous sends on both sides
> 
> > > -	qp_create.sq_sge = MAX_SEND_SGE < p_port->p_ca_attrs->max_sges ?
> > > -			MAX_SEND_SGE  : ( p_port->p_ca_attrs->max_sges -
> > > UD_QP_USED_SGE );
> > > +	if ( MAX_SEND_SGE < p_port->p_ca_attrs->max_sges )
> >
> > I find this notation very hard to read: "CONSTANT <operator>
> variable".  I
> > *much* prefer variable <operator> CONSTANT.  The argument that it
> prevents
> > errors is weak, at best, since compilers will generate warnings for
> > assignment within conditionals.  When looking at this code, I have to
> think
> > "if p_port->p_ca_attrs->max_sges > MAX_SEND_SGE" to be able to reason
> the
> > code flow.
> 
> Agree with this I do.  Backwards makes the logic it does.



More information about the ofw mailing list