[Openib-windows] WSD: Behavior when the other side has not yet called accept, is different from TCP/IP (Ethernet)

Tzachi Dar tzachid at mellanox.co.il
Mon Aug 7 12:09:42 PDT 2006


Hi Fab,

If we want to be as close as possible to Ethernet, then if no one is
listening we should return WSAECONNREFUSED. This is also the case if the
backlog is exceeded. There is one limitation to this, which I have tried
to describe in the bug, and I'll try to explain it now. On Ethernet, in
both cases, there is a retry (3 times). In practice this means that, for a
backlog of 5, one can start 10 clients simultaneously, and they will all
succeed. In the current implementation of WSD, this will fail. More than
that, due to the nature of the limited number of threads, in the case
that the server cannot keep up with the load, the clients wait for about a
second, thus giving the server time to act correctly.

By the way, here is something that I just thought of right now: if we
change our behavior to simply drop REQ packets for which the backlog
has been exceeded, we might deviate from the spec, but we should handle
stress situations better.
(This is a very small change in the code.)
What do you think?

Thanks
Tzachi

> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] 
> On Behalf Of Fabian Tillier
> Sent: Monday, August 07, 2006 7:48 PM
> To: Tzachi Dar
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] WSD: Behavior when the other 
> side has not yet called accept, is different from TCP/IP (Ethernet)
> 
> Hi Tzachi,
> 
> On 8/6/06, Tzachi Dar <tzachid at mellanox.co.il> wrote:
> >
> > http://openib.org/bugzilla/show_bug.cgi?id=189
> >
> > Hi Fab,
> >
> > We have noticed this misbehavior of WSD while doing some tests on
> > IPERF.
> >
> > This is probably not something to handle before WHQL, but we will 
> > probably have to fix it some day.
> > (We might even consider doing this as part of the CM behavior).
> 
> What should the error be on the connecting side?  The docs 
> say that when the backlog is exceeded, the provider should 
> fail the connection with WSAECONNREFUSED.  Perhaps the docs 
> are wrong and we should be returning  WSAECONNRESET?  Should 
> the results be the same as what happens if there's nobody listening?
> 
> Right now, if nobody is listening, WSD will return 
> WSAECONNREFUSED - I think this should really be WSAECONNRESET.
> 
> Anyhow, here's a patch that will return WSAECONNRESET if 
> there is no one listening or if the backlog is exceeded.  Let 
> me know if this corrects the behavior you are seeing.
> 
> - Fab
> 
> Index: ulp/wsd/user/ib_cm.c
> ===================================================================
> --- ulp/wsd/user/ib_cm.c	(revision 440)
> +++ ulp/wsd/user/ib_cm.c	(working copy)
> @@ -127,7 +127,7 @@
>  			/* Already too many connection requests are queued */
>  			IBSP_TRACE1( IBSP_DBG_CM,
>  				("already too many incoming connections, rejecting\n") );
> -			ib_reject( p_cm_req_rec->h_cm_req, IB_REJ_USER_DEFINED );
> +			ib_reject( p_cm_req_rec->h_cm_req, IB_REJ_INVALID_SID );
>  			break;
>  		}
> 
> @@ -433,10 +433,19 @@
>  		ibsp_conn_remove( socket_info );
> 
>  		IBSP_CHANGE_SOCKET_STATE( socket_info, IBSP_BIND );
> -		if( p_cm_rej_rec->rej_status == IB_REJ_TIMEOUT )
> +		switch( p_cm_rej_rec->rej_status )
> +		{
> +		case IB_REJ_TIMEOUT:
>  			ibsp_post_select_event( socket_info, FD_CONNECT, WSAETIMEDOUT );
> -		else
> +			break;
> +
> +		case IB_REJ_INVALID_SID:
> +			ibsp_post_select_event( socket_info, FD_CONNECT, WSAECONNRESET );
> +			break;
> +			
> +		default:
>  			ibsp_post_select_event( socket_info, FD_CONNECT, WSAECONNREFUSED );
> +		}
>  		break;
> 
>  	case IBSP_CONNECTED:
> 



