[ofw] [PATCH] SRP dreq deadlock

Gilad Shainer Shainer at Mellanox.com
Sun Feb 17 07:35:16 PST 2008


Usha,

Can you or Alex send what issues Alex has patches for? It could be that
others are facing the same issues and trying to solve the same problems.

Thanks,
Gilad.


> -----Original Message-----
> From: Usha Srinivasan [mailto:usha.srinivasan at qlogic.com]
> Sent: Thursday, February 14, 2008 5:07 PM
> To: Yossi Leybovich; Leonid Keller; ofw at lists.openfabrics.org
> Subject: RE: [ofw] [PATCH] SRP dreq deadlock
> 
> Hi Yossi,
> Alex Estrin here at QLogic KOP has a couple of SRP patches pending to 
> be merged into Win OF. One patch addresses this deadlock bug along 
> with QP err handling and session recovery. A second patch will add a 
> bit for flow control handling in order to withstand heavy load.
> 
> Alex will be preparing his patch and will submit them to the community

> for review hopefully next week.
> 
> Usha
> 
> -----Original Message-----
> From: ofw-bounces at lists.openfabrics.org 
> [mailto:ofw-bounces at lists.openfabrics.org] On Behalf Of Yossi 
> Leybovich
> Sent: Thursday, February 14, 2008 5:58 AM
> To: Leonid Keller; ofw at lists.openfabrics.org
> Subject: [ofw] [PATCH] SRP dreq deadlock
> 
> Leonid
> 
> While working Windows SRP with Linux SRPT (OFED) I discover that after

> removing the SRPT the windows side stop receiving MADs.
> The problem is that the SRP while handling the dreq callback of the 
> target try to reconnect and wait (in the context of the callback
> thread) to the connect operation to end.
> This is deadlock as the operation will not finish till SRP release the

> callback thread.
> 
> This patch remove the reconnect code from the dreq callback.
> 
> Thanks
> Yossi
> Index: srp_connection.c
> ===================================================================
> --- srp_connection.c	(revision 2166)
> +++ srp_connection.c	(working copy)
> @@ -285,8 +285,8 @@
>  	ib_cm_drep_t    cm_drep;
>  	ib_api_status_t status;
>  	int             i;
> -	int             retry_count = 0;
> 
> +
>  	SRP_ENTER( SRP_DBG_PNP );
> 
>  	SRP_PRINT( TRACE_LEVEL_INFORMATION, SRP_DBG_DEBUG, @@ -334,75
> +334,9 @@
>  	SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
>  		("Session Object ref_cnt = %d\n",
> p_srp_session->obj.ref_cnt) );
>  	cl_obj_destroy( &p_srp_session->obj );
> -
> -	do
> -	{
> -		retry_count++;
> -
> -		SRP_PRINT( TRACE_LEVEL_INFORMATION, SRP_DBG_DEBUG,
> -			("Attempting to reconnect %s. Connection Attempt
> Count = %d.\n",
> -			 p_hba->ioc_info.profile.id_string,
> -			 retry_count) );
> -
> -		SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
> -			("Creating New Session For Service Entry Index
> %d.\n",
> -			p_hba->ioc_info.profile.num_svc_entries));
> -		p_srp_session = srp_new_session(
> -			p_hba, &p_hba->p_svc_entries[i], &status );
> -		if ( p_srp_session == NULL )
> -		{
> -			status = IB_INSUFFICIENT_MEMORY;
> -			break;
> -		}
> -
> -		SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
> -			("New Session For Service Entry Index %d
> Created.\n",
> -			p_hba->ioc_info.profile.num_svc_entries));
> -		SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
> -			("Logging Into Session.\n"));
> -		status = srp_session_login( p_srp_session );
> -		if ( status == IB_SUCCESS )
> -		{
> -			if ( p_hba->max_sg >
> p_srp_session->connection.max_scatter_gather_entries )
> -			{
> -				p_hba->max_sg =
> p_srp_session->connection.max_scatter_gather_entries;
> -			}
> -
> -			if ( p_hba->max_srb_ext_sz >
> p_srp_session->connection.init_to_targ_iu_sz )
> -			{
> -				p_hba->max_srb_ext_sz =
> -					sizeof( srp_send_descriptor_t )
> -
> -					SRP_MAX_IU_SIZE +
> -
> p_srp_session->connection.init_to_targ_iu_sz;
> -			}
> -
> -			cl_obj_lock( &p_hba->obj );
> -			p_hba->session_list[i] = p_srp_session;
> -			cl_obj_unlock( &p_hba->obj );
> -
> -			SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
> -				("Session Login Issued
> Successfully.\n"));
> -		}
> -		else
> -		{
> -			SRP_PRINT( TRACE_LEVEL_ERROR, SRP_DBG_ERROR,
> -				("Session Login Failure Status = %d.\n",
> status));
> -			SRP_PRINT( TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,
> -				 ("Session Object ref_cnt = %d\n",
> p_srp_session->obj.ref_cnt) );
> -			cl_obj_destroy( &p_srp_session->obj );
> -		}
> -	} while ( (status != IB_SUCCESS) && (retry_count < 3) );
> -
> -	if ( status == IB_SUCCESS )
> -	{
> -		SRP_PRINT( TRACE_LEVEL_INFORMATION, SRP_DBG_DEBUG,
> -			 ("Resuming Adapter for %s.\n",
> p_hba->ioc_info.profile.id_string) );
> -		p_hba->adapter_paused = FALSE;
> -		StorPortReady( p_hba->p_ext );
> -//		StorPortNotification( BusChangeDetected, p_hba->p_ext, 0
> );
> -	}
> -
> +
>  	SRP_EXIT( SRP_DBG_PNP );
> +	return ;
>  }
> 
>  /* __srp_cm_reply_cb */
> 
> 




More information about the ofw mailing list