<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>Message</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.3243" name=GENERATOR></HEAD>
<BODY>
<DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial><FONT color=#0000ff><FONT
size=2><SPAN class=058395311-14022008>> </SPAN>Has anyone had experience with
IPoIB dying when an SRP initiator is not manually ejected before the target is
shut down?</FONT></FONT></FONT></SPAN></DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial><FONT color=#0000ff><FONT
size=2><SPAN class=058395311-14022008>> </SPAN>How is IPoIB related to SRP
Initiator? Why would one crash the other?</FONT></FONT></FONT></SPAN></DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=750413222-29122007><SPAN class=058395311-14022008><FONT
face=Arial color=#0000ff size=2>Here is a possible answer and a patch from
Yossi Leibovich.</FONT></SPAN></SPAN></DIV>
<DIV><SPAN class=750413222-29122007><SPAN class=058395311-14022008><FONT
face=Arial color=#0000ff size=2></FONT></SPAN></SPAN> </DIV></DIV>
<DIV><FONT face=Arial color=#0000ff size=2>While working with Windows SRP and
Linux SRPT (OFED), I discovered that after the SRPT is removed, the Windows side
stops receiving MADs.<BR>The problem is that SRP, while handling the DREQ
callback from the target, tries to reconnect and waits indefinitely (in the
context of the callback thread) for the connect operation to finish. That can
happen only when SRP gets either a timeout or a REP MAD, and neither can be
delivered before the DREQ callback returns.<BR>This is a deadlock, which
prevents any normal MAD handling and may explain the incorrect IPoIB
behaviour.</FONT></DIV>
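<DIV><FONT face=Arial color=#0000ff size=2>To illustrate the deadlock, here is a
minimal sketch in C. It assumes a single dispatcher that delivers CM events one
at a time, which is a simplification of the real callback thread. All names
(cm_sim_t, post_event, on_dreq, simulate, MAX_POLLS) are hypothetical and are
not part of the SRP driver or IBAL. The real code waits without any bound; the
sketch uses a bounded poll only so that it terminates.</FONT></DIV>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event types; EV_REP stands for "REP MAD or timeout". */
typedef enum { EV_DREQ, EV_REP } event_t;

#define QUEUE_CAP 8
#define MAX_POLLS 1000   /* stands in for the unbounded wait in the real code */

typedef struct {
    event_t queue[QUEUE_CAP];
    int     head, tail;
    bool    rep_handled;       /* set when the dispatcher delivers EV_REP */
    bool    rep_seen_in_dreq;  /* did the DREQ handler ever observe the REP? */
    int     polls;             /* iterations wasted waiting inside the callback */
} cm_sim_t;

static void post_event(cm_sim_t *s, event_t ev)
{
    assert(s->tail < QUEUE_CAP);
    s->queue[s->tail++] = ev;
}

/* DREQ handler. With wait_in_callback set it mimics the old behaviour:
 * start a reconnect, then wait for its completion before returning. */
static void on_dreq(cm_sim_t *s, bool wait_in_callback)
{
    post_event(s, EV_REP);     /* reconnect issued; its completion is now queued */
    if (!wait_in_callback)
        return;                /* return at once so the dispatcher can continue */

    /* The completion sits behind us in the queue and cannot be delivered
     * until this function returns, so the condition never becomes true. */
    while (!s->rep_handled && s->polls < MAX_POLLS)
        s->polls++;
    s->rep_seen_in_dreq = s->rep_handled;
}

/* Single dispatcher: delivers one event at a time, like one callback thread. */
static cm_sim_t simulate(bool wait_in_callback)
{
    cm_sim_t s = {0};
    post_event(&s, EV_DREQ);
    while (s.head < s.tail) {
        event_t ev = s.queue[s.head++];
        if (ev == EV_DREQ)
            on_dreq(&s, wait_in_callback);
        else
            s.rep_handled = true;
    }
    return s;
}
```

<DIV><FONT face=Arial color=#0000ff size=2>Calling simulate(true) uses up all
MAX_POLLS iterations without ever seeing the completion inside the handler,
while simulate(false) returns from the handler at once and the completion is
delivered on the next dispatcher iteration.</FONT></DIV>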
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2>This patch remove<SPAN
class=058395311-14022008>s</SPAN> the reconnect code from the <SPAN
class=058395311-14022008>DREQ </SPAN>callback.</FONT></DIV>
<DIV><SPAN class=058395311-14022008><FONT face=Arial color=#0000ff size=2>Any
comments are welcome.</FONT></SPAN></DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2>Index:
srp_connection.c<BR>===================================================================<BR>---
srp_connection.c (revision 2166)<BR>+++ srp_connection.c (working
copy)<BR>@@ -285,8 +285,8 @@<BR> ib_cm_drep_t
cm_drep;<BR> ib_api_status_t
status;<BR> int
i;<BR>- int
retry_count = 0;<BR> <BR>+<BR> SRP_ENTER( SRP_DBG_PNP
);<BR> <BR> SRP_PRINT( TRACE_LEVEL_INFORMATION, SRP_DBG_DEBUG,<BR>
@@ -334,75 +334,9 @@<BR> SRP_PRINT( TRACE_LEVEL_VERBOSE,
SRP_DBG_DEBUG,<BR> ("Session Object ref_cnt = %d\n",
p_srp_session->obj.ref_cnt) );<BR> cl_obj_destroy(
&p_srp_session->obj
);<BR>-<BR>- do<BR>- {<BR>- retry_count++;<BR>-<BR>- SRP_PRINT(
TRACE_LEVEL_INFORMATION, SRP_DBG_DEBUG,<BR>- ("Attempting to
reconnect %s. Connection Attempt Count = %d.\n",<BR>-
p_hba->ioc_info.profile.id_string,<BR>- retry_count)
);<BR>-<BR>- SRP_PRINT( TRACE_LEVEL_VERBOSE,
SRP_DBG_DEBUG,<BR>- ("Creating New Session For Service Entry
Index
%d.\n",<BR>- p_hba->ioc_info.profile.num_svc_entries));<BR>- p_srp_session
= srp_new_session(<BR>- p_hba, &p_hba->p_svc_entries[i],
&status );<BR>- if ( p_srp_session == NULL
)<BR>- {<BR>- status =
IB_INSUFFICIENT_MEMORY;<BR>- break;<BR>- }<BR>-<BR>- SRP_PRINT(
TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,<BR>- ("New Session For
Service Entry Index %d
Created.\n",<BR>- p_hba->ioc_info.profile.num_svc_entries));<BR>- SRP_PRINT(
TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,<BR>- ("Logging Into
Session.\n"));<BR>- status = srp_session_login( p_srp_session
);<BR>- if ( status == IB_SUCCESS
)<BR>- {<BR>- if ( p_hba->max_sg >
p_srp_session->connection.max_scatter_gather_entries
)<BR>- {<BR>- p_hba->max_sg =
p_srp_session->connection.max_scatter_gather_entries;<BR>- }<BR>-<BR>- if
( p_hba->max_srb_ext_sz > p_srp_session->connection.init_to_targ_iu_sz
)<BR>- {<BR>- p_hba->max_srb_ext_sz
=<BR>- sizeof( srp_send_descriptor_t )
-<BR>- SRP_MAX_IU_SIZE
+<BR>- p_srp_session->connection.init_to_targ_iu_sz;<BR>- }<BR>-<BR>- cl_obj_lock(
&p_hba->obj );<BR>- p_hba->session_list[i] =
p_srp_session;<BR>- cl_obj_unlock( &p_hba->obj
);<BR>-<BR>- SRP_PRINT( TRACE_LEVEL_VERBOSE,
SRP_DBG_DEBUG,<BR>- ("Session Login Issued
Successfully.\n"));<BR>- }<BR>- else<BR>- {<BR>- SRP_PRINT(
TRACE_LEVEL_ERROR, SRP_DBG_ERROR,<BR>- ("Session Login
Failure Status = %d.\n", status));<BR>- SRP_PRINT(
TRACE_LEVEL_VERBOSE, SRP_DBG_DEBUG,<BR>- ("Session
Object ref_cnt = %d\n", p_srp_session->obj.ref_cnt)
);<BR>- cl_obj_destroy( &p_srp_session->obj
);<BR>- }<BR>- } while ( (status != IB_SUCCESS) &&
(retry_count < 3) );<BR>-<BR>- if ( status == IB_SUCCESS
)<BR>- {<BR>- SRP_PRINT( TRACE_LEVEL_INFORMATION,
SRP_DBG_DEBUG,<BR>- ("Resuming Adapter for %s.\n",
p_hba->ioc_info.profile.id_string)
);<BR>- p_hba->adapter_paused =
FALSE;<BR>- StorPortReady( p_hba->p_ext
);<BR>-// StorPortNotification( BusChangeDetected, p_hba->p_ext, 0
);<BR>- }<BR>-<BR>+ <BR> SRP_EXIT( SRP_DBG_PNP
);<BR>+ return ;<BR> }<BR> <BR> /* __srp_cm_reply_cb
*/</FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV><BR>
<BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> ofw-bounces@lists.openfabrics.org
[mailto:ofw-bounces@lists.openfabrics.org] <B>On Behalf Of </B>Sufficool,
Stanley<BR><B>Sent:</B> Sunday, December 30, 2007 12:37 AM<BR><B>To:</B>
ofw@lists.openfabrics.org<BR><B>Subject:</B> [ofw] IPoIB crashes with SRP bad
shutdown?<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial size=2>Has anyone had
experience with IPoIB dying when an SRP initiator is not manually ejected
before the target is shut down?</FONT></SPAN></DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial size=2>I seem to be able
to reproduce this condition regularly with the OFED SRP Target and WinOF SRP
initiator. </FONT></SPAN></DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=750413222-29122007><FONT face=Arial size=2>How is IPoIB
related to SRP Initiator? Why would one crash the
other?</FONT></SPAN></DIV></BLOCKQUOTE></BODY></HTML>