[Openib-windows] RE: duplicate socket deadlock in WSD

Fab Tillier ftillier at silverstorm.com
Thu Oct 27 10:45:26 PDT 2005


Hi Yossi,

> From: Yossi Leybovich [mailto:sleybo at mellanox.co.il]
> Sent: Thursday, October 27, 2005 10:10 AM
> 
> I think there is a deadlock in the duplicate socket flow.
> The duplicate socket call invokes wait_cq_drain with sock_info->mutex held
> (ibsp_duplicate.c line 313), and the busy-wait loop in wait_cq_drain does
> not release the mutex while it waits for the counters to reach 0.
> But the completion function (copletion_wq), in the flush-in-error case,
> tries to acquire the same mutex, so it cannot continue and we are in
> deadlock.
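
To check that I read the problem correctly, here is a minimal standalone
sketch of the lock ordering. It uses pthreads and C11 atomics rather than the
real WSD code, so everything in it is an assumption made for illustration:
demo_socket, completion_thread, demo_wait_cq_drain and demo_duplicate are
hypothetical names, and a pthread mutex stands in for socket_info->mutex. The
waiter drops the mutex before the busy-wait and takes it again afterwards,
so the completion path can always make progress.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for socket_info; not the real WSD structure. */
struct demo_socket {
	pthread_mutex_t mutex;	/* plays the role of socket_info->mutex */
	atomic_int pending;	/* outstanding work requests */
	int flushed;		/* completion-side state, guarded by mutex */
};

/* Completion path: like the flush-in-error case, it needs the mutex. */
static void *completion_thread(void *arg)
{
	struct demo_socket *s = arg;

	while (atomic_load(&s->pending) > 0) {
		pthread_mutex_lock(&s->mutex);
		s->flushed++;
		pthread_mutex_unlock(&s->mutex);
		/* Only after the locked work is done does the counter drop. */
		atomic_fetch_sub(&s->pending, 1);
	}
	return NULL;
}

/* Busy-wait until every outstanding completion has been processed. */
static void demo_wait_cq_drain(struct demo_socket *s)
{
	while (atomic_load(&s->pending) != 0)
		sched_yield();
}

/* Returns the number of completions processed, or -1 on thread failure. */
int demo_duplicate(int outstanding)
{
	struct demo_socket s;
	pthread_t tid;
	int flushed;

	pthread_mutex_init(&s.mutex, NULL);
	atomic_init(&s.pending, outstanding);
	s.flushed = 0;

	pthread_mutex_lock(&s.mutex);
	if (pthread_create(&tid, NULL, completion_thread, &s) != 0) {
		pthread_mutex_unlock(&s.mutex);
		pthread_mutex_destroy(&s.mutex);
		return -1;
	}

	/*
	 * Release the mutex before draining. Holding it across the wait
	 * would block completion_thread forever and the counter would
	 * never reach zero.
	 */
	pthread_mutex_unlock(&s.mutex);
	demo_wait_cq_drain(&s);
	pthread_mutex_lock(&s.mutex);

	assert(atomic_load(&s.pending) == 0);
	flushed = s.flushed;
	pthread_mutex_unlock(&s.mutex);

	pthread_join(tid, NULL);
	pthread_mutex_destroy(&s.mutex);
	return flushed;
}
```

Calling demo_duplicate(8) from a small main() should return 8. Moving the
pthread_mutex_unlock call below demo_wait_cq_drain reproduces the hang
described above.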

Can you try the following patch and let me know if it resolves things?  If so,
I'll commit.

Thanks,

- Fab

Index: ulp/wsd/user/ib_cm.c
===================================================================
--- ulp/wsd/user/ib_cm.c	(revision 127)
+++ ulp/wsd/user/ib_cm.c	(working copy)
@@ -156,12 +156,14 @@
 		{
 			int ret;
 
-			wait_cq_drain( socket_info );
-
 			/* Non-blocking cancel since we're in CM callback context */
 			ib_cm_cancel( socket_info->listen.handle, NULL );
 			socket_info->listen.handle = NULL;
+			cl_spinlock_release( &socket_info->mutex );
 
+			wait_cq_drain( socket_info );
+
+			cl_spinlock_acquire( &socket_info->mutex );
 			ret = ib_accept( socket_info, p_cm_req_rec );
 			if( ret )
 			{
Index: ulp/wsd/user/ibsp_duplicate.c
===================================================================
--- ulp/wsd/user/ibsp_duplicate.c	(revision 127)
+++ ulp/wsd/user/ibsp_duplicate.c	(working copy)
@@ -310,10 +310,10 @@
 
 	cl_spinlock_release( &socket_info->mutex );
 	ib_disconnect( socket_info, &reason );
-	cl_spinlock_acquire( &socket_info->mutex );
 
 	wait_cq_drain( socket_info );
 
+	cl_spinlock_acquire( &socket_info->mutex );
 	ib_destroy_socket( socket_info );
 
 	/* Put enough info in dup_info so that the remote socket can recreate the connection. */
Index: ulp/wsd/user/ibsp_iblow.c
===================================================================
--- ulp/wsd/user/ibsp_iblow.c	(revision 127)
+++ ulp/wsd/user/ibsp_iblow.c	(working copy)
@@ -127,6 +127,7 @@
 			cl_spinlock_release( &socket_info->recv_lock );
 
 			cl_spinlock_release( &socket_info->mutex );
+			p_io_info->p_ov = NULL;
 			IBSP_EXIT( IBSP_DBG_IO );
 			return;
 		}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wsd_dup.patch
Type: application/octet-stream
Size: 1575 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20051027/38c1f186/attachment.obj>

