[ofw] patch[ipoib]: fix a deadlock

Uri Habusha urih at mellanox.co.il
Mon Feb 14 07:16:11 PST 2011


This check-in fixes a deadlock that we hit:
- The first thread gets the port lock, increments endpt_rdr and releases the lock.
- The second thread then gets the lock and spins, while still holding it, waiting for endpt_rdr to drop to 0.
- The first thread tries to get the object lock and gets stuck. As a result it can never decrement endpt_rdr.

The fix releases and re-acquires the lock on every iteration of the wait loops, so the reader can take the lock and decrement endpt_rdr.

The call stack is attached.
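
For illustration only, the scenario above and the unlock/re-lock wait loop can be sketched in portable C++. This is not the driver code: PortSketch, reader(), teardown() and run_teardown_demo() are hypothetical names, std::mutex stands in for cl_obj_lock()/cl_obj_unlock() on p_port->obj, endpt_rdr is modelled as a std::atomic<int>, and the yield between unlock and lock is an addition that the patch itself does not contain.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the port object; not the real ipoib_port_t.
struct PortSketch {
    std::mutex obj_lock;                  // plays the role of p_port->obj
    std::atomic<int> endpt_rdr{0};        // number of active endpoint readers
    std::atomic<bool> registered{false};  // demo-only flag: reader has incremented
};

// Reader: registers under the lock, works without it, then needs the
// lock again before it can drop its reference.
void reader(PortSketch& port) {
    {
        std::lock_guard<std::mutex> guard(port.obj_lock);
        ++port.endpt_rdr;
    }
    port.registered = true;
    std::this_thread::sleep_for(std::chrono::milliseconds(20));  // simulated work
    {
        std::lock_guard<std::mutex> guard(port.obj_lock);
        --port.endpt_rdr;
    }
}

// Teardown: waits for all readers to finish. With an empty loop body
// (the pre-patch code) the lock is never released and the reader above
// can never decrement endpt_rdr.
int teardown(PortSketch& port) {
    std::unique_lock<std::mutex> guard(port.obj_lock);
    while (port.endpt_rdr) {
        guard.unlock();             // as in the patch: cl_obj_unlock()
        std::this_thread::yield();  // sketch-only addition, not in the patch
        guard.lock();               // as in the patch: cl_obj_lock()
    }
    return port.endpt_rdr;          // read while holding the lock
}

// Runs one reader against one teardown and returns the reader count
// observed by teardown when its wait loop exits.
int run_teardown_demo() {
    PortSketch port;
    std::thread worker([&port] { reader(port); });
    while (!port.registered) {
        std::this_thread::yield();  // make sure teardown really has to wait
    }
    int remaining = teardown(port);
    worker.join();
    return remaining;
}
```

Calling run_teardown_demo() from a small main() should return once the reader has dropped its reference. Replacing the loop body in teardown() with an empty statement reproduces the hang described above.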

Index: ipoib_port.cpp
===================================================================
--- ipoib_port.cpp            (revision 3095)
+++ ipoib_port.cpp         (working copy)
@@ -7205,7 +7205,10 @@

                /* Wait for all readers to complete. */
                while( p_port->endpt_rdr )
-                              ;
+             {
+                             cl_obj_unlock( &p_port->obj );
+                             cl_obj_lock( &p_port->obj );
+             }
                /*
                 * We don't need to initiate destruction - this is called only
                 * from the __port_destroying function, and destruction cascades
@@ -7240,7 +7243,10 @@
                cl_obj_lock( &p_port->obj );
                /* Wait for all readers to complete. */
                while( p_port->endpt_rdr )
-                              ;
+             {
+                             cl_obj_unlock( &p_port->obj );
+                             cl_obj_lock( &p_port->obj );
+             }

 #if 0
                __endpt_mgr_remove_all(p_port);
@@ -7410,7 +7416,10 @@
                cl_obj_lock( &p_port->obj );
                /* Wait for all readers to complete. */
                while( p_port->endpt_rdr > 1 )
-                              ;
+             {
+                             cl_obj_unlock( &p_port->obj );
+                             cl_obj_lock( &p_port->obj );
+             }

                /* Remove the endpoint from the maps so further requests don't find it. */
                cl_qmap_remove_item( &p_port->endpt_mgr.mac_endpts, &p_endpt->mac_item );
@@ -7869,7 +7878,10 @@

                /* Wait for all readers to finish */
                while( p_port->endpt_rdr )
-                              ;
+             {
+                             cl_obj_unlock( &p_port->obj );
+                             cl_obj_lock( &p_port->obj );
+             }
                p_item = cl_qmap_remove( &p_port->endpt_mgr.mac_endpts, key );
                /*
                 * Dereference the endpoint.  If the ref count goes to zero, it
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ipoib_deadlock.txt
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20110214/0a4a763f/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipoib_deadlock.patch
Type: application/octet-stream
Size: 1445 bytes
Desc: ipoib_deadlock.patch
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20110214/0a4a763f/attachment.obj>