[openib-general] Re: user_mad.c: deadlock?

Roland Dreier rolandd at cisco.com
Wed Nov 9 09:13:15 PST 2005


    Michael> What about ib_umad_reg_agent error handling code in
    Michael> ib_umad_reg_agent?  That seems to still call
    Michael> ib_umad_unreg_agent from under the down_write.

I'm pretty sure that's OK, because it's not possible for any send
requests to have been posted to that agent if we hit that error path.

    Michael> And what about ib_umad_kill_port, which also does this?

That looks like a problem.  I think we need the following (the locking
should still be OK, there's no need to do all the cleanup atomically).

--- infiniband/core/user_mad.c	(revision 3989)
+++ infiniband/core/user_mad.c	(working copy)
@@ -849,6 +849,7 @@ err_cdev:
 static void ib_umad_kill_port(struct ib_umad_port *port)
 {
 	struct ib_umad_file *file;
+	struct ib_mad_agent *agent;
 	int id;
 
 	class_set_devdata(port->class_dev,    NULL);
@@ -865,19 +866,21 @@ static void ib_umad_kill_port(struct ib_
 	spin_unlock(&port_lock);
 
 	down_write(&port->mutex);
-
 	port->ib_dev = NULL;
+	up_write(&port->mutex);
 
 	list_for_each_entry(file, &port->file_list, port_list)
 		for (id = 0; id < IB_UMAD_MAX_AGENTS; ++id) {
-			if (!file->agent[id])
-				continue;
-			ib_dereg_mr(file->mr[id]);
-			ib_unregister_mad_agent(file->agent[id]);
+			down_write(&port->mutex);
+			agent = file->agent[id];
 			file->agent[id] = NULL;
-		}
+			up_write(&port->mutex);
 
-	up_write(&port->mutex);
+			if (agent) {
+				ib_unregister_mad_agent(agent);
+				ib_dereg_mr(file->mr[id]);
+			}
+		}
 
 	clear_bit(port->dev_num, dev_map);
 }



More information about the general mailing list