[openib-general] Re: user_mad.c: deadlock?

Michael S. Tsirkin mst at mellanox.co.il
Wed Nov 9 09:50:51 PST 2005


Quoting r. Roland Dreier <rolandd at cisco.com>:
> Subject: Re: user_mad.c: deadlock?
> 
>     Michael> What about ib_umad_reg_agent error handling code in
>     Michael> ib_umad_reg_agent?  That seems to still call
>     Michael> ib_umad_unreg_agent from under the down_write.
> 
> I'm pretty sure that's OK, because it's not possible for any send
> requests to have been posted to that agent if we hit that error path.
> 
>     Michael> And what about ib_umad_kill_port, which also does this?
> 
> That looks like a problem.  I think we need the following (the locking
> should still be OK, there's no need to do all the cleanup atomically).
> 
> --- infiniband/core/user_mad.c	(revision 3989)
> +++ infiniband/core/user_mad.c	(working copy)
> @@ -849,6 +849,7 @@ err_cdev:
>  static void ib_umad_kill_port(struct ib_umad_port *port)
>  {
>  	struct ib_umad_file *file;
> +	struct ib_mad_agent *agent;
>  	int id;
>  
>  	class_set_devdata(port->class_dev,    NULL);
> @@ -865,19 +866,21 @@ static void ib_umad_kill_port(struct ib_
>  	spin_unlock(&port_lock);
>  
>  	down_write(&port->mutex);
> -
>  	port->ib_dev = NULL;
> +	up_write(&port->mutex);
>  
>  	list_for_each_entry(file, &port->file_list, port_list)
>  		for (id = 0; id < IB_UMAD_MAX_AGENTS; ++id) {
> -			if (!file->agent[id])
> -				continue;
> -			ib_dereg_mr(file->mr[id]);
> -			ib_unregister_mad_agent(file->agent[id]);
> +			down_write(&port->mutex);
> +			agent = file->agent[id];
>  			file->agent[id] = NULL;
> -		}
> +			up_write(&port->mutex);
>  
> -	up_write(&port->mutex);
> +			if (agent) {
> +				ib_unregister_mad_agent(agent);
> +				ib_dereg_mr(file->mr[id]);
> +			}
> +		}
>  
>  	clear_bit(port->dev_num, dev_map);
>  }
> 

I'm not convinced.
What would prevent ib_umad_close from touching the list if we release the mutex?

-- 
MST



More information about the general mailing list