[openib-general] Re: user_mad.c: deadlock?

Michael S. Tsirkin mst at mellanox.co.il
Wed Nov 9 13:48:20 PST 2005


Quoting r. Roland Dreier <rolandd at cisco.com>:
> Subject: Re: user_mad.c: deadlock?
> 
>     Roland> And also we need ib_umad_kill_port() to wait for any
>     Roland> in-progress ib_umad_close() calls, since we don't want to
>     Roland> call ib_unregister_mad_agent() after we've returned from
>     Roland> the device removal call.
> 
>     Michael> This should work fine too since the last down_write that
>     Michael> detects that the list is empty will flush these guys
>     Michael> out.
> 
> The problem I run into trying to implement this is that both
> ib_umad_close() and ib_umad_kill_port() need to do something like:
> 
> 	down_write(&port->mutex);
> 	agent = file->agent[id];
> 	file->agent[id] = NULL;
> 	up_write(&port->mutex);
> 
> 	if (agent)
> 		ib_unregister_mad_agent(agent);
> 
> but ib_umad_close() could pause arbitrarily long right before the
> ib_unregister_mad_agent() call and then end up calling the function
> after the device is already gone.
> 
>  - R.
> 

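I think I see a solution: replace up_write with downgrade_write, and
call up_read once ib_unregister_mad_agent has returned. This way
ib_umad_close holds the write lock only while it is changing the list
and a read lock until the agent is unregistered, so a down_write in
ib_umad_kill_port cannot complete while an unregister is still pending.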
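
Roughly, the locking order could look like the sketch below. This is
untested and only an illustration: the rwsem functions here are
single-threaded userspace stand-ins that track lock state (where the
kernel would block, they assert), the structs are cut down to the fields
used, and umad_drop_agent(), umad_kill_port_wait() and UMAD_MAX_AGENTS
are hypothetical names, not anything in user_mad.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel rwsem API (assumption: state tracking only). */
struct rw_semaphore { int writer; int readers; };

static void down_write(struct rw_semaphore *s)
{
	/* The kernel would sleep here until no reader or writer is left. */
	assert(!s->writer && !s->readers);
	s->writer = 1;
}

static void up_write(struct rw_semaphore *s)
{
	assert(s->writer);
	s->writer = 0;
}

static void downgrade_write(struct rw_semaphore *s)
{
	/* Write lock becomes a read lock without ever being released. */
	assert(s->writer);
	s->writer = 0;
	s->readers++;
}

static void up_read(struct rw_semaphore *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Cut-down, hypothetical versions of the real structures. */
#define UMAD_MAX_AGENTS 2
struct ib_mad_agent { int registered; };
struct ib_umad_port { struct rw_semaphore mutex; };
struct ib_umad_file { struct ib_mad_agent *agent[UMAD_MAX_AGENTS]; };

/* Stub: the real call tears down the agent and may sleep. */
static void ib_unregister_mad_agent(struct ib_mad_agent *agent)
{
	agent->registered = 0;
}

/* What ib_umad_close() (and the kill path) would do for each agent. */
static void umad_drop_agent(struct ib_umad_port *port,
			    struct ib_umad_file *file, int id)
{
	struct ib_mad_agent *agent;

	down_write(&port->mutex);
	agent = file->agent[id];
	file->agent[id] = NULL;
	downgrade_write(&port->mutex);	/* was: up_write() */

	if (agent)
		ib_unregister_mad_agent(agent);

	up_read(&port->mutex);		/* only now can a writer get in */
}

/* Final step of ib_umad_kill_port(): flush out in-progress closes. */
static void umad_kill_port_wait(struct ib_umad_port *port)
{
	down_write(&port->mutex);	/* waits for every read-lock holder */
	up_write(&port->mutex);
}
```

The point is that there is no window between clearing file->agent[id]
and calling ib_unregister_mad_agent in which the lock is not held, so
the last down_write in the removal path cannot return while a close is
still about to unregister.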

-- 
MST


