[openib-general] user_mad.c: deadlock?

Michael S. Tsirkin mst at mellanox.co.il
Mon Nov 7 08:14:52 PST 2005


Hi, Roland!
This is not directly related to the last oops report
that I sent.

I noticed that unregister_mad_agent in mad.c flushes the port wq.
This can block until queued work has finished.

However, one of the things done on the work queue is
calling handlers for existing agents.

Looking at user_mad.c, ib_umad_close calls ib_unregister_mad_agent with
the port mutex held, while send_handler calls queue_packet, which
in turn takes the port mutex.

It seems, therefore, that we can have a deadlock inside user_mad:
ib_umad_close calls ib_unregister_mad_agent, which blocks
until send_handler has run, while send_handler is blocked on the port mutex.
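
To make the cycle concrete, here is a minimal userspace sketch of the
suspected ordering. It is a model under stated assumptions, not the kernel
code: pthreads stands in for the kernel mutex, the flush is modelled by
running the pending handler directly, and every model_* name is
hypothetical. The handler uses trylock so that the model reports the
problem instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the user_mad port mutex. */
static pthread_mutex_t port_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Set when the handler finds the mutex already held, i.e. when a
 * blocking lock at that point would never return. */
static bool would_deadlock;

/* Models send_handler -> queue_packet, which needs the port mutex. */
static void model_send_handler(void)
{
    if (pthread_mutex_trylock(&port_mutex) == EBUSY) {
        would_deadlock = true;  /* a blocking lock would wait here forever */
        return;
    }
    /* ... queue the packet ... */
    pthread_mutex_unlock(&port_mutex);
}

/* Models the wq flush done on unregister: it returns only after the
 * pending handler work has run (assumption: one handler is pending). */
static void model_unregister_agent(void)
{
    model_send_handler();
}

/* Models the close path as described above: unregister is called
 * while the port mutex is held. Returns true if the handler found
 * the mutex busy. */
static bool model_close_reports_deadlock(void)
{
    int rc;

    would_deadlock = false;
    rc = pthread_mutex_lock(&port_mutex);
    assert(rc == 0);
    model_unregister_agent();   /* flush runs while we hold the mutex */
    pthread_mutex_unlock(&port_mutex);
    return would_deadlock;
}
```

In this model, model_close_reports_deadlock() returns true: the handler
needs a mutex whose owner is waiting for that same handler to finish.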

A possible solution would be to move ib_unregister_mad_agent outside
the code section protected by the mutex.
Does this make sense?
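
A rough userspace sketch of that reordering, again only a model under
assumptions: pthreads instead of kernel primitives, a worker thread
standing in for the port wq, pthread_join standing in for the flush, and
all fixed_* names hypothetical. The only point it illustrates is that the
wait for the handler happens after the mutex has been released.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the user_mad port mutex. */
static pthread_mutex_t fixed_port_mutex = PTHREAD_MUTEX_INITIALIZER;
static int packets_queued;

/* Models send_handler -> queue_packet running from the work queue. */
static void *fixed_send_handler(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&fixed_port_mutex);
    packets_queued++;           /* stands in for queueing the packet */
    pthread_mutex_unlock(&fixed_port_mutex);
    return NULL;
}

/* Models the reordered close path; returns the number of packets the
 * handler queued before the flush completed. */
static int fixed_close(void)
{
    pthread_t worker;
    int rc;

    pthread_mutex_lock(&fixed_port_mutex);
    /* Handler work becomes pending while the mutex is still held. */
    rc = pthread_create(&worker, NULL, fixed_send_handler, NULL);
    assert(rc == 0);
    /* ... hypothetical cleanup that really needs the mutex ... */
    pthread_mutex_unlock(&fixed_port_mutex);

    /* The flush (modelled by join) now runs with the mutex released,
     * so the handler can take it and finish. */
    rc = pthread_join(worker, NULL);
    assert(rc == 0);
    return packets_queued;
}
```

Here fixed_close() returns 1 instead of hanging. Whether the real close
path can safely drop the mutex before unregistering depends on what else
that mutex protects there, which this model does not capture.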

-- 
MST


