[ofw] deadlock starting in __free_mads

Tzachi Dar tzachid at mellanox.co.il
Tue Jan 4 00:03:41 PST 2011


This is a hard question; here is our best guess :-)

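The function __free_mads is called from free_al. Each mad holds a reference on the al object, so as long as there are outstanding mads, free_al is not called, and neither is __free_mads. So in normal operation __free_mads presumably finds no mads left to put, and the second attempt to take the lock never happens.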

We started seeing this bug because we tried a workaround for the leak problems that allows objects to be released even while there are still references on them. This is what made __free_mads start running at all with mads still outstanding.
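
A minimal sketch of the sequence described in this thread, in plain C with a toy lock so that the re-acquire is counted instead of actually hanging. Everything prefixed toy_ is hypothetical and only mirrors the names from the thread (__free_mads, ib_put_mad, al_remove_mad, h_al->mad_lock); the real function bodies are not shown here, and the force flag is an assumed stand-in for the leak workaround.

```c
#include <assert.h>

/* Toy non-recursive lock: a second acquire returns 0 where a real lock would hang. */
typedef struct { int held; } toy_lock_t;

static int toy_acquire(toy_lock_t *l)
{
    if (l->held)
        return 0;               /* already held: this is the deadlock case */
    l->held = 1;
    return 1;
}

static void toy_release(toy_lock_t *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical stand-in for the al object. */
typedef struct {
    toy_lock_t mad_lock;
    int ref_cnt;                /* one for the owner plus one per outstanding mad */
    int mad_cnt;                /* mads still tracked by this al */
    int deadlocks;              /* nested lock attempts that would have hung */
} toy_al_t;

static void toy_get_mad(toy_al_t *al)
{
    al->mad_cnt++;
    al->ref_cnt++;              /* each mad takes a reference on the al */
}

/* Stand-in for al_remove_mad: takes mad_lock itself. */
static void toy_remove_mad(toy_al_t *al)
{
    if (!toy_acquire(&al->mad_lock)) {
        al->deadlocks++;        /* caller already holds mad_lock */
        return;
    }
    al->mad_cnt--;
    toy_release(&al->mad_lock);
}

/* Stand-in for ib_put_mad. */
static void toy_put_mad(toy_al_t *al)
{
    toy_remove_mad(al);
    al->ref_cnt--;
}

/* Stand-in for __free_mads: holds mad_lock while putting every remaining mad. */
static void toy_free_mads(toy_al_t *al)
{
    int i, n;

    (void)toy_acquire(&al->mad_lock);
    n = al->mad_cnt;
    for (i = 0; i < n; i++)
        toy_put_mad(al);
    toy_release(&al->mad_lock);
}

/* Drop the owner's reference; force models releasing despite references. */
static int toy_deref_al(toy_al_t *al, int force)
{
    al->ref_cnt--;
    if (al->ref_cnt == 0 || force) {
        toy_free_mads(al);
        return 1;
    }
    return 0;
}

/* Returns how many nested lock attempts occurred for a given setup. */
static int toy_scenario(int outstanding, int force)
{
    toy_al_t al = { {0}, 1, 0, 0 };
    int i;

    for (i = 0; i < outstanding; i++)
        toy_get_mad(&al);
    toy_deref_al(&al, force);
    return al.deadlocks;
}
```

With no outstanding mads, or with outstanding mads and no forced release, toy_scenario reports no nested lock attempts. Only the forced release with mads still outstanding reaches the nested acquire, which would be consistent with the bug staying hidden until the workaround was tried.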

Thanks
Tzachi


> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Monday, January 03, 2011 6:44 PM
> To: Tzachi Dar; ofw at lists.openfabrics.org; sw_net_windows
> Subject: RE: deadlock starting in __free_mads
> 
> > I have reached a deadlock caused by the fact that the function
> > __free_mads takes h_al->mad_lock
> >
> > It then calls ib_put_mad which calls al_remove_mad that will try to
> > take the same lock.
> 
> How did this not blow up 4 years ago?



