[ofw] Double tracking of AV (address vectors) by the PD and the send_mad

Tzachi Dar tzachid at mellanox.co.il
Tue Jul 5 03:15:37 PDT 2011


Please see below.

Thanks
Tzachi

> -----Original Message-----
> From: Hefty, Sean [mailto:sean.hefty at intel.com]
> Sent: Friday, July 01, 2011 7:57 PM
> To: Tzachi Dar; ofw at lists.openfabrics.org
> Subject: RE: Double tracking of AV (address vectors) by the PD and the
> send_mad
> 
> > While debugging a bug in the IBAL code, we came to the conclusion that send
> > MADs use address vectors. These AVs are handled by the MAD, which returns
> > them to the pool once it is done with them.
> 
> This sounds right.  Do you know what PD the AVs are created on?  Is it an
> internal PD associated with some global instance?
> 
The PD is a global PD that is created by IBAL. The AL is also the global one.

> > However, when the IBAL instance is closed, the PD also gets closed. When it
> > is closed, it goes over all its children (i.e. the AVs) and returns them to
> > the pool. This leads to data corruption in the AV pool.
> >
> > Is there something that we don't understand here, or is this a bug?
> 
> Sounds like a possible bug.  I guess we need to know if MADs can be outstanding
> at the time the relevant PD is destroyed.

We believe the correct solution is to take a reference on the AV every time something uses it, and to release that reference once the user has finished with it. In order not to break user applications, we would like to make the change in the kernel only.
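
To make the proposal concrete, here is a minimal sketch of the reference counting idea. It is not IBAL code: the types and functions (av_t, av_pool_t, av_create, av_ref, av_deref) are hypothetical names made up for illustration, and the sketch is single threaded, so a real kernel change would need atomic counters or locking. The point it shows is that the AV goes back to the pool only when the last holder (the PD or a send MAD) drops its reference, so it can never be returned twice.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Hypothetical AV object; not the real IBAL structure. */
typedef struct av {
    int ref_cnt;      /* number of current holders (PD, in-flight MADs) */
    int in_pool;      /* 1 while the AV sits on the free list */
    struct av *next;  /* free-list link */
} av_t;

/* Hypothetical fixed-size pool with a singly linked free list. */
typedef struct av_pool {
    av_t slots[POOL_SIZE];
    av_t *free_list;
    int free_cnt;
} av_pool_t;

void pool_init(av_pool_t *pool)
{
    pool->free_list = NULL;
    pool->free_cnt = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        pool->slots[i].ref_cnt = 0;
        pool->slots[i].in_pool = 1;
        pool->slots[i].next = pool->free_list;
        pool->free_list = &pool->slots[i];
        pool->free_cnt++;
    }
}

/* Take an AV from the pool; the creator (here the PD) holds the first reference. */
av_t *av_create(av_pool_t *pool)
{
    av_t *av = pool->free_list;
    if (av == NULL)
        return NULL;
    pool->free_list = av->next;
    pool->free_cnt--;
    av->next = NULL;
    av->in_pool = 0;
    av->ref_cnt = 1;
    return av;
}

/* Called by every additional user of the AV, e.g. a send MAD. */
void av_ref(av_t *av)
{
    assert(av->ref_cnt > 0);
    av->ref_cnt++;
}

/* Drop one reference; only the last holder puts the AV back in the pool. */
void av_deref(av_pool_t *pool, av_t *av)
{
    assert(av->ref_cnt > 0 && !av->in_pool);
    if (--av->ref_cnt == 0) {
        av->in_pool = 1;
        av->next = pool->free_list;
        pool->free_list = av;
        pool->free_cnt++;
    }
}

/* Scenario from this thread: the PD is closed while a send MAD still uses the AV. */
void demo_pd_closed_with_mad_in_flight(int *free_while_pending, int *free_after)
{
    av_pool_t pool;
    pool_init(&pool);

    av_t *av = av_create(&pool);  /* PD holds one reference */
    av_ref(av);                   /* send MAD takes its own reference */

    av_deref(&pool, av);          /* PD closes: AV must stay out of the pool */
    *free_while_pending = pool.free_cnt;

    av_deref(&pool, av);          /* MAD completes: AV is returned exactly once */
    *free_after = pool.free_cnt;
}
```

In this sketch the PD close path drops the PD's reference instead of pushing the AV back to the pool directly, and the MAD completion path does the same for its own reference. Whichever of the two runs last is the one that actually returns the AV.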





