[openib-general] [PATCHv2] osm: OSM crash TRIVIAL bug fix

Yevgeny Kliteynik kliteyn at mellanox.co.il
Thu Aug 17 06:28:49 PDT 2006


Hi Hal.

> This line wrapped so there is something wrong with your mailer.

I'm using a different mailer now, so I hope that it's ok now.
  
> > +  m->v = NULL; /* just make sure we do not point tofree'd madw */
> 
> Also, is this line really needed (and if so why) ? I know you did say 
> "it cleans up old pointers to retrieved madw" but this shouldn't be 
> accessed, right ?

You're right, it shouldn't be accessed.
But generally, it's a good practice to assign a null to 
any pointer that points to a freed memory, and should not
be in use any more.

> Also, if this is added here, there are other places where the same
> thing should be done ?

I just examined this area of code, so this is what I saw.

-- 
  
Regards,
 
Yevgeny Kliteynik

Mellanox Technologies LTD
Tel: +972-4-909-7200 ext: 394
Fax: +972-4-959-3245
P.O. Box 586 Yokneam 20692 ISRAEL 

On Mon, 2006-08-14 at 16:05 +0300, Hal Rosenstock wrote:
> Hi Yevgeny,
> 
> On Mon, 2006-08-14 at 04:13, Yevgeny Kliteynik wrote: 
> > Hi Hal. 
> >  
> > This patch fixes an OSM crash when working with Cisco's stack. 
> > Cisco's doesn't follow the same TID convention when generating 
> > transaction id which in some bad flow revealed this bug in the 
> > get_madw lookup. 
> > The bug is in get_madw which does not cleanup old pointers to  
> > retrieved madw and also does not detect lookup of its reserved
> "free" 
> > entry of key==0.  
> >  
> > (This better text replaces my previous patch: 
> >  "OSM crash when working with Cisco's TopSpin stack") 
> >  
> > Yevgeny
> 
> Thanks. Good find.
> 
> > Signed-off-by:  Yevgeny Kliteynik < kliteyn at mellanox.co.il> 
> >  
> >  
> > Index: osm/libvendor/osm_vendor_ibumad.c 
> > =================================================================== 
> > --- osm/libvendor/osm_vendor_ibumad.c   (revision 8614) 
> > +++ osm/libvendor/osm_vendor_ibumad.c   (working copy) 
> > @@ -141,12 +141,20 @@ get_madw(osm_vendor_t *p_vend, ib_net64_ 
> >         ib_net64_t mtid = (*tid &
> cl_ntoh64(0x00000000ffffffffllu)); 
> >         osm_madw_t *res; 
> >  
> > +    /* 
> > +     * Since mtid == 0 is the empty key we should not 
> > +     * waste time looking for it 
> > +     */ 
> > +    if (mtid == 0) 
> > +       return 0; 
> > + 
> >         cl_spinlock_acquire( &p_vend->match_tbl_lock );  
> >         for (m = p_vend->mtbl.tbl, e = m + p_vend->mtbl.max; m < e; 
> > m++) { 
> >                 if (m->tid == mtid) { 
> >                         m->tid = 0; 
> >                         *tid = mtid; 
> >                         res = m->v; 
> > +                       m->v = NULL; /* just make sure we do not
> point 
> > to free'd madw */
> 
> This line wrapped so there is something wrong with your mailer.
> 
> Also, is this line really needed (and if so why) ? I know you did say 
> "it cleans up old pointers to retrieved madw" but this shouldn't be 
> accessed, right ? Also, if this is added here, there are other places 
> where the same thing should be done ?
> 
> >
> cl_spinlock_release( &p_vend->match_tbl_lock 
> > ); 
> >                         return res; 
> >                 } 
> > 
> 
> Applied to trunk and 1.1 with the exception noted above.
> 
> -- Hal
> 






More information about the general mailing list