[Openib-windows] races on __destrot_obj function

Yossi Leybovich sleybo at mellanox.co.il
Tue Jul 11 10:09:55 PDT 2006


 

> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] 
> On Behalf Of Fabian Tillier
> Sent: Tuesday, July 11, 2006 7:29 PM
> To: Yossi Leybovich
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] races on __destrot_obj function
> 
> Hi Yossi,
> 
> On 7/11/06, Yossi Leybovich <sleybo at mellanox.co.il> wrote:
> > Attached is patch that solve this problem
> >
> > It includes:
> >
> > 1. fix to cl_obj to return value if the parent port is not in 
> > CL_INITIALIZED 2. fix to insert/insert_locked function to 
> check return 
> > value 3. add ref count mechanism to help debug ref count 
> problems 4. 
> > not call port_resume in endpt cleanup but in port destroy
> >
> > I tested this patch with our test and it pass
> >
> > Pls review
> 
> Comments about the patch below.  It would be good IMO to have 
> this split up, similar to how large patches are done in Linux 
> - so each of the above enumerations (1-4) would be a sparate 
> patch.  It just makes it a bit easier to review/apply, so 
> that it's not all-or-nothing.
> 

I can separate the complib patch from the IPoIB patch .(part 1)
But it would take a lot of time to remove all the debug utils I added
from the code to create one patch then reinsert it and remove the former
changes and create another patch.

You right there few typos in the patch 
1. IPOIB_DBG_OBJ sould be 1<<13
2. there is one place where I had extra braces
3. ipoib_port_ref/deref should use atomic functions

Are you going to fix and apply or you want me to do this

Yossi 

Comments .... below

> - Fab
> 
> > Singed-off-by: Yossi Leybovich (sleybo at mellanox.co.il)
..snip
> > Index: ulp/ipoib/kernel/ipoib_adapter.c 
> > ===================================================================
> > --- ulp/ipoib/kernel/ipoib_adapter.c    (revision 1528)
> > +++ ulp/ipoib/kernel/ipoib_adapter.c    (working copy)
> > @@ -717,7 +717,7 @@
> >        if( p_adapter->state == IB_PNP_PORT_ACTIVE )
> >        {
> >                p_port = p_adapter->p_port;
> > -               cl_obj_ref( &p_port->obj );
> > +               ipoib_port_ref(p_port, 1);
> 
> Just a nit - there should be a space after the '(' and before the ')'
> in all the ipoib_port_ref/deref calls similar to how the 
> cl_obj_ref/deref calls where formatted.
> 
> Is there a place that controls what numbers are taken/free?  
> Perhaps using an enum here would allow us to use not just 
> numbers, but names that would make more sense (like the name 
> of the function?)  That way we would have a single place 
> where indexes would be used, and the array would also have an 
> upper bound.  Something similar to how the cl_perf counters 
> are defined.  Just a thought - seems that indexes would be 
> hard to keep track of.
> 

I can add that if you like
I made something quick and derty in order to find bug, we can make it
pretier.

> >
> > @@ -111,6 +112,7 @@
> >  #define IPOIB_DBG_OID  (1 << 10)
> >  #define IPOIB_DBG_IOCTL        (1 << 11)
> >  #define IPOIB_DBG_STAT (1 << 12)
> > +#define IPOIB_DBG_OBJ  (1<<31)
> 
> This is the same as CL_DBG_ERROR.  We can't use the most 
> significant bit or least significant bit, as both indicate 
> error (MSb for CL_PRINT, LSb for WPP).  I don't understand 
> what this accomplishes.

You are right its typo



> > @@ -457,7 +457,25 @@
> >        return cl_memcmp( p_key1, p_key2, sizeof(ib_gid_t) );  }
> >
> > +inline void ipoib_port_ref(ipoib_port_t * p_port, int type) {
> > +       cl_obj_ref( &p_port->obj );
> > +#if DBG
> > +       p_port->ref[type%100] --;
> 
> Why is this a decrement?  Shouldn't it be an increment?  
> Should it be atomic?

10x it should be atomic

 




More information about the ofw mailing list