[Openib-windows] A problem in ib_close_al

Sun Jul 23 11:02:11 PDT 2006

Hi Fab,
Seems like I found the reason of the stuck on shutdown.
Find attached 2 patches for problems, which I come across on during
investigating of this case.
Here are short description.
1. (a bug responsible for the stuck)
	If a send MAD times out, it sends once more, so one can get 2
responds for it.
	__process_recv_resp() stores the second respond on place of they
first one (h_send->p_resp_mad)
	and the latter gets never released !
2. there are 2 more places, that *seems* like "forget" to release
response MADs - please, check me.
3. synchronization problems - we have talked them already over. I
suggest a variant of a fix.

One more question:
Is it possible for internal send in __mad_svc_send_done to have a
response on it ?
If - yes, one needs to release it.

I'll appreciate, if you can look these patches ASAP - we need them for
the release.

> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] 
> On Behalf Of Fabian Tillier
> Sent: Wednesday, July 19, 2006 8:59 PM
> To: Leonid Keller
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] A problem in ib_close_al
> 
> Hi Leo,
> 
> On 7/18/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> > I'm still investigating the stuck on shutdown and see a lot 
> of "strange"
> > things, look like bugs.
> > Sure, I'm missing something, but ...
> 
> No, I'm sure you're not missing something.  Things definitely 
> seem amiss here.  I'll fix the things I note below - thanks 
> for finding these issues.
> 
> > Here are the questions:
> >
> > 1. mad_list in AL object:
> >   usually protected by mad_lock, but ...
> >        sometimes - by h_al->obj.lock:
> >                - __cleanup_pool_key();
> >                - __proxy_cleanup_map();
> 
> Looking at the history of the MAD tracking, it used to rely 
> solely on the AL instance's object lock, but this resulted in 
> potential deadlocks due to locks being taken in different 
> orders in various call paths.  So the a dedicated lock was 
> introduced, but all uses of the MAD list were not fixed up properly.
> 
> >        sometimes - not protected at all:
> >                - __proxy_mad_recv_cb;
> 
> You man __proxy_put_mad?  __proxy_put_mad was initially 
> created to not take the AL object's lock for code paths in 
> the proxy where the lock was already held.
>  It is identical to ib_put_mad except for the locking around the MAD
> list.    With a dedicated MAD lock, the __proxy_put_mad function
> should be eliminated and replaced with calls to ib_put_mad.
> 
> >                - __free_mads()
> 
> __free_mads is only called from free_al - at that point, all 
> references on the AL object should be gone, and that means 
> the list is empty in a release build.  If the list is not 
> empty in a debug build, then destruction of several objects 
> (including the pool key object) timed out, and this is just 
> making a best effort to clean up.  Really, the __free_mads 
> function could be eliminated.  If anyone is accessing the MAD 
> list when __free_mads is being called, they are almost 
> certain to also access freed memory and crash the system.
> 
> >
> > 2. recv_queue in SPL_QP_SVC object:
> >   usually protected by p_spl_qp_svc->obj.lock, but is not 
> protected in:
> >        - create_spl_qp_svc() (call to spl_qp_svc_post_recvs());
> 
> This shouldn't be a big deal since the object is just being 
> created and can't be referenced for use yet.  Locking however 
> would not hurt in this code path.
> 
> >        - spl_qp_svc_reset_cb()(call to spl_qp_svc_post_recvs());
> 
> This definitely needs locking.
> 
> >        - free_spl_qp_svc();
> 
> At this point, the structure is about to be freed, and all 
> references to it will have been released.  Locking here won't 
> help much if anyone is accessing it, since there would be a 
> race with freeing the memory, which would result in an access 
> violation.
> 
> > 3. send_queue in SPL_QP_SVC object:
> >   usually protected by p_spl_qp_svc->obj.lock, but is not 
> protected in:
> >        - spl_qp_svc_reset_cb();
> 
> This is the same as for the receive queue - it should be locked.
> 
> > 4. may be the same problems exist with other link lists, like
> >        - recv_queue in AL_MAD_QP object;
> >        - send_queue in AL_MAD_QP object;
> >        - et al
> 
> The MAD QP might be worth looking at, though there are no 
> users of it so it's not as critical.
> 
> > See, please, also my comments below.
> 
> <snip...>
> 
> > > You can look at the MAD header - the method should 
> indicate whether 
> > > it is a send or a receive.  The ib_get_mad function cannot set 
> > > whether the MAD will be used for a send or receive, as 
> that is not 
> > > an input parameter.  It cannot be an input parameter 
> anyway as it is 
> > > valid for a received MAD to be used to send a response without 
> > > putting the MAD back into the pool.
> >
> > Now, after SPL_QP_SVC has been destroyed, I see CI_CA object stuck 
> > with several IB_MAD_METHOD_GETTABLE_RESP MADs, which were 
> taken yet in 
> > create_spl_qp_svc() (I mark on the MAD, in what function it was 
> > taken).
> 
> Ok, these are likely MADs that where preposted from create_spl_qp_svc.
>  They should have been routed to some MAD service.  Can you 
> look at what kind of query the response is for?  This will 
> help figure out who initiated the query and didn't process 
> the response properly.
> 
> <snip...>
> 
> > > The tracking list is there only to track MADs - it is 
> there only to 
> > > allow debugging so that we know what instance of AL has what MADs 
> > > outstanding.  We can't reclaim these automatically - we need to 
> > > figure out why they aren't being returned to the pool.
> >
> > It's not just for debugging, if you use it in 
> __cleanup_pool_key() to 
> > release the pool_key.
> 
> You can't get to __cleanup_pool_key with a non-empty MAD list 
> except in cases where destruction timed out and the user 
> ignored the assertion and is continuing.
> 
> > > > Say, by returning an error from ib_get_mad() when 
> > > > (pool_key->h_al->obj.type == CL_DESTROYING) ?
> > > > I can prepare a patch if you don't mind.
> > >
> > > If AL is destroying so will the pool_key object, so 
> there's no need 
> > > to check AL.  Destruction is initiated on all child 
> objects as soon 
> > > as it is initiated on the parent.  The ib_get_mad 
> function already 
> > > checks the state of the pool, and fails if the pool is not in the 
> > > CL_INITIALIZED state.
> > >
> >
> > It's not quite so.
> > ib_deregister_ca() initiates destroy of CI_CA and 
> consequently CA, but 
> > not AL.
> > I saw calls to ib_get_mad() during CI_CA destroying. Her is 
> an example 
> > of debug print:
> >
> > ***** ib_get_mad: MAD taken during destroying. pool_key 
> > FFFFFADFCCFBE010, al_mad FFFFFADFCD1C8010, ib_mad FFFFFADFCD1C8038, 
> > pool_state 2, ci_ca_state 3
> 
> It's OK for operations to happen on an object in the 
> destroying state, but not the destroyed state.  So this 
> should be OK.  The CI_CA object doesn't complete destruction 
> until all references on it have been released.
> 
> - Fab
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: al_mad.patch
Type: application/octet-stream
Size: 2064 bytes
Desc: al_mad.patch
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20060723/e854569a/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: al_mad_sync.patch
Type: application/octet-stream
Size: 5252 bytes
Desc: al_mad_sync.patch
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20060723/e854569a/attachment-0001.obj>