[Openib-windows] A problem in ib_close_al
Leonid Keller
leonid at mellanox.co.il
Tue Jul 18 11:08:18 PDT 2006
I'm still investigating the stuck on shutdown and see a lot of "strange"
things, look like bugs.
Sure, I'm missing something, but ...
Here are the questions:
1. mad_list in AL object:
usually protected by mad_lock, but ...
sometimes - by h_al->obj.lock:
- __cleanup_pool_key();
- __proxy_cleanup_map();
sometimes - not protected at all:
- __proxy_mad_recv_cb;
- __free_mads()
2. recv_queue in SPL_QP_SVC object:
usually protected by p_spl_qp_svc->obj.lock, but is not protected in:
- create_spl_qp_svc() (call to spl_qp_svc_post_recvs());
- spl_qp_svc_reset_cb()(call to spl_qp_svc_post_recvs());
- free_spl_qp_svc();
3. send_queue in SPL_QP_SVC object:
usually protected by p_spl_qp_svc->obj.lock, but is not protected in:
- spl_qp_svc_reset_cb();
4. may be the same problems exist with other link lists, like
- recv_queue in AL_MAD_QP object;
- send_queue in AL_MAD_QP object;
- et al
See, please, also my comments below.
> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> On Behalf Of Fabian Tillier
> Sent: Wednesday, July 12, 2006 7:33 PM
> To: Leonid Keller
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] A problem in ib_close_al
>
> Hi Leonid,
>
> On 7/12/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> >
> > > -----Original Message-----
> > > From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com] On
> > > Behalf Of Fabian Tillier
> > > Sent: Tuesday, July 11, 2006 9:18 PM
> > >
> > > Hi Leo,
> > >
> > > On 7/10/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> > > > Hi Fab,
> > > > Our regression uses to run opensm and then kill it at
> some moment.
> > > > Sometimes opensm enters "zombi" state and there is no
> way to kill
> > > > it.
> > > > I've investigated that and found that it is stuck on
> infinite loop
> > > > in sync_destroy_obj while destroying opensm's instance of AL.
> > > > I saw ref_cnt = 13 and mad_list of AL object contained
> 13 records.
> > >
> > > The proxy code should free all outstanding MADs before calling
> > > ib_close_al - see the code path in al_dev_close
> (al_dev.c at 371). The
> > > call to __proxy_cleanup_map will free all MADs queued
> internally.
> > > Do you know what these MADs are from? Are they received
> MADs that
> > > were not delivered to the client? Or are they send MADs that are
> > > waiting to be sent?
> > >
> > I don't know so far how to reproduce that so:
> > Is there a way to tell received MADs from sent ones apart
> from adding
> > a mad_type field anywhere and setting it after every call to
> > ib_get_mad() ?
>
> You can look at the MAD header - the method should indicate
> whether it is a send or a receive. The ib_get_mad function
> cannot set whether the MAD will be used for a send or
> receive, as that is not an input parameter. It cannot be an
> input parameter anyway as it is valid for a received MAD to
> be used to send a response without putting the MAD back into the pool.
>
Now, after SPL_QP_SVC has been destroyed, I see CI_CA object stuck
with several IB_MAD_METHOD_GETTABLE_RESP MADs, which were taken
yet in create_spl_qp_svc() (I mark on the MAD, in what function it was
taken).
> > > > I think, the problem is in that, that mad_list is freed in
> > > > free_al()
> > > > which can't be called before ref_cnt = 0, which can become such
> > > > only after calling free_al().
> > >
> > > Yes, we have a chicken and egg issue here. The free_mads
> function
> > > is there to do internal cleanup if we proceed with
> destruction when
> > > the reference count doesn't hit zero. Note that this can only
> > > happen in a debug build (where destruction will
> eventually timeout
> > > and blow the object away).
> > >
> > > > I've prepared a patch which calls __free_mads() from AL
> destroying
> > > > function.
> > >
> > > We could call __free_mads from the destroying_al
> function, but this
> > > wouldn't necessarilly prevent MADs from being queued after that
> > > function is called, but before the AL object moves to the cleanup
> > > phase of destruction (after all references are released).
>
> I just realized that we cannot call __free_mads from the
> destroying_al function - it is never safe to reclaim a MAD
> that a user is using.
>
> > I'm working now on a stuck during shutdown after regression, which
> > (the
> > stuck) we experiences constantly. The function stuck is
> > ib_deregister_ca(), which stands in sync_destroy_obj(),
> waiting for CA
> > to release it, which is in turn hold by its PD object.
> > Today I revealed that the PD object is hold by several mads, taken
> > once from the pool_key of CI_CA. It reminds the problem, being
> > discussed here and I'll appreciate if you could suggest an
> idea, why
> > wouldn't these mads be released by __cleanup_pool_key().
>
> You can't pull MADs back from a user if that user is using
> it. The user *must* return the MADs. The MAD tracking list
> in IBAL is for debugging purposes only. If a MAD is in the
> tracking list it means that a client has pulled that MAD out
> of the pool. It is not technically safe to just put the MAD
> back in the pool, as the user may still be accessing it. If
> the user later tries to return the MAD, you will very likely
> get a system crash.
>
> > For now I have an idea that probably could solve both problems:
> > What if we forbid inserting of new MADs while the AL object
> is being
> > destroyed ?
>
> The tracking list is there only to track MADs - it is there
> only to allow debugging so that we know what instance of AL
> has what MADs outstanding. We can't reclaim these
> automatically - we need to figure out why they aren't being
> returned to the pool.
It's not just for debugging, if you use it
in __cleanup_pool_key() to release the pool_key.
>
> > Say, by returning an error from ib_get_mad() when
> > (pool_key->h_al->obj.type == CL_DESTROYING) ?
> > I can prepare a patch if you don't mind.
>
> If AL is destroying so will the pool_key object, so there's
> no need to check AL. Destruction is initiated on all child
> objects as soon as it is initiated on the parent. The
> ib_get_mad function already checks the state of the pool, and
> fails if the pool is not in the CL_INITIALIZED state.
>
It's not quite so.
ib_deregister_ca() initiates destroy of CI_CA and consequently CA, but
not AL.
I saw calls to ib_get_mad() during CI_CA destroying. Her is an example
of debug print:
***** ib_get_mad: MAD taken during destroying. pool_key
FFFFFADFCCFBE010, al_mad FFFFFADFCD1C8010, ib_mad FFFFFADFCD1C8038,
pool_state 2, ci_ca_state 3
> > > A MAD completion that happens just after the
> destroying_al function
> > > returns would be able to insert the MAD in the tracking
> list if the
> > > AL instance still had some other reference held.
> > >
> > > __free_mads must be called from free_al though we could
> call it from
> > > destroying_al too, but as I said that doesn't solve the problem.
> >
> > If we can prevent adding new MADs while AL object
> destroying, it will
> > be OK to call __free_mads from destroying_al.
>
> Calling __free_mads is only possible in debug builds, after
> you've ignored the assertion hit because destruction timed
> out. __free_mads is not a function that can be called to
> reclaim MADs while users are using them. What we need to do
> is find out who/why the MADs aren't being returned to their
> pool, and fix that.
>
> - Fab
>
More information about the ofw
mailing list