[ofw] [PATCH 3/7] ib/cm: export CM only interface

Wed Jan 21 11:34:59 PST 2009

>> The refcount on the QP cannot hit 0 in this call.  There are already
>> multiple references being held on the QP at this point.  There has to be
>> at least 1 or we couldn't safely dereference the pointer to start with.
>
>Sure, but then your references are interdependent, when it's a lot easier to
>maintain if they are self-standing.

It's a lot easier to maintain if the reference counting were easier to follow.
If I can dereference a pointer, then I must have a reference.  When a second
reference is taken doesn't matter as long as it's done before the first is
released.

Beyond the reference counting of CEPs, which I still don't follow, there's code
like destroy QP that actually increments the reference count, which is
completely non-intuitive.

>The blocking should be done in the IOCTL handler, not at the lowest level.
>That's my objection.  I have the same objection to all the blocking verbs, I
>think the design is crap.  You should design the lower levels of the stack to
>support both kernel and user clients.  Kernel clients cannot always block, so
>make the design flexible and put the blocking policy at the highest level
>possible.

Currently all destroy calls are blocking because of the hardware drivers.  I
don't see that changing.  What IBAL really provides is an *extremely* convoluted
way of calling 'queue work item'.  That functionality is hidden under calls
named 'destroy' with a multitude of function pointer callbacks that are
difficult to follow.

Overall, the existing WinOF stack is fragile to changes because of all sorts of
non-obvious dependencies in the code.  One of the goals of this patch is to
eliminate some of these dependencies without rewriting major portions of the CM
codebase.

>> I think you're missing that there's nothing that synchronizes access
>> to the QPs CID field during destruction.  To be more specific:
>>
>> void
>> al_destroy_cep(
>>         IN                              ib_al_handle_t h_al, IN      OUT
>>                             net32_t* const p_cid, IN
>>                  boolean_t
>> reusable )
>> {
>>     net32_t             cid = *p_cid;
>>
>> the cid was just read, but without any locks being held.  Any
>> protection that you believe you're getting by passing the QP's CID
>> into other calls doesn't exist.
>
>Yep, that's a (minor) bug alright, though no code assigns the QP's CID member
>except through the CEP calls, so you would never have a problem unless a kernel
>client behaves badly.  Specifically, the only place that the QP's CID is
>assigned from AL_INVALID_CID to a valid CID is in al_create_cep and
>al_cep_pre_rep(_ex), under the protection of the CEP lock.  The only place it
>is changed from a valid value to either AL_INVALID_CID or AL_RESERVED_CID is
>again under the lock.
>
>So while I agree with you that the design could be cleaner, I don't agree that
>it's broken in the current code, and I'll still assert that it is broken in
>your patch.

It doesn't matter if the value is assigned under lock if it isn't also read
under lock.  The qp's cid assignment can occur after being read into cid above,
leaking CEPs.

What specifically does the patch break?  What I've gathered so far are concerns
that it:

* Allows a user to connect or accept on an already connected QP.  Beyond the
user breaking their own communication, what actual harm does this cause?  Does
this leak CEPs?  I removed this check 

* Does not allow a user to connect or accept on a QP while trying to destroy it,
or it leaks CEPs.  Apparently the kernel proxy code, including the ND changes,
can behave this way.

Are leaked CEPs cleaned up when the application exits?

- Sean