***SPAM*** Re: [ofa-general] [PATCH] IB/core: ib_post_send_mad() returns zero but doesn't generate send completion

Hal Rosenstock hal.rosenstock at gmail.com
Tue Mar 3 05:55:15 PST 2009


On Mon, Mar 2, 2009 at 7:57 PM, Ralph Campbell
<ralph.campbell at qlogic.com> wrote:
> On Sun, 2009-03-01 at 17:23 -0800, Hal Rosenstock wrote:
>> On Fri, Feb 27, 2009 at 4:45 PM, Ralph Campbell
>> <ralph.campbell at qlogic.com> wrote:
>>
>> Good catches.
>>
>> > If ib_post_send_mad() returns zero, it guarantees that there will be
>> > a callback to the send_buf->mad_agent->send_handler() so that the
>> > sender can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf
>> > will be leaked and the mad_agent reference count will never go to zero
>> > and the IB device module cannot be unloaded.
>> > The above can happen without this patch if process_mad() returns
>> > (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED).
>>
>> This looks right although I'm not sure why this wasn't seen a long time ago.
>
> mthca returns IB_MAD_RESULT_CONSUMED for TRAP repress.

Also, locally generated traps (which recent OpenSMs use for SMAs not
generating SM priority change traps).

> I verified that using libibumad to send a direct routed TRAP repress with hop=1/cnt=0
> does hang on mthca. I think opensm only sends LID routed TRAP repress
> so that would explain why this wasn't seen earlier.

AFAIK OpenSM responds with trap repress based on the style of the
incoming trap (e.g. LID or direct routed).

> For ipath, it was fairly recently that we made SubnSet(Portinfo) to
> disabled to not send a reply and we have a program that uses libibumad
> to disable ports. Without the patch, the program hangs.
>
>> > If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
>> > registered to receive the mad being sent, handle_outgoing_dr_smp()
>> > returns zero which causes a MAD packet which is at the end of the
>> > directed route to be incorrectly sent on the wire but doesn't cause
>> > a hang since the HCA generates a send completion.
>>
>> This also looks right.
>>
>> Does the packet really get out on the IB wire ? Was that with a
>> Mellanox HCA ? I don't recall ever seeing these extra packets but
>> maybe they end up swallowed by certain HCAs.
>
> Yes, a packet is sent with either mthca or ipath.

What does it look like ? Is it responded to ?

-- Hal

>> Unfortunately, I have no current way of regressing these changes. I'm
>> presuming you've done this with Mellanox HCAs and with OpenSM running.
>
> opensm runs fine with the patch applied on either mthca or ipath.
>
>> One other question is whether this should be separated into two increments.
>
> I think the changes are small enough and related enough to justify a
> single patch but I will split it into two if someone feels I should.
> - Show quoted text -
>> -- Hal
>>
>> > Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
>> >
>> > diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
>> > index dbcd285..62a99dc 100644
>> > --- a/drivers/infiniband/core/mad.c
>> > +++ b/drivers/infiniband/core/mad.c
>> > @@ -742,9 +742,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
>> >                break;
>> >        case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED:
>> >                kmem_cache_free(ib_mad_cache, mad_priv);
>> > -               kfree(local);
>> > -               ret = 1;
>> > -               goto out;
>> > +               break;
>> >        case IB_MAD_RESULT_SUCCESS:
>> >                /* Treat like an incoming receive MAD */
>> >                port_priv = ib_get_mad_port(mad_agent_priv->agent.device,
>> > @@ -755,10 +753,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
>> >                                                        &mad_priv->mad.mad);
>> >                }
>> >                if (!port_priv || !recv_mad_agent) {
>> > +                       /*
>> > +                        * No receiving agent so drop packet and
>> > +                        * generate send completion.
>> > +                        */
>> >                        kmem_cache_free(ib_mad_cache, mad_priv);
>> > -                       kfree(local);
>> > -                       ret = 0;
>> > -                       goto out;
>> > +                       break;
>> >                }
>> >                local->mad_priv = mad_priv;
>> >                local->recv_mad_agent = recv_mad_agent;
>> >
>> >
>> > _______________________________________________
>> > general mailing list
>> > general at lists.openfabrics.org
>> > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>> >
>> > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>> >
>
>



More information about the general mailing list