[ofa-general] [PATCH] IB/core: ib_post_send_mad() returns zero but doesn't generate send completion

Hal Rosenstock hal.rosenstock at gmail.com
Sun Mar 1 17:23:04 PST 2009


On Fri, Feb 27, 2009 at 4:45 PM, Ralph Campbell
<ralph.campbell at qlogic.com> wrote:

Good catches.

> If ib_post_send_mad() returns zero, it guarantees that there will be
> a callback to the send_buf->mad_agent->send_handler() so that the
> sender can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf
> will be leaked and the mad_agent reference count will never go to zero
> and the IB device module cannot be unloaded.
> The above can happen without this patch if process_mad() returns
> (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED).

This looks right although I'm not sure why this wasn't seen a long time ago.

> If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
> registered to receive the mad being sent, handle_outgoing_dr_smp()
> returns zero which causes a MAD packet which is at the end of the
> directed route to be incorrectly sent on the wire but doesn't cause
> a hang since the HCA generates a send completion.

This also looks right.

Does the packet really get out on the IB wire ? Was that with a
Mellanox HCA ? I don't recall ever seeing these extra packets but
maybe they end up swallowed by certain HCAs.

Unfortunately, I have no current way of regressing these changes. I'm
presuming you've done this with Mellanox HCAs and with OpenSM running.

One other question is whether this should be separated into two increments.

-- Hal

> Signed-off-by: Ralph Campbell <ralph.campbell at qlogic.com>
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index dbcd285..62a99dc 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -742,9 +742,7 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
>                break;
>        case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED:
>                kmem_cache_free(ib_mad_cache, mad_priv);
> -               kfree(local);
> -               ret = 1;
> -               goto out;
> +               break;
>        case IB_MAD_RESULT_SUCCESS:
>                /* Treat like an incoming receive MAD */
>                port_priv = ib_get_mad_port(mad_agent_priv->agent.device,
> @@ -755,10 +753,12 @@ static int handle_outgoing_dr_smp(struct ib_mad_agent_private *mad_agent_priv,
>                                                        &mad_priv->mad.mad);
>                }
>                if (!port_priv || !recv_mad_agent) {
> +                       /*
> +                        * No receiving agent so drop packet and
> +                        * generate send completion.
> +                        */
>                        kmem_cache_free(ib_mad_cache, mad_priv);
> -                       kfree(local);
> -                       ret = 0;
> -                       goto out;
> +                       break;
>                }
>                local->mad_priv = mad_priv;
>                local->recv_mad_agent = recv_mad_agent;
>
>
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>



More information about the general mailing list