[ofa-general] Re: IPoIB CM (NOSRQ) [PATCH 1] review

Tue Sep 18 14:01:21 PDT 2007

> 
>> We compute the mask NOSRQ_INDEX_MASK based on max_rc_qp. This is used
>> to compute the wr_id through a bitwise AND. Hence we need that to be a
>> power of 2.
> 
> I'm saying that we don't need to restrict the number of QPs to a power
> of 2.  We only need to restrict it to less than 2^(number of bits that
> we want to dedicate from the wr_id to find the QP).  E.g. it's okay to
> have 4-bit or 30-bit masks, but only support 12 QPs.

OK, got it. I will decouple the mask width from max_rc_qp.
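
To make sure I have understood the suggestion, here is a minimal sketch of
the decoupling. The 12-bit width, the flag value and the helper names are
assumptions for illustration only; they are not taken from the patch. The
only constraint is that every index 0..max_rc_qp-1 fits under the mask, so
max_rc_qp itself need not be a power of 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not taken from the patch. */
#define NOSRQ_INDEX_BITS 12u                        /* wr_id bits reserved for the QP index */
#define NOSRQ_INDEX_MASK ((1ull << NOSRQ_INDEX_BITS) - 1)
#define OP_RECV_FLAG     (1ull << 30)               /* stand-in for IPOIB_CM_OP_RECV */

/* The QP limit only has to fit under the mask; any value 1..2^bits is fine. */
static int max_rc_qp_is_valid(unsigned int max_rc_qp)
{
    return max_rc_qp > 0 && max_rc_qp <= NOSRQ_INDEX_MASK + 1;
}

/* Encode a QP index into a wr_id, tagging it as a receive. */
static uint64_t make_wr_id(unsigned int qp_index)
{
    assert(qp_index <= NOSRQ_INDEX_MASK);
    return OP_RECV_FLAG | (uint64_t)qp_index;
}

/* Recover the QP index, mirroring the expression used in the patch. */
static unsigned int wr_id_to_index(uint64_t wr_id)
{
    return (unsigned int)((wr_id & ~OP_RECV_FLAG) & NOSRQ_INDEX_MASK);
}
```

With this layout a limit of 12 QPs passes the check even though the mask is
12 bits wide, which is the case you described.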

....

> 
>>>> +        ipoib_warn(priv, "cm recv completion event with wrid %lld (> %d)\n",
>>>> +                   (unsigned long long)wr_id, ipoib_recvq_size);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    index = (wc->wr_id & ~IPOIB_CM_OP_RECV) & NOSRQ_INDEX_MASK;
>>>> +
>>>> +    /* This is the only place where rx_ptr could be a NULL - could
>>>> +     * have just received a packet from a connection that has become
>>>> +     * stale and so is going away. We will simply drop the packet and
>>>> +     * let the hardware (it's IB_QPT_RC) handle the dropped packet.
>>
>>> I don't understand this comment.  How can the hardware handle a packet
>>> dropped by software?
>>
>>
>> Under the conditions described we drop the packet, and since it is an RC
>> connection, the remote side will detect a timeout and its hardware will
>> automatically initiate a retransmission until a RETRY_EXCEEDED error
>> occurs.
> 
> This still doesn't make sense to me.  An ACK was already generated by
> the local hardware.  Tossing the receive doesn't cause the remote
> hardware to resend the packet.

So, at the hardware level an ACK goes out, but we drop the packet at the
software level. Is there any way we can force the remote end to resend?
TCP should be OK. What about UDP? Do we depend on the application
at the remote end?

Would it be more appropriate to rephrase the comment along these lines:
"We will simply drop the packet and let the remote end handle the dropped packet"

> 
>>> If the completion can be for a connection that has gone away, what's to
>>> prevent a new connection from grabbing the same slot in the
>>> rx_index_table.  If this occurs, then the completion will reference the
>>> wrong connection.
>>
>>
>> It does not matter if, after a connection has gone away, a new connection
>> grabs the same slot (that is likely to happen with the linear search). If
>> the old connection comes back it will get a new slot in the
>> rx_index_table.
> 
> Yes - but a receive for the old connection will reference the rx_table
> index for the new connection.  See below:
> 
>>>> +     * In the timer_check() function below, p->jiffies is updated and
>>>> +     * hence the connection will not be stale after that.
>>>> +     */
>>>> +    rx_ptr = priv->cm.rx_index_table[index];
>>>> +    if (unlikely(!rx_ptr)) {
>>>> +        ipoib_warn(priv, "Received packet from a connection "
>>>> +               "that is going away. Hardware will handle it.\n");
>>>> +        return;
>>>> +    }
> 
> If this check can ever succeed, then it's also possible for rx_ptr to
> reference the wrong connection.  rx_table[index] should not be freed
> until all receives associated with that QP have been processed.

rx_index_table[index] is freed only in the stale task. So, that means
all receives have been processed by this time.
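
For reference, the rule you are asking for (a slot is not released, and so
cannot be reused, until every receive posted for that QP has completed) could
be sketched as below. This is only an illustration under assumptions: the
struct, the counter, the table size and the function names are hypothetical
and not from the patch, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8            /* hypothetical table size */

struct rx_conn {                /* hypothetical stand-in for per-connection RX state */
    int outstanding;            /* receives posted but not yet completed */
    int dying;                  /* set once the connection is marked stale */
};

static struct rx_conn *rx_index_table[TABLE_SIZE];

/* Linear search for a free slot, as discussed in this thread. */
static int slot_alloc(struct rx_conn *c)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        if (!rx_index_table[i]) {
            rx_index_table[i] = c;
            return i;
        }
    }
    return -1;                  /* table full */
}

/* Mark a connection stale; release the slot only if nothing is pending. */
static void slot_mark_stale(int index)
{
    struct rx_conn *c = rx_index_table[index];

    c->dying = 1;
    if (c->outstanding == 0)
        rx_index_table[index] = NULL;
}

/* Account for one completed receive; the last one releases a dying slot. */
static void slot_complete_one(int index)
{
    struct rx_conn *c = rx_index_table[index];

    assert(c && c->outstanding > 0);
    if (--c->outstanding == 0 && c->dying)
        rx_index_table[index] = NULL;
}
```

With this scheme a late completion always finds the connection it was posted
for, because a new connection cannot take the slot while receives are still
outstanding on it.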