[openib-general] [PATCH v2 1/2] iWARP Connection Manager.

Wed Jun 14 11:06:44 PDT 2006

netdev-owner at vger.kernel.org wrote:
> On Tue, 2006-06-13 at 16:46 -0500, Steve Wise wrote:
>> On Tue, 2006-06-13 at 14:36 -0700, Sean Hefty wrote:
>>>>> Er...no. It will lose this event. Depending on the event...the
>>>>> carnage varies. We'll take a look at this.
>>>>> 
>>>> 
>>>> This behavior is consistent with the Infiniband CM (see
>>>> drivers/infiniband/core/cm.c function cm_recv_handler()).  But I
>>>> think we should at least log an error because a lost event will
>>>> usually stall the rdma connection.
>>> 
>>> I believe that there's a difference here.  For the Infiniband CM, an
>>> allocation error behaves the same as if the received MAD were lost
>>> or dropped.  Since MADs are unreliable anyway, it's not so much that
>>> an IB CM event gets lost, as it doesn't ever occur.  A remote CM
>>> should retry the send, which hopefully allows the
> connection to make forward progress.
>>> 
>> 
>> hmm.  Ok.  I see.  I misunderstood the code in cm_recv_handler().
>> 
>> Tom and I have been talking about what we can do to not drop the
>> event. Stay tuned.
> 
> Here's a simple solution that solves the problem:
> 
> For any given cm_id, there are a finite (and small) number of
> outstanding CM events that can be posted.  So we just
> pre-allocate them when the cm_id is created and keep them on
> a free list hanging off of the cm_id struct.  Then the event
> handler function will pull from this free list.
> 
> The only case where there is any non-finite issue is on the
> passive listening cm_id.  Each incoming connection request
> will consume a work struct.  So based on client connects, we
> could run out of work structs.
> However, the CMA has the concept of a backlog, which is
> defined as the max number of pending unaccepted connection
> requests.  So we allocate these work structs based on that
> number (or a computation based on that number), and if we run
> out, we simply drop the incoming connection request due to
> backlog overflow (I suggest we log the drop event too).
> When a MPA connection request is dropped, the (IETF
> conforming) MPA client will eventually time out the
> connection and the consumer can retry.
> 
> Comments?
> 

If the IWCM cannot accept a Connection Request event from
the driver then *someone* should generate a non-peer reject
MPA Response frame. Since the IWCM does not have the resources
to relay the event, it probably does not have the resources
to generate the MPA Response frame either. So simply returning
an "I'm Busy" error and expecting the driver to handle it
makes sense to me.