[ofa-general] Re: IPOIB CM (NOSRQ) extension
Pradeep Satyanarayana
pradeeps at linux.vnet.ibm.com
Mon Jun 11 11:08:47 PDT 2007
Michael S. Tsirkin wrote:
>> Quoting Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>:
>> Subject: IPOIB CM (NOSRQ) extension
>>
>> This patch handles the corner case of running out of RC QPs. In that
>> case it switches to UD mode. This patch can be used both by NOSRQ and
>> SRQ code.
>>
>> Signed-off-by: Pradeep Satyanarayana <pradeeps at linux.vnet.ibm.com>
>
> You don't provide any way to retry going back to connected mode,
> after a failure, which is really intermittent by nature. That's pretty bad.
This node switched to datagram mode because the passive side was
under a resource crunch (no RC QPs), and the user is alerted to
this condition. So yes, we do not attempt to go back to connected
mode.
>
>> ---
>>
>> --- c/linux-2.6.22-rc3/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-06-07 11:13:55.000000000 -0400
>> +++ b/linux-2.6.22-rc3/drivers/infiniband/ulp/ipoib/ipoib_cm.c	2007-06-07 11:11:21.000000000 -0400
>> @@ -1383,6 +1383,11 @@ static int ipoib_cm_tx_handler(struct ib
>>  		break;
>>  	case IB_CM_REQ_ERROR:
>>  	case IB_CM_REJ_RECEIVED:
>> +		ipoib_warn(priv, "REJ received\n");
>> +		neigh = tx->neigh;
>> +		if (neigh)
>> +			clear_bit(IPOIB_FLAG_OPER_UP, &neigh->cm->flags);
>> +		break;
>>  	case IB_CM_TIMEWAIT_EXIT:
>>  		ipoib_dbg(priv, "CM error %d.\n", event->event);
>>  		spin_lock_irq(&priv->tx_lock);
>
> This has an effect of dropping down to datagram mode
> on errors such as CM timeout, or a reject due to stale connection.
> I think this is a wrong thing to do.
I can make this conditional upon there being no RC QPs. Will code that
up in the next patch.
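
Roughly, the check could look like the sketch below. This is a user-space
illustration only, not code from the patch: the enum names, their values and
the helper should_fall_back_to_datagram() are all hypothetical stand-ins, and
a real change would use the CM event and reject-reason definitions from the
kernel headers instead.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CM events handled in the switch above. */
enum cm_event {
	CM_REQ_ERROR,
	CM_REJ_RECEIVED,
	CM_TIMEWAIT_EXIT,
};

/* Hypothetical reject reasons; the numeric values are illustrative only. */
enum rej_reason {
	REJ_NO_QP      = 1,	/* passive side has no RC QPs left */
	REJ_TIMEOUT    = 4,	/* CM timeout */
	REJ_STALE_CONN = 10,	/* stale connection */
};

/*
 * Fall back to datagram mode only when the request was rejected because
 * the passive side ran out of RC QPs. Timeouts, stale connections and
 * other errors keep the neighbour in connected mode.
 */
static bool should_fall_back_to_datagram(enum cm_event event, int reason)
{
	return event == CM_REJ_RECEIVED && reason == REJ_NO_QP;
}
```

With a check like this, the clear_bit() on IPOIB_FLAG_OPER_UP would only run
when the helper returns true, and every other case would fall through to the
existing error handling.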
Pradeep