[openib-general] CM and REP handling
Sean Hefty
mshefty at ichips.intel.com
Fri Jun 30 14:52:07 PDT 2006
Rimmer, Todd wrote:
> I would recommend implementing the state machine as defined in the spec
> for the following reasons:
Technically, I believe that this follows the state machine. After receiving a
duplicate REQ, a REP will be resent. The only difference is that there is a
delay in resending the REP.
> 1. it will be necessary to pass any future IBTA CIWG compliance tests
> for the CM
I don't believe that a compliance test would detect any issue.
> 2. I would need to think about it, but the lost REP case may not be the
> only situation where a duplicate REQ can be received.
Note that the IB CM handles duplicate REQs differently based on the current state.
> 3. depending on RTU timeout on the passive side as the only means for
> resending the REP reduces the retries attempted in a "lossy" fabric for
> REP and RTU loss (eg. if you have 8 RTU timeout retries on passive side,
> and many REPs are lost followed by many RTUs, you get a total of 8 lost
> REPs+RTUs before you give up, managing the counters separately will tend
> to allow for more retries).
The number of retries cannot exceed the maximum CM retries specified in the
REQ. If a REP is resent immediately after a duplicate REQ is received, the resend
has to be checked against this limit and counted as a sent REP. The result is
that the connection timeout actually decreases for every duplicate REQ that is
received.
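
To illustrate the accounting described above, here is a minimal Python sketch. It is not the CM code. The class, its fields, and the fixed per-REP timeout are all assumptions made for illustration. It models a passive side that has one shared retry budget (the maximum CM retries from the REQ) and compares how long it waits before giving up when some of that budget is spent on immediate resends triggered by duplicate REQs.

```python
from dataclasses import dataclass


@dataclass
class PassiveSide:
    """Hypothetical model of the passive side after it has sent a REP."""
    max_cm_retries: int   # retry limit taken from the REQ
    rep_timeout_ms: int   # assumed fixed wait for an RTU after each REP
    reps_sent: int = 1    # the initial REP has already gone out
    elapsed_ms: int = 0   # time spent waiting for an RTU so far

    def resend_rep(self) -> bool:
        # Allow the initial REP plus at most max_cm_retries resends.
        if self.reps_sent > self.max_cm_retries:
            return False
        self.reps_sent += 1
        return True

    def on_timeout(self) -> bool:
        """RTU timer expired: time passes, then a resend is attempted."""
        self.elapsed_ms += self.rep_timeout_ms
        return self.resend_rep()

    def on_duplicate_req(self) -> bool:
        """Immediate resend: no time passes, but the budget is still spent."""
        return self.resend_rep()


def time_until_give_up(max_retries: int, timeout_ms: int,
                       duplicate_reqs: int) -> int:
    """Total time waited before the passive side gives up on the RTU."""
    side = PassiveSide(max_retries, timeout_ms)
    for _ in range(duplicate_reqs):
        side.on_duplicate_req()
    while side.on_timeout():
        pass
    return side.elapsed_ms


if __name__ == "__main__":
    for dups in (0, 3, 10):
        print(dups, time_until_give_up(8, 1000, dups))
```

Under these assumptions, each duplicate REQ that triggers an immediate resend removes one timeout interval from the total wait, which is the decrease in connection timeout mentioned above. Keeping a separate counter for REQ-triggered resends would avoid that, but the combined number of REPs sent could then exceed the limit from the REQ.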
- Sean