[openib-general] Send Side RMPP and OpenSM GetTableResp
Hal Rosenstock
halr at voltaire.com
Tue May 31 20:58:15 PDT 2005
On Tue, 2005-05-31 at 22:05, Sean Hefty wrote:
> >> > <-- SA GetTableResp
> >> >
> >> > RMPP flags 0x05 (Data, Last)
> >> > SegmentNumber 4
> >> > PayloadLength 0x34
> >> > TID 8
> >> > SA GetTable -->
> >> >
> >> > RMPP flags 0x02 (ACK)
> >> > SegmentNumber 1
> >> > NewWindowLast 6
> >> > TID 8
> >>
> >> This segment number is off - not sure why.
> >
> >It is off in that the 3 segments just sent are not acknowledged but it
> >is legal to acknowledge what you have already received. This does not
> >violate anything.
>
> The RMPP implementation sends an ACK under the following conditions:
>
> * Upon completion of a received datagram.
> * If a duplicate segment is received.
> * After all segments of the current window are received
> (including the initial window)
>
> So, this ACK isn't violating the protocol, but I don't see which of these
> cases the ACK matches up against in the implementation.
That ACK is not from OpenIB but from the Solaris 10 SA client.
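
For reference, the three ACK conditions Sean lists above could be sketched
roughly as follows. This is only an illustration, not the OpenIB code: the
struct, its field names, and should_ack() are hypothetical, and timers and
error handling are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-side state; names are illustrative only. */
struct rmpp_rx_state {
    uint32_t expected_seg;  /* next segment number expected (ES) */
    uint32_t window_last;   /* last segment of the current window */
};

/* Returns true if an ACK should be sent after seeing segment `seg`. */
static bool should_ack(const struct rmpp_rx_state *rx, uint32_t seg,
                       bool is_last)
{
    if (seg < rx->expected_seg)
        return true;                /* duplicate segment */
    if (seg != rx->expected_seg)
        return false;               /* ahead of ES: none of the three cases */
    if (is_last)
        return true;                /* datagram complete */
    return seg == rx->window_last;  /* current window fully received */
}
```

Under this reading, an ACK for segment 1 sent after segment 4 has arrived
matches none of the three cases, which fits with it coming from a different
implementation.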
> The code on the send side calculates the total segment number using both the
> PayloadLength and sge.length field. If either is off, the sender side could
> probably be thrown off in its calculations. Even if this were the case, I
> still can't see what would cause segment number 5 to be transmitted...
Perhaps there is something wrong with umad in terms of this but it's
hard to see what as it just posts the send MAD built with
ib_create_send_mad.
> >> This segment should have been dropped by the client as an invalid segment
> >> number.
> >
> >It's not invalid, is it ? Just a repeat. Should it reset one of the RMPP
> >timers too ?
I was referring to the reACK from the client (not the retransmitted data
segment from the SA, which has the wrong segment number).
> If segment 4 had the last bit set, segment 5 is invalid. The RMPP code
> should drop this.
Right. Is just dropping sufficient ? It looks to me as though the
receiver, if the segment is not the expected one, should also send an ACK
for ES - 1 per Figure 178. [There was more to the sequence which I omitted; I only
showed up to the point where things looked like they went wrong on the
SA side.]
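
A rough sketch of the receive-side check discussed above, assuming my
reading is right: a segment is accepted only if it is the expected one and
does not lie beyond a segment already seen with the Last flag; anything
else is dropped and answered with an ACK for ES - 1. The struct and
function names are hypothetical, and this has not been checked line by
line against Figure 178.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-side state; names are illustrative only. */
struct rmpp_rx {
    uint32_t expected_seg;  /* ES, assumed to be >= 1 */
    uint32_t last_seg;      /* segment that carried Last, 0 if not seen */
};

/*
 * Returns true if the segment should be accepted. Otherwise the segment
 * is to be dropped and *ack_seg is set to ES - 1, the segment number to
 * put in the ACK.
 */
static bool accept_segment(const struct rmpp_rx *rx, uint32_t seg,
                           uint32_t *ack_seg)
{
    bool past_last = rx->last_seg != 0 && seg > rx->last_seg;

    if (seg == rx->expected_seg && !past_last)
        return true;

    *ack_seg = rx->expected_seg - 1;
    return false;
}
```

For the trace above, with Last seen on segment 4 and ES at 5, a segment 5
would be dropped and segment 4 ACKed again.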
> >I will try to get back to gathering more info on this.
>
> Having some more info would help, but I can also try modifying grmpp to see
> if I can reproduce this. My intention is to focus on finding a fix for the
> MAD problems at the moment, however, so I'll queue this up to look at it
> when I get back to RMPP.
OK. I'll try to get more info so this can be more focused.
-- Hal