[openib-general] Send Side RMPP and OpenSM GetTableResp

Hal Rosenstock halr at voltaire.com
Tue May 31 20:58:15 PDT 2005


On Tue, 2005-05-31 at 22:05, Sean Hefty wrote: 
> >> >            <--    SA GetTableResp
> >> >
> >> >                   RMPP flags 0x05 (Data, Last)
> >> >                   SegmentNumber 4
> >> >                   PayloadLength 0x34
> >> >                   TID 8
> >> >SA GetTable -->
> >> >RMPP flags 0x02 (ACK)
> >> >SegmentNumber 1
> >> >NewWindowLast 6
> >> >TID 8
> >>
> >> This segment number is off - not sure why.
> >
> >It is off in that the 3 segments just sent are not acknowledged but it
> >is legal to acknowledge what you have already received. This does not
> >violate anything.
> 
> The RMPP implementation sends an ACK under the following conditions:
> 
> * Upon completion of a received datagram.
> * If a duplicate segment is received.
> * After all segments of the current window are received
>   (including the initial window)
> 
> So, this ACK isn't violating the protocol, but I don't see which of these
> cases the ACK matches up against in the implementation.

That's (the ACK) not from OpenIB but from the Solaris 10 SA client.
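
For reference, the three ACK conditions listed above could be sketched
roughly as follows. This is only an illustration, not the OpenIB code:
the struct, its field names, and should_ack() are all made up here, and
a different receiver (such as the Solaris client) is free to ACK in
other situations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-side state; field names are illustrative only. */
struct rmpp_recv_state {
    uint32_t expected_seg;  /* next segment number expected (ES) */
    uint32_t window_last;   /* last segment of the current window */
};

/* Returns true when one of the three listed conditions calls for an ACK. */
static bool should_ack(const struct rmpp_recv_state *st,
                       uint32_t seg, bool last_flag)
{
    if (seg < st->expected_seg)
        return true;                /* duplicate segment */
    if (seg != st->expected_seg)
        return false;               /* ahead of ES: not one of the three cases */
    if (last_flag)
        return true;                /* received datagram is complete */
    return seg == st->window_last;  /* all segments of the window received */
}
```

None of these cases obviously produces an ACK for segment 1 right after
segments 2 through 4 arrive, which fits with that ACK coming from a
different implementation.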

> The code on the send side calculates the total segment number using both the
> PayloadLength and sge.length field.  If either is off, the sender side could
> probably be thrown off in its calculations.  Even if this were the case, I
> still can't see what would cause segment number 5 to be transmitted...

Perhaps there is something wrong with umad in terms of this but it's
hard to see what as it just posts the send MAD built with
ib_create_send_mad.
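
To make the concern about the length fields concrete, here is a rough
sketch of how a segment count might be derived from a length and how
two lengths could be cross-checked. The function names are
hypothetical, the per-segment size is passed in as a parameter rather
than taken from the real headers, and this is not the actual send-side
code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Number of segments needed to carry len bytes when each segment holds
 * seg_size bytes (seg_size must be nonzero). An empty message is
 * assumed to still occupy one segment.
 */
static uint32_t seg_count(uint32_t len, uint32_t seg_size)
{
    if (len == 0)
        return 1;
    return (len + seg_size - 1) / seg_size;  /* round up */
}

/* True when both length fields imply the same number of segments. */
static bool lengths_agree(uint32_t payload_len, uint32_t buf_len,
                          uint32_t seg_size)
{
    return seg_count(payload_len, seg_size) == seg_count(buf_len, seg_size);
}
```

Assuming 220 bytes per segment purely for illustration, 712 bytes gives
four segments, while a length overstated to 881 bytes or more gives
five. A mismatch of that kind between the two fields is one way an
extra segment could appear.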

> >> This segment should have been dropped by the client as an invalid segment
> >> number.
> >
> >It's not invalid, is it ? Just a repeat. Should it reset one of the RMPP
> >timers too ?

I was referring to the reACK from the client (not the retransmitted data
segment from the SA, which has the wrong segment number).

> If segment 4 had the last bit set, segment 5 is invalid.  The RMPP code
> should drop this.

Right. Is just dropping sufficient? It looks to me as though the
receiver, if the segment is not the expected one, should also send an
ACK for ES - 1 per Figure 178. [There was more to the sequence which I
omitted; I only showed up to the point where things looked like they
went wrong on the SA side.]
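
The behavior suggested above (drop the unexpected segment, then ACK
ES - 1) might look something like this. It is a sketch of one reading
of the figure, with hypothetical names throughout, and not a
description of what the current RMPP code does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of examining one incoming data segment. */
struct seg_verdict {
    bool accept;       /* segment is the expected one; keep it */
    bool send_ack;     /* segment was dropped; reACK what we have */
    uint32_t ack_seg;  /* segment number to ACK when send_ack is set */
};

/* expected_seg is ES, the next segment number the receiver wants. */
static struct seg_verdict on_data_segment(uint32_t expected_seg, uint32_t seg)
{
    struct seg_verdict v = { false, false, 0 };

    if (seg == expected_seg) {
        v.accept = true;
        return v;
    }
    /* Not the expected segment: drop it and ACK the last good one. */
    v.send_ack = true;
    v.ack_seg = expected_seg - 1;
    return v;
}
```

With ES at 5, a stray segment 3 would be dropped and answered with an
ACK for segment 4, which tells the sender how far the receiver has
actually got.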

> >I will try to get back to gathering more info on this.
> 
> Having some more info would help, but I can also try modifying grmpp to see
> if I can reproduce this.  My intention is to focus on finding a fix for the
> MAD problems at the moment, however, so I'll queue this up to look at it
> when I get back to RMPP.

OK. I'll try to get more info so this can be more focused.

-- Hal



