[openib-general] RFC: detecting duplicate MAD requests

Tue Jun 13 14:58:33 PDT 2006

>> Assuming minimal hard-coding of which methods are requests, a client would
>> drop only about 1 MAD per method during start-up.
>
>Is this only the new methods which are not hard coded ? Would this
>invoke a timeout (and hopefully retry) ?

We can hard-code existing methods to avoid this problem.  So only unknown
methods would be affected, which would affect user-defined classes more than the
existing classes.

In most cases, I would expect the sender to time out and retry the request, and
the retry will hopefully arrive after the request table has been updated.

>> And I
>> would argue that even if a request has been acknowledged, the sender of the
>> request would still need to deal with the case that no response is ever
>> generated.
>
>Are you referring to a request being acknowledged but the response is
>not sent (yet) ?

Yes.

>> My current thoughts on how to handle requests are to time when each request
>> MAD is received, and queue it.  Once the queue is full, if another request is
>> received, it would check the MAD at the head of the queue.  If the MAD at the
>> head was older than some selected value (say 20 seconds), it would be bumped
>> from the queue, and the new request would be added to the tail.
>
>For RMPP, this time should start when the last segment is received. Is
>that how you would envision it working ?

Correct.  Part of the motivation here is if a client cannot or will not generate
a response for some reason, we don't want to keep the MAD hanging around
forever.
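
To make the queueing idea concrete, here is a rough sketch, not a patch.  The
names (req_queue, req_queue_add), the queue depth, and the use of plain seconds
for timestamps are all placeholders I picked for illustration.  It also assumes
that a new request is simply dropped when the queue is full and the head is not
yet stale, which is not something settled in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REQ_QUEUE_LEN 4   /* hypothetical queue depth */
#define REQ_MAX_AGE   20  /* seconds; the made-up value from this thread */

struct req_entry {
	uint64_t tid;       /* transaction ID of the request MAD */
	uint64_t recv_time; /* seconds; for RMPP, when the last segment arrived */
};

struct req_queue {
	struct req_entry ent[REQ_QUEUE_LEN];
	size_t head;        /* index of the oldest entry */
	size_t count;       /* number of entries in use */
};

/*
 * Try to queue a request received at time 'now'.
 * Returns true if it was queued.  Returns false if the queue is full and
 * the oldest entry is not yet older than REQ_MAX_AGE (assumption: the new
 * request is dropped in that case).
 */
static bool req_queue_add(struct req_queue *q, uint64_t tid, uint64_t now)
{
	size_t tail;

	if (q->count == REQ_QUEUE_LEN) {
		struct req_entry *oldest = &q->ent[q->head];

		if (now - oldest->recv_time <= REQ_MAX_AGE)
			return false;

		/* Bump the stale request at the head to make room. */
		q->head = (q->head + 1) % REQ_QUEUE_LEN;
		q->count--;
	}

	/* Add the new request at the tail. */
	tail = (q->head + q->count) % REQ_QUEUE_LEN;
	q->ent[tail].tid = tid;
	q->ent[tail].recv_time = now;
	q->count++;
	return true;
}
```

A real version would also remove an entry when its response is sent, and
would take the maximum age as a tunable rather than a constant.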

>I'm also not sure what the right timeout value would be for this. Where
>did 20 seconds come from ?

I just made that up.  Something like this would probably have to be adaptable,
and would likely depend on the size of the fabric.  In most cases, I would guess
that a timeout indicates some sort of error in the client, so I would tend
towards a larger timeout.

- Sean