[openib-general] RFC: detecting duplicate MAD requests
Hal Rosenstock
halr at voltaire.com
Tue Jun 13 15:26:34 PDT 2006
On Tue, 2006-06-13 at 17:58, Sean Hefty wrote:
> >> Assuming minimal hard-coding of which methods are requests, a client would
> >> drop only about 1 MAD per method during start-up.
> >
> >Does this apply only to new methods that are not hard-coded? Would it
> >invoke a timeout (and hopefully a retry)?
>
> We can hard-code existing methods to avoid this problem. So only unknown
> methods would be affected, which would affect user-defined classes more than the
> existing classes.
I would expect vendor classes to follow the standard methods unless they
need something different.
> In most cases, I would expect the sender to time out and retry the request, and
> the retry hopefully arrives after the request table has been updated.
>
> >> And I
> >> would argue that even if a request has been acknowledged, the sender of the
> >> request would still need to deal with the case that no response is ever
> >> generated.
> >
> >Are you referring to a request being acknowledged but the response is
> >not sent (yet) ?
>
> Yes.
>
> >> My current thoughts on how to handle requests are to time when each request
> >> MAD is received, and queue it. Once the queue is full, if another request is
> >> received, it would check the MAD at the head of the queue. If the MAD at the
> >> head was older than some selected value (say 20 seconds), it would be bumped
> >> from the queue, and the new request would be added to the tail.
> >
> >For RMPP, this time should start when the last segment is received. Is
> >that how you would envision it working ?
>
> Correct. Part of the motivation here is if a client cannot or will not generate
> a response for some reason, we don't want to keep the MAD hanging around
> forever.
>
> >I'm also not sure what the right timeout value would be for this. Where
> >did 20 seconds come from ?
>
> I just made that up. Something like this would probably have to be adaptable,
> and would likely depend on the size of the fabric. In most cases, I would guess
> that a timeout indicates some sort of error in the client, so I would tend
> towards a larger timeout.
Is the only downside of a larger timeout that potentially more memory
accumulates (until the timeout occurs) before it is freed ?
-- Hal
> - Sean
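
For reference, the queueing scheme Sean describes above (time each request, queue it, and when the queue is full bump the head only if it is older than some limit) could be sketched roughly as follows. This is only an illustration under assumptions, not code from the MAD layer: the class name, the method names, the string return values and the opaque `key` used to identify a request are all hypothetical. The thread also does not say what happens to a new request when the queue is full and the head is still young; the sketch assumes it is dropped so that the sender times out and retries. For RMPP, the caller would pass the time at which the last segment was received.

```python
from collections import OrderedDict


class RequestQueue:
    """Bounded table of outstanding requests, oldest first (hypothetical sketch)."""

    def __init__(self, capacity, max_age=20.0):
        self.capacity = capacity        # maximum number of tracked requests
        self.max_age = max_age          # seconds; 20 is the made-up value from the thread
        self._entries = OrderedDict()   # key -> time the request was received

    def receive(self, key, now):
        """Record a request; return 'duplicate', 'queued' or 'dropped'."""
        if key in self._entries:
            # Same request seen again while still outstanding.
            return "duplicate"
        if len(self._entries) >= self.capacity:
            head_key, head_time = next(iter(self._entries.items()))
            if now - head_time <= self.max_age:
                # Assumption: head is not old enough to bump, so drop the new request.
                return "dropped"
            # Head is older than max_age: bump it to make room.
            del self._entries[head_key]
        self._entries[key] = now        # new request goes to the tail
        return "queued"

    def respond(self, key):
        """Forget a request once the client has generated its response."""
        self._entries.pop(key, None)


q = RequestQueue(capacity=2, max_age=20.0)
print(q.receive("req-a", now=0.0))
print(q.receive("req-b", now=1.0))
print(q.receive("req-a", now=2.0))
print(q.receive("req-c", now=5.0))
print(q.receive("req-c", now=21.0))
```

In this sketch the memory held is bounded by `capacity` regardless of `max_age`; a larger `max_age` only means a stale head entry is kept longer before a new request may displace it.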