[openib-general] RFC: detecting duplicate MAD requests

Hal Rosenstock halr at voltaire.com
Tue Jun 13 15:26:34 PDT 2006


On Tue, 2006-06-13 at 17:58, Sean Hefty wrote:
> >> Assuming minimal hard-coding of which methods are requests, a client would
> >> drop only about 1 MAD per method during start-up.
> >
> >Is this only the new methods which are not hard coded ? Would this
> >invoke a timeout (and hopefully retry) ?
> 
> We can hard-code existing methods to avoid this problem.  So only unknown
> methods would be affected, which would affect user-defined classes more than the
> existing classes.

I would expect vendor classes to follow the standard methods unless they
need something different.

> In most cases, I would expect the sender to time out and retry the request, which
> hopefully comes after the request table has been updated.
> 
> >> And I
> >> would argue that even if a request has been acknowledged, the sender of the
> >> request would still need to deal with the case that no response is ever
> >> generated.
> >
> >Are you referring to a request being acknowledged but the response is
> >not sent (yet) ?
> 
> Yes.
> 
> >> My current thoughts on how to handle requests are to time when each request
> >> MAD is received, and queue it.  Once the queue is full, if another request is
> >> received, it would check the MAD at the head of the queue.  If the MAD at the
> >> head was older than some selected value (say 20 seconds), it would be bumped
> >> from the queue, and the new request would be added to the tail.
> >
> >For RMPP, this time should start when the last segment is received. Is
> >that how you would envision it working ?
> 
> Correct.  Part of the motivation here is if a client cannot or will not generate
> a response for some reason, we don't want to keep the MAD hanging around
> forever.
> 
> >I'm also not sure what the right timeout value would be for this. Where
> >did 20 seconds come from ?
> 
> I just made that up.  Something like this would probably have to be adaptable,
> and would likely depend on the size of the fabric.  In most cases, I would guess
> that a timeout indicates some sort of error in the client, so I would tend
> towards a larger timeout.

Is the only downside of a larger timeout that potentially more memory
accumulates (until the timeout occurs) before it is freed ?

-- Hal

> - Sean
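
For reference, the queueing scheme quoted above can be sketched roughly as
follows. This is only an illustration, not code from the MAD layer: the
RequestTable class, the capacity parameter, and the opaque "key" used to
identify a request are all hypothetical, and the 20 second default is the
placeholder value from the thread. The behaviour when the queue is full and
the head is not yet stale is not spelled out in the discussion; the sketch
assumes the new request is dropped so that the sender times out and retries.
For RMPP, the caller would invoke receive() only once the last segment has
arrived.

```python
import time
from collections import OrderedDict


class RequestTable:
    """Bounded table of outstanding request MADs, oldest first (hypothetical)."""

    def __init__(self, capacity, max_age=20.0, clock=time.monotonic):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity      # assumed queue depth
        self.max_age = max_age        # seconds; 20 is the made-up value from the thread
        self.clock = clock            # injectable so the timeout can be tested
        self.pending = OrderedDict()  # key -> receive time, in arrival order

    def receive(self, key):
        """Return 'queued', 'duplicate', or 'dropped' for an incoming request."""
        now = self.clock()
        if key in self.pending:
            return "duplicate"        # same request is still outstanding
        if len(self.pending) >= self.capacity:
            head_key, head_time = next(iter(self.pending.items()))
            if now - head_time <= self.max_age:
                return "dropped"      # assumption: sender times out and retries
            del self.pending[head_key]  # bump the stale head of the queue
        self.pending[key] = now       # new request goes on the tail
        return "queued"

    def respond(self, key):
        """Forget a request once its response has been generated."""
        return self.pending.pop(key, None) is not None
```

With a table of this shape, a larger max_age keeps unanswered requests in the
queue longer, so the memory held is bounded by capacity rather than by the
timeout itself.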