[openib-general] Re: [PATCH] ib_mad: prevent duplicate outstanding MAD transactions with same TID

Sean Hefty mshefty at ichips.intel.com
Mon Feb 27 10:16:03 PST 2006


Jack Morgenstein wrote:
>>I was thinking about how we can reduce some of the inefficiencies. 
>>Currently, we only track if a request is waiting for a response or not.  We
>>can add a new state indicating that a response is in progress, which would
>>be set when the first segment of a response is received.  This would be
>>used to suppress duplicate requests.
> 
> There is still a race condition here.  The duplicate request could go out 
> while the first response segment is in transit (particularly true for 
> requests which generate huge responses!).
> 
> This fix is OK, but does not absolve the responding side from checking as 
> well.

Yes - I'm aware that there's still a race here, but I don't see a way around it 
on the send side.  The goal here is to reduce some of the inefficiencies.
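
To make the send-side idea concrete, here is a minimal sketch of the extra
state.  All names (req_state, req_entry, req_note_segment, req_should_retry)
are hypothetical and are not the existing ib_mad structures; the sketch only
shows that a retry is suppressed once the first response segment has been seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request states; not the actual ib_mad definitions. */
enum req_state {
	REQ_WAIT_RESP,    /* request sent, no response segment seen yet */
	REQ_RESP_ACTIVE   /* first response segment has been received */
};

/* Hypothetical per-request tracking entry. */
struct req_entry {
	uint64_t tid;
	enum req_state state;
};

/* Call when a response segment matching this request arrives. */
static void req_note_segment(struct req_entry *req)
{
	assert(req != NULL);
	if (req->state == REQ_WAIT_RESP)
		req->state = REQ_RESP_ACTIVE;
}

/*
 * Call when the retry timer fires.  Returns true only if no response
 * segment has been seen, so a duplicate request is not sent while a
 * response is already being received.
 */
static bool req_should_retry(const struct req_entry *req)
{
	assert(req != NULL);
	return req->state == REQ_WAIT_RESP;
}
```

This does not close the race described above: a retry can still go out while
the first response segment is in transit.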

Consider the existing ib_sa.h interface.  A request is not automatically retried 
by the MAD layer, so it will time out after one attempt.  The RMPP response will 
be reassembled, then tossed.  The user may retry the request, but will receive 
another TID when doing so.

>>On the receive side, I was considering adding an API that the user would
>>invoke to indicate that a response was being generated.  The MAD layer
>>would queue this information, and a received request would be checked
>>against this queue to determine if it were a duplicate.  When the response
>>is sent, the queued information would be removed.  I think that we may be
>>able to use such an API to support dual-sided RMPP as well.
> 
> There is still a race here -- between user indicating that a response will be
> generated, and a new request arriving.  Not serious, though since presumably
> a duplicate request will only be issued after a significant timeout (seconds), 
> and this API would be invoked immediately.
> This does demand changes in user code, which checking at the mad-send
> time does not.

Unless the ULP does all the checking, there will always be a race.  What this 
does do is shrink the window in which a duplicate request goes undetected.  We 
want to detect duplicate requests sooner to avoid as much processing as possible.

This also leaves control of request-response handling to the ULP.
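
A rough sketch of what the receive-side bookkeeping might look like follows.
Everything here is an assumption for illustration: the names (pending_table,
pending_begin, pending_is_duplicate, pending_end), the fixed-size table, and
the choice to key a request on TID plus source LID.  It is not a proposed
ib_mad API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16	/* arbitrary table size for this sketch */

/* Identifies a request; keying on TID plus source LID is an assumption. */
struct req_key {
	uint64_t tid;
	uint16_t slid;
};

/* Requests for which the user has said a response is being generated. */
struct pending_table {
	struct req_key keys[MAX_PENDING];
	bool used[MAX_PENDING];
};

/* Returns the slot holding key k, or -1 if it is not queued. */
static int pending_find(const struct pending_table *t, struct req_key k)
{
	assert(t != NULL);
	for (int i = 0; i < MAX_PENDING; i++)
		if (t->used[i] && t->keys[i].tid == k.tid &&
		    t->keys[i].slid == k.slid)
			return i;
	return -1;
}

/*
 * User calls this to indicate that a response is being generated.
 * Returns false if the request is already queued or the table is full.
 */
static bool pending_begin(struct pending_table *t, struct req_key k)
{
	if (pending_find(t, k) >= 0)
		return false;
	for (int i = 0; i < MAX_PENDING; i++) {
		if (!t->used[i]) {
			t->keys[i] = k;
			t->used[i] = true;
			return true;
		}
	}
	return false;
}

/* Receive path: true if a response to this request is already in progress. */
static bool pending_is_duplicate(const struct pending_table *t,
				 struct req_key k)
{
	return pending_find(t, k) >= 0;
}

/* Called once the response has been sent; removes the queued entry. */
static void pending_end(struct pending_table *t, struct req_key k)
{
	int i = pending_find(t, k);

	if (i >= 0)
		t->used[i] = false;
}
```

A real implementation would need locking around the table.  The window
between a request arriving and the user calling pending_begin() remains.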

> I recommend that we use the mad duplicate RMPP send patch now (since it -- or 
> something like it -- will still be needed when we do handling at the 
> requester side, and at the receive side of the responder).  This fix is 
> admittedly incomplete, since it is not as efficient as I would like (e.g., the 
> duplicate request is still processed, and is thrown out only after all the 
> processing is complete) -- but it does fix the problem.

The only problems that these patches seem to address are inefficiencies in 
processing a duplicate request and sending some extra MADs.  Is there a more 
severe problem that you can point to?

It also seems that we're unlikely to hit any of these problems with the current 
ib_sa interface.

- Sean


