[openib-general] Re: Re: [PATCH] ib_mad: prevent duplicate outstanding MAD transactions with same TID

Michael S. Tsirkin mst at mellanox.co.il
Mon Feb 27 10:39:05 PST 2006


Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> The only problem that these patches seem to address are inefficiencies 
> processing a duplicate request and sending some extra MADs.  Is there a 
> more severe problem that you can point to?

Note that once you have multiple outstanding transactions with the same
TID/GID/method, the specific RMPP transaction will be sure to fail
since ACKs will get matched to the wrong transaction.
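
To illustrate the mismatch, here is a minimal Python sketch. It is not the
ib_mad code: the Transaction class, the find_transaction helper and all the
values are hypothetical, and it assumes the receive side simply picks the
first outstanding entry whose TID/GID/method match.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    # Hypothetical stand-in for one outstanding RMPP transfer.
    tid: int
    gid: str
    method: int
    acked: list = field(default_factory=list)  # segment numbers ACKed so far


def find_transaction(outstanding, tid, gid, method):
    """Return the first outstanding transaction with a matching key.

    Assumption: matching uses only TID/GID/method, so two entries with
    the same key are indistinguishable to this lookup.
    """
    for t in outstanding:
        if (t.tid, t.gid, t.method) == (tid, gid, method):
            return t
    return None


# The request was retried, so two transfers with identical keys are outstanding.
# All values below are illustrative only.
first = Transaction(tid=0x1234, gid="fe80::1", method=0x12)
second = Transaction(tid=0x1234, gid="fe80::1", method=0x12)
outstanding = [first, second]

# An ACK that belongs to `second` arrives. It carries only the key,
# so the lookup hands it to `first` instead.
hit = find_transaction(outstanding, 0x1234, "fe80::1", 0x12)
hit.acked.append(1)
```

In this model the second transfer never sees its ACK, and the first one
receives an ACK it did not expect.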

We are actually seeing these failures on big clusters when a
diagnostic tool tries to get a list of all nodes from opensm:
the tool often seems to retry the request MAD while the SA node is
still getting round to responding to the first one.
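
As a rough sketch of the direction named in the subject line (refusing a
second outstanding transaction with the same key), the following Python model
may help. The OutstandingTable class and its start/finish methods are
hypothetical names, not part of ib_mad or opensm, and the real patch may well
work differently.

```python
class OutstandingTable:
    """Hypothetical table of outstanding transactions keyed by TID/GID/method."""

    def __init__(self):
        self._active = {}

    def start(self, tid, gid, method):
        """Register a new transaction; return False if one with the same key is outstanding."""
        key = (tid, gid, method)
        if key in self._active:
            # Duplicate (for example a retried request): do not start a second transfer.
            return False
        self._active[key] = {"segments_acked": 0}
        return True

    def finish(self, tid, gid, method):
        """Drop the transaction once it completes or times out."""
        self._active.pop((tid, gid, method), None)
```

With such a table, a retried request that arrives while the first response is
still in flight is rejected, and the same key becomes usable again once
finish() has been called.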

> It also seems that we're unlikely to hit any of these problems with the 
> current ib_sa interface.
> 
> - Sean

Possibly - we are seeing these when posting things like an all-node
port info request on top of umad. The problem appears on the opensm side.


-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


