[openib-general] SM Bad Port Handling
Hal Rosenstock
halr at voltaire.com
Thu Apr 14 03:32:02 PDT 2005
On Thu, 2005-04-14 at 02:29, Eitan Zahavi wrote:
> The point is that the real "bad" ports are not the ones that are
> killing 100% of packets
> (since they will simply have a "DOWN" state and vanish).
>
> The real bad ports are the ones that pass < 25% (as we use retry of 4)
> of the packets that go through them.
When the SM sends a direct route MAD, it saves the port GUID (and port
number) in the madw context, so that when there is a reply or a timeout
you can easily find the port. That means you don't have to walk the
entire DR path to find the unhealthy port. It also means that the peer
port (the one from which we arrived at the bad port) is considered
unhealthy. Does this address your concern?
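
To illustrate the idea, here is a rough sketch in C of charging a timeout
to the port saved in the MAD context. It is not the actual OpenSM code:
struct mad_ctx, struct port_health, struct health_table, health_get() and
on_mad_timeout() are hypothetical names invented for this example, and the
fixed-size table is only an assumption to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED_PORTS 16

/* Hypothetical per-MAD context; NOT the real OpenSM madw layout. */
struct mad_ctx {
    uint64_t port_guid; /* port GUID recorded when the MAD was sent */
    uint8_t  port_num;  /* port number recorded alongside it */
};

/* Hypothetical health record for one (GUID, port number) pair. */
struct port_health {
    uint64_t port_guid;
    uint8_t  port_num;
    unsigned timeouts;
};

struct health_table {
    struct port_health entries[MAX_TRACKED_PORTS];
    size_t count;
};

/* Return the record for (guid, num), creating it if needed; NULL if full. */
static struct port_health *health_get(struct health_table *t,
                                      uint64_t guid, uint8_t num)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->entries[i].port_guid == guid && t->entries[i].port_num == num)
            return &t->entries[i];
    }
    if (t->count == MAX_TRACKED_PORTS)
        return NULL;
    t->entries[t->count].port_guid = guid;
    t->entries[t->count].port_num = num;
    t->entries[t->count].timeouts = 0;
    return &t->entries[t->count++];
}

/*
 * Timeout handler: charge the timeout to the port saved in the context.
 * No walk of the directed-route path is needed. Returns the updated
 * timeout count, or -1 if the table is full.
 */
static int on_mad_timeout(struct health_table *t, const struct mad_ctx *ctx)
{
    struct port_health *ph;

    assert(t != NULL && ctx != NULL);
    ph = health_get(t, ctx->port_guid, ctx->port_num);
    if (ph == NULL)
        return -1;
    ph->timeouts++;
    return (int)ph->timeouts;
}
```

A caller could compare the returned count against a threshold of its own
choosing before treating the port as unhealthy.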
-- Hal