[ewg] Re: [ofa-general] questions about OFED 1.2 IPoIB bonding

Michael S. Tsirkin mst at dev.mellanox.co.il
Wed Apr 11 00:08:48 PDT 2007


> Quoting Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
> 
>  > > Out of curiosity, why does this cause a catastrophic error?  I would
>  > > have thought a work request with a bogus bus address would generate an
>  > > affiliated error, since you know exactly what resource caused the bad
>  > > transaction.
> 
>  > It seems the bus controller noticed an illegal transaction and started
>  > aborting all transactions mastered from this misbehaving device.
> 
> I see, that's not really a true catastrophic error -- the mthca code
> will report it as one, because polling the error buffer will get
> back all 0xffffffff, but that's just because the HCA has been isolated
> from the PCI bus.

No, a read from the error buffer is not mastered by the HCA (the CPU
initiates it and the HCA is only the target), so the error buffer
actually returns real values.
What triggers a catastrophic error is that the HCA attempts to perform
a transaction of its own, such as reading the command inbox, and *that* fails.
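
To make the distinction concrete, here is a rough sketch of how a poll of
the error buffer could tell the two cases apart. This is only an
illustration, not the mthca code: the enum, the function name, and the idea
of copying the buffer into a plain array first are all assumptions made for
the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of one poll of the error buffer. */
enum catas_poll_result {
    CATAS_NONE,       /* every word is zero: nothing reported */
    CATAS_REPORTED,   /* real, non-zero contents: the HCA reported an error */
    CATAS_UNREADABLE  /* every word is 0xffffffff: reads are not reaching the HCA */
};

/*
 * Classify a snapshot of the error buffer.
 * 'buf' stands in for words already read from the device; a real driver
 * would read them through MMIO rather than from ordinary memory.
 */
static enum catas_poll_result
classify_error_buffer(const uint32_t *buf, size_t words)
{
    int all_ones = 1;
    int all_zero = 1;
    size_t i;

    for (i = 0; i < words; i++) {
        if (buf[i] != 0xffffffffu)
            all_ones = 0;
        if (buf[i] != 0)
            all_zero = 0;
    }

    if (words == 0 || all_zero)
        return CATAS_NONE;
    if (all_ones)
        return CATAS_UNREADABLE;
    return CATAS_REPORTED;
}
```

In the case discussed here the poll would land in the CATAS_REPORTED
branch, since the CPU-initiated reads still reach the device.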

-- 
MST


