[ewg] Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
Michael S. Tsirkin
mst at dev.mellanox.co.il
Wed Apr 11 00:08:48 PDT 2007
> Quoting Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
>
> > > Out of curiosity, why does this cause a catastrophic error? I would
> > > have thought a work request with a bogus bus address would generate an
> > > affiliated error, since you know exactly what resource caused the bad
> > > transaction.
>
> > It seems the bus controller noticed an illegal transaction and started
> > aborting all transactions mastered from this misbehaving device.
>
> I see, that's not really a true catastrophic error -- the mthca code
> will report it as one, because polling the error buffer will get
> back all 0xffffffff, but that's just because the HCA has been isolated
> from the PCI bus.
No, a read from the error buffer is not mastered by the HCA,
so the error buffer actually returns real values.
What triggers a catastrophic error is that the HCA attempts to perform
a transaction, such as reading the command inbox, and *that* fails.
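
The two cases discussed above can be told apart from what a poll of the
error buffer returns. The following is only a rough sketch, not mthca code:
the function name, the result labels, and the idea of passing the buffer as
a list of 32-bit words are all assumptions made for illustration. It assumes
that a host read from a device that no longer responds comes back as all
0xffffffff, while a device that is still readable returns real values.

```python
ALL_ONES = 0xFFFFFFFF  # value a host read typically returns when the device does not respond


def classify_error_buffer(words):
    """Classify one poll of the error buffer (hypothetical helper).

    `words` is a list of 32-bit values read by the host from the buffer.
    """
    if not words:
        raise ValueError("error buffer must contain at least one word")
    if all(w == ALL_ONES for w in words):
        # Every read returned all ones: the host reads themselves are failing.
        return "reads-failing"
    if any(w != 0 for w in words):
        # Real, nonzero contents: the device itself reported an error.
        return "error-reported"
    # All zero: nothing has been reported.
    return "healthy"
```

With this sketch, a buffer read back as real nonzero values is classified as
"error-reported", which matches the case where host reads still work and the
failure was in a transaction the HCA tried to master.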
--
MST