[ewg] Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
Michael S. Tsirkin
mst at dev.mellanox.co.il
Tue Apr 10 21:15:43 PDT 2007
> Quoting Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
>
> > > HCA catastrophic errors are either a hardware problem (either a
> > > transient condition like overheating, or a busted HCA), or a firmware
> > > bug.
> >
> > Not really: since most kernel code uses the DMA MR,
> > they can easily be triggered by, e.g., incorrect DMA API usage.
> > I've just seen this with the recent PPC bug.
>
> Out of curiosity, why does this cause a catastrophic error? I would
> have thought a work request with a bogus bus address would generate an
> affiliated error, since you know exactly what resource caused the bad
> transaction.
It seems the bus controller noticed an illegal transaction and started
aborting all transactions mastered by the misbehaving device.
--
MST