[ewg] Re: [ofa-general] questions about OFED 1.2 IPoIB bonding

Michael S. Tsirkin mst at dev.mellanox.co.il
Tue Apr 10 21:15:43 PDT 2007


> Quoting Roland Dreier <rdreier at cisco.com>:
> Subject: Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
> 
>  > > HCA catastrophic errors are either a hardware problem (either a
>  > > transient condition like overheating, or a busted HCA), or a firmware
>  > > bug.
>  > 
>  > Not really: since most kernel code uses the DMA MR,
>  > they can easily be triggered by, e.g., incorrect DMA API usage.
>  > I've just seen this with the recent PPC bug.
> 
> Out of curiosity, why does this cause a catastrophic error?  I would
> have thought a work request with a bogus bus address would generate an
> affiliated error, since you know exactly what resource caused the bad
> transaction.

It seems the bus controller noticed an illegal transaction and started
aborting all transactions mastered from this misbehaving device.

-- 
MST


