> HCA catastrophic errors are either a hardware problem (either a
> transient condition like overheating, or a busted HCA), or a firmware
> bug.

Not really, since most kernel code uses the DMA MR, they can easily be
triggered by e.g. incorrect DMA API usage. I've just seen this with the
recent PPC bug.

--
MST