[ofa-general] ib_mthca Catastrophic errors
Roland Dreier
rdreier at cisco.com
Fri Jun 5 07:01:00 PDT 2009
> kernel: ib_mthca 0000:06:00.0: Catastrophic error detected: unknown error
> kernel: ib_mthca 0000:06:00.0: buf[00]: ffffffff
Looks like an error on the PCI bus: a buffer that reads back as all
ones (ffffffff) typically means the read from the device failed.
> kernel: ib_mthca 0000:01:00.0: Catastrophic error detected: internal parity error
> kernel: ib_mthca 0000:01:00.0: buf[00]: 05000000
Probably what it says it is -- a parity error inside the HCA.
Both point to a physical problem to me -- HCA not perfectly seated in
PCI slot, power supply flaky, thermal issue, something like that.
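
For anyone wanting to triage such logs across many nodes, here is a
rough sketch of the reasoning above in Python. The function name
triage() and the hint wording are made up for illustration, and the
regexes assume only the message format shown in the quoted lines. The
"all ones means a failed PCI read" rule is a heuristic, not a
diagnosis.

```python
import re

# Patterns assume the log format quoted in this thread, e.g.
#   kernel: ib_mthca 0000:06:00.0: Catastrophic error detected: unknown error
#   kernel: ib_mthca 0000:06:00.0: buf[00]: ffffffff
ERR_RE = re.compile(r"ib_mthca (\S+): Catastrophic error detected: (.+)")
BUF_RE = re.compile(r"ib_mthca (\S+): buf\[([0-9a-f]+)\]: ([0-9a-f]{8})")


def triage(lines):
    """Hypothetical helper: map each PCI address to a list of hints."""
    hints = {}
    for line in lines:
        m = ERR_RE.search(line)
        if m:
            dev, reason = m.group(1), m.group(2).strip()
            hints.setdefault(dev, []).append("reported: " + reason)
            continue
        m = BUF_RE.search(line)
        if m and m.group(2) == "00" and m.group(3) == "ffffffff":
            # Heuristic (assumption): all ones suggests the read from
            # the device failed, i.e. a PCI bus problem.
            hints.setdefault(m.group(1), []).append(
                "buf[00] is all ones, consistent with a failed PCI read"
            )
    return hints


if __name__ == "__main__":
    sample = [
        "kernel: ib_mthca 0000:06:00.0: Catastrophic error detected: unknown error",
        "kernel: ib_mthca 0000:06:00.0: buf[00]: ffffffff",
        "kernel: ib_mthca 0000:01:00.0: Catastrophic error detected: internal parity error",
        "kernel: ib_mthca 0000:01:00.0: buf[00]: 05000000",
    ]
    for dev, notes in triage(sample).items():
        print(dev, "->", "; ".join(notes))
```

Either kind of hint still calls for checking the hardware (seating,
power, cooling); the script only sorts the reports.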
- R.