[ofa-general] ib_mthca Catastrophic errors
Roland Dreier
rdreier at cisco.com
Fri Jun  5 07:01:00 PDT 2009

 > kernel: ib_mthca 0000:06:00.0: Catastrophic error detected: unknown error
 > kernel: ib_mthca 0000:06:00.0:   buf[00]: ffffffff
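Looks like an error on the PCI bus -- a buffer of all ones (ffffffff)
is typically what comes back when reads from the device itself fail.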
 > kernel: ib_mthca 0000:01:00.0: Catastrophic error detected: internal parity error
 > kernel: ib_mthca 0000:01:00.0:   buf[00]: 05000000
Probably what it says it is -- a parity error inside the HCA.
Both point to a physical problem to me -- HCA not perfectly seated in
PCI slot, power supply flaky, thermal issue, something like that.
 - R.
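
For anyone who wants to triage a batch of these messages, the reasoning
above can be sketched as a small log filter. This is only a rough sketch,
not anything from the driver: the function name triage() and the hint
strings are made up for illustration, and the regular expressions assume
the log lines look exactly like the two samples quoted in this thread.

```python
import re

# PCI address as it appears in the quoted samples, e.g. 0000:06:00.0
_DEV = r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])"

# Assumption: the line format matches the samples quoted in this thread.
CATAS_RE = re.compile(
    r"ib_mthca " + _DEV + r":\s+Catastrophic error detected: (?P<kind>.+)$"
)
BUF_RE = re.compile(
    r"ib_mthca " + _DEV + r":\s+buf\[[0-9a-f]+\]: (?P<val>[0-9a-f]{8})$"
)


def triage(lines):
    """Map each device to a rough hint about its catastrophic error.

    Hypothetical helper: the hints only restate the reading given in
    this thread and are not an authoritative diagnosis.
    """
    kinds = {}  # device -> error description reported by the driver
    bufs = {}   # device -> list of dumped error buffer words

    for line in lines:
        line = line.strip()
        m = CATAS_RE.search(line)
        if m:
            kinds[m["dev"]] = m["kind"]
            continue
        m = BUF_RE.search(line)
        if m:
            bufs.setdefault(m["dev"], []).append(int(m["val"], 16))

    hints = {}
    for dev, kind in kinds.items():
        words = bufs.get(dev, [])
        if words and all(w == 0xFFFFFFFF for w in words):
            # All-ones data usually means the read itself failed.
            hints[dev] = "all-ones buffer: suspect the PCI bus or slot"
        elif "parity" in kind:
            hints[dev] = "HCA reports an internal parity error"
        else:
            hints[dev] = "unclassified: " + kind
    return hints
```

Feeding it the four quoted lines should flag 0000:06:00.0 as a likely bus
problem and 0000:01:00.0 as a parity error. Either way the hint is only a
starting point for checking seating, power and cooling.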
    
    