[openib-general] Catastrophic error detected.

Ira Weiny weiny2 at llnl.gov
Wed Oct 18 13:13:17 PDT 2006


I got the following error while running OFED 1.1 on a modified 2.6.9 RHEL4
kernel.  Hal mentioned that a catastrophic error recovery patch might have been
submitted since then, but I can't find any mention of it in the mailing list.
If such a patch exists, I would like to try it.

Thanks,
Ira

2006-10-17 21:31:47 ib_mthca 0000:07:00.0: Catastrophic error detected: unknown error
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[00]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[01]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[02]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[03]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[04]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[05]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[06]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[07]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[08]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[09]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0a]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0b]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0c]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0d]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0e]: ffffffff
2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0f]: ffffffff
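
For anyone triaging similar logs, here is a minimal sketch (Python, not part
of the original report) that pulls the buf[] words out of log text like the
above and checks whether every word is ffffffff.  Reads from a PCI device that
is not responding normally come back as all ones, so an all-ffffffff dump may
mean the error buffer itself could not be read; that interpretation is an
assumption here, not something confirmed in this thread.  The function names
parse_catas_buf and all_ones are hypothetical, chosen for illustration only.

```python
import re

# Matches lines such as:
#   2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[0a]: ffffffff
BUF_LINE = re.compile(
    r"ib_mthca\s+(\S+):\s+buf\[([0-9a-f]{2})\]:\s+([0-9a-f]{8})"
)


def parse_catas_buf(log_text):
    """Return {pci_address: {index: word}} for every buf[] line in log_text."""
    dumps = {}
    for match in BUF_LINE.finditer(log_text):
        addr, index, word = match.groups()
        dumps.setdefault(addr, {})[int(index, 16)] = int(word, 16)
    return dumps


def all_ones(words):
    """True when the dump is non-empty and every word is 0xffffffff."""
    return bool(words) and all(w == 0xFFFFFFFF for w in words.values())


if __name__ == "__main__":
    # Sample input rebuilt from the log format shown above.
    sample = "\n".join(
        f"2006-10-17 21:31:47 ib_mthca 0000:07:00.0:   buf[{i:02x}]: ffffffff"
        for i in range(16)
    )
    for addr, words in parse_catas_buf(sample).items():
        print(addr, len(words), "words, all ones:", all_ones(words))
```

Grouping by PCI address keeps dumps from several HCAs in one log separate.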

# rhea277 /root > /sbin/lspci -vv -s 07:00.0
07:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev 20)
        Subsystem: Mellanox Technologies MT25208 InfiniHost III Ex
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin A routed to IRQ 217
        Region 0: Memory at dff00000 (64-bit, non-prefetchable) [disabled] [size=1M]
        Region 2: Memory at de800000 (64-bit, prefetchable) [disabled] [size=8M]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
        Capabilities: [90] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
                Address: 0000000000000000  Data: 0000
        Capabilities: [84] MSI-X: Enable- Mask- TabSize=32
                Vector table: BAR=0 offset=00082000
                PBA: BAR=0 offset=00082200
        Capabilities: [60] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <64ns, L1 unlimited
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x8, ASPM L0s, Port 8
                Link: Latency L0s unlimited, L1 unlimited
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x8
