[Users] Firmware upgrade appears to have broken driver
Orion Poplawski
orion at cora.nwra.com
Mon Apr 29 14:35:02 PDT 2013
Unfortunately it seems I paid too much attention to this message:
ib_mthca 0000:01:00.0: HCA FW version 4.7.400 is old (4.8.200 is current).
ib_mthca 0000:01:00.0: If you have problems, try updating your HCA FW.
So I updated my MHGA28-1TC A1's firmware with:
mstflint -d 01:00.0 -i fw-25208-4_8_200-MHGA28-1TC_A1-A3.bin -nofs burn
Now the driver fails to load with:
ib_mthca: Mellanox InfiniBand HCA driver v1.0 (April 4, 2008)
ib_mthca: Initializing 0000:01:00.0
alloc irq_desc for 18 on node -1
alloc kstat_irqs on node -1
ib_mthca 0000:01:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
ib_mthca 0000:01:00.0: setting latency timer to 64
ib_mthca 0000:01:00.0: SYS_EN command returned -22, aborting.
ib_mthca 0000:01:00.0: PCI INT A disabled
ib_mthca: probe of 0000:01:00.0 failed with error -22
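For reference, the -22 in the SYS_EN and probe lines is a negated Linux errno value, which corresponds to EINVAL (invalid argument). A minimal Python sketch for decoding such kernel return codes follows; the helper name decode_kernel_rc is purely illustrative and not part of any driver tooling:

```python
import errno
import os

def decode_kernel_rc(rc):
    """Map a negative kernel return code (e.g. -22) to its errno name and text."""
    code = abs(rc)
    # errno.errorcode maps numeric errno values to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)

name, text = decode_kernel_rc(-22)
print(name, "-", text)
```

This only names the error class; it does not say which argument of the SYS_EN command the HCA rejected.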
The firmware appears to check out fine:
# mstflint -d 01:00.0 v
Failsafe image:
Invariant /0x00000028-0x0000095f (0x000938)/ (BOOT2) - OK
Primary Image /0x00020000-0x00020107 (0x000108)/ (Pointer Sector)- OK
/0x00060028-0x000608af (0x000888)/ (BOOT2) - OK
/0x000608b0-0x0006504b (0x00479c)/ (BOOT2) - OK
/0x0006504c-0x00065f5f (0x000f14)/ (Configuration) - OK
/0x00065f60-0x00065f93 (0x000034)/ (GUID) - OK
/0x00065f94-0x0006eed7 (0x008f44)/ (DDR) - OK
/0x0006eed8-0x0007d7af (0x00e8d8)/ (DDR) - OK
/0x0007d7b0-0x0008104f (0x0038a0)/ (DDR) - OK
/0x00081050-0x00082b3f (0x001af0)/ (DDR) - OK
/0x00082b40-0x0009bc17 (0x0190d8)/ (DDR) - OK
/0x0009bc18-0x000b0387 (0x014770)/ (DDR) - OK
/0x000b0388-0x000b039b (0x000014)/ (Configuration) - OK
/0x000b039c-0x000b03df (0x000044)/ (Jump addresses) - OK
/0x000b03e0-0x000b05db (0x0001fc)/ (FW Configuration) - OK
Secondary Image /0x00040000-0x00040107 (0x000108)/ (Pointer Sector)- OK
/0x000c0028-0x000c08af (0x000888)/ (BOOT2) - OK
/0x000c08b0-0x000c504b (0x00479c)/ (BOOT2) - OK
/0x000c504c-0x000c5f5f (0x000f14)/ (Configuration) - OK
/0x000c5f60-0x000c5f93 (0x000034)/ (GUID) - OK
/0x000c5f94-0x000ceed7 (0x008f44)/ (DDR) - OK
/0x000ceed8-0x000dd7af (0x00e8d8)/ (DDR) - OK
/0x000dd7b0-0x000e104f (0x0038a0)/ (DDR) - OK
/0x000e1050-0x000e2b3f (0x001af0)/ (DDR) - OK
/0x000e2b40-0x000fbc17 (0x0190d8)/ (DDR) - OK
/0x000fbc18-0x00110387 (0x014770)/ (DDR) - OK
/0x00110388-0x0011039b (0x000014)/ (Configuration) - OK
/0x0011039c-0x001103df (0x000044)/ (Jump addresses) - OK
/0x001103e0-0x001105db (0x0001fc)/ (FW Configuration) - OK
FW image verification succeeded. Image is bootable.
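As a cross-check on the listing above, here is a rough Python sketch that parses verify output in this layout and flags any section whose status is not OK or whose stated size disagrees with its address range. The line format is assumed from the output shown here, and the names SECTION_RE and check_sections are hypothetical:

```python
import re

# Matches section lines such as:
#   /0x00060028-0x000608af (0x000888)/ (BOOT2) - OK
SECTION_RE = re.compile(
    r"/0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+) \(0x([0-9a-fA-F]+)\)/ "
    r"\(([^)]+)\)\s*- (\w+)"
)

def check_sections(verify_output):
    """Return (section_type, start_address, problem) tuples for suspicious sections."""
    problems = []
    for m in SECTION_RE.finditer(verify_output):
        start, end, size = (int(g, 16) for g in m.group(1, 2, 3))
        kind, status = m.group(4), m.group(5)
        # The address range is inclusive, so its length is end - start + 1.
        if end - start + 1 != size:
            problems.append((kind, start, "size mismatch"))
        if status != "OK":
            problems.append((kind, start, "status " + status))
    return problems

sample = """\
Invariant /0x00000028-0x0000095f (0x000938)/ (BOOT2) - OK
Primary Image /0x00020000-0x00020107 (0x000108)/ (Pointer Sector)- OK
/0x00060028-0x000608af (0x000888)/ (BOOT2) - OK
"""
print(check_sections(sample))
```

An empty result only means the listing is internally consistent; it says nothing about whether the image suits this particular board, which is the open question here.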
Any ideas what might be causing this?
--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder/CoRA Office FAX: 303-415-9702
3380 Mitchell Lane orion at nwra.com
Boulder, CO 80301 http://www.nwra.com