[openib-general] HP ZX1 and HP IB cards...

Grant Grundler iod00d at hp.com
Fri Dec 3 10:27:11 PST 2004


On Tue, Nov 23, 2004 at 02:56:24PM -0800, Grant Grundler wrote:
> So the adventure continues on a different box (rx4640).
> (I'll go back to the rx2600 and reflash/reboot the box).
> 
> With tvflash, I was able to upload the hca-cougar image I mentioned
> before successfully...at least that's what tvflash asserted.

So it turns out I had flashed the "high profile" firmware
to a "low profile" card...it didn't like that. That's why
the 3rd BAR was visible but not responding to programming.

I've updated the source tree and tried again with a high profile
card (also reflashed with Topspin firmware) and I'm still getting
the same error:
Linux iowa 2.6.10-rc2 #10 SMP Fri Dec 3 08:06:24 PST 2004 ia64 GNU/Linux
iowa:~# modprobe ib_mthca
ib_mthca: Mellanox InfiniBand HCA driver v0.06-pre (November 8, 2004)
ib_mthca: Initializing Mellanox Technology MT23108 InfiniHost (0000:41:00.0)
GSI 38 (level, low) -> CPU 1 (0x0100) vector 66
ACPI: PCI interrupt 0000:41:00.0[A] -> GSI 38 (level, low) -> IRQ 66
ib_mthca 0000:41:00.0: Unhandled event 0f(00) on eqn 3
ib_query_gid failed (-16) for mthca0 (index 12)
ib_query_port failed (-16) for mthca0
ib_mthca 0000:41:00.0: WRITE_MTT failed (-16)
ib_mad: Couldn't create ib_mad CQ
ib_mad: Couldn't open mthca0 port 1
ib_agent: Port 1 not found
ib_mad: Couldn't close mthca0 port 1 for agents
ib_mad: Port 1 not found
ib_mad: Couldn't close mthca0 port 1
ib_agent: Port 2 not found
ib_mad: Couldn't close mthca0 port 2 for agents
ib_mad: Port 2 not found
ib_mad: Couldn't close mthca0 port 2
iowa:~# lspci -vs 0000:41:00.0
lspci: -f: Invalid slot number
iowa:~# lspci -vs 41:00.0
0000:41:00.0 InfiniBand: Mellanox Technology MT23108 InfiniHost (rev a1)
	Subsystem: Mellanox Technology MT23108 InfiniHost
	Flags: 66MHz, medium devsel, IRQ 66
	Memory at 00000000b0800000 (64-bit, non-prefetchable) [size=1M]
	Memory at 00000000b0000000 (64-bit, prefetchable) [size=8M]
	Memory at 00000000a0000000 (64-bit, prefetchable) [size=256M]
	Capabilities: [40] #11 [001f]
	Capabilities: [50] Vital Product Data
	Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
	Capabilities: [70] PCI-X non-bridge device.


I'm fighting other issues right now and haven't been able to work
on this (^#%$ tulip driver). If anyone has advice on how to proceed
with debugging this, I can use it. If you need more info, just ask.
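
For reference, the -16 returned by ib_query_gid, ib_query_port and
WRITE_MTT in the log above is a negated errno value. A minimal sketch,
assuming Python is available on the box, for turning such codes into
names (describe_errno is an illustrative helper, not part of any
driver or tool):

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Map a kernel-style return code (e.g. -16) to 'NAME (message)'."""
    num = abs(code)  # kernel messages print the negated errno
    name = errno.errorcode.get(num, "UNKNOWN")
    return f"{name} ({os.strerror(num)})"


print(describe_errno(-16))
```

On Linux, errno 16 is EBUSY, so all three failures are reporting
"busy" rather than, say, an invalid parameter.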

I'm still leery that tvflash didn't work right despite
its assertion that the flash operation completed:
iowa:~# tvflash -i
open_hca(0)
flash_chip_reset()
flash_check_failsafe()
 
Error. String Tag not present (found tag 50 instead)
HCA #0: Found MT23108, Cougar, revision A1
  Primary image is valid, unknown source (sig 0x0/0x0)
  Secondary image is valid, unknown source (sig 0x0/0x0)
      
Error. String Tag not present (found tag 50 instead)
close_hca()

Note that "tvflash -i" worked fine when the original firmware
was loaded:
HCA #0: Found MT23108, Cougar, revision A1
  Primary image is valid, unknown source (sig 0x0/0x0)
  Secondary image is valid, unknown source (sig 0x0/0x0)

  Vital Product Data
    Product Name: PCI-X Dual Port InfiniBand HCA
    P/N: AB286-60001          
    E/C: A-4412
    S/N: US4417F00350            
    Freq/Power: PW=15W;PCI 66MHZ;PCI-X 133MHZ
    Date Code: N/A
    Checksum: N/A


Maybe the firmware image I have is corrupt?
grundler at iowa:~$ cksum hca-cougar-a1-250-157.bin 
2761115387 932768 hca-cougar-a1-250-157.bin

Does tvflash have some sanity checking (embedded
checksums or something) so it wouldn't use corrupted images?
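
In case someone with a known-good copy of the image wants to compare
on a machine without cksum, here is a rough sketch of the POSIX cksum
algorithm in Python. The function names are mine (hypothetical), and
this only checks the file on disk, not what tvflash actually wrote to
the flash:

```python
def _make_table():
    """Build the MSB-first CRC table for polynomial 0x04C11DB7."""
    table = []
    for i in range(256):
        crc = i << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
        table.append(crc)
    return table


_TABLE = _make_table()


def posix_cksum(data: bytes):
    """Return (crc, length) in the same form that cksum prints."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ byte]
    # POSIX cksum also feeds the length in, least significant byte first.
    n = len(data)
    while n:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ (n & 0xFF)]
        n >>= 8
    return crc ^ 0xFFFFFFFF, len(data)


def cksum_file(path: str):
    """Checksum a whole file; fine for a firmware image under 1 MB."""
    with open(path, "rb") as f:
        return posix_cksum(f.read())
```

Calling cksum_file("hca-cougar-a1-250-157.bin") on a good copy should
give the same pair as the cksum output above, (2761115387, 932768),
if the two files are identical.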

I also tried a different image from HP (that also exposes the 3rd BAR).
Got the same result as above. Since the 3rd BAR is visible
and programmable, I'll assume the firmware downloaded was good
in both cases.

thanks,
grant


