[openib-general] ib_mthca "NOP command failed to generate interrupt (IRQ 169), aborting."

Don.Albert at Bull.com Don.Albert at Bull.com
Tue Mar 14 13:48:50 PST 2006


Roland and Michael,

Roland Dreier <rdreier at cisco.com> wrote on 03/14/2006 10:11:26 AM:

>  > >  - Try passing the option "fw_cmd_doorbell=0" when loading ib_mthca.
>  > >    Again, it shouldn't matter but maybe it does.
>  > 
>  > I tried passing the option first, since it was the simplest and quickest
>  > thing to try.  It worked.   The module now loads successfully.
> 
> Hmm, that's interesting.  I wonder why that caused problems with the
> NOP command interrupt test.
> 
> Michael, the situation here is that Arbel with FW 4.6.0 (I know it's
> old) fails with the latest mthca because the NOP test doesn't generate
> an event.  However, disabling fw_cmd_doorbell makes it work again.
> Do you know if this is a FW issue?
> 
> Don, it would be nice if you could try a newer firmware to see if that
> works without the fw_cmd_doorbell option.
> 
>  - R.

I went looking for new firmware for this HCA. The OpenIB Wiki entry on 
updating the Mellanox firmware says to check the board revision by doing 

    "cat /sys/class/infiniband/mthca0/board_id"

which output nothing.  An "od" on the file shows that it contains only a 
single newline byte (0a hex). 
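
A file that holds nothing but a newline looks the same as "no output" under cat, so it can help to separate "attribute is blank" from "attribute is missing". The following is only a minimal sketch of that check: the helper names parse_board_id and read_board_id are made up for illustration, and the path is the one given above.

```python
from pathlib import Path
from typing import Optional

# Path taken from the message above; adjust the device name if it differs.
BOARD_ID_PATH = Path("/sys/class/infiniband/mthca0/board_id")


def parse_board_id(raw: bytes) -> Optional[str]:
    """Return the board ID text, or None if only whitespace was read."""
    text = raw.decode("ascii", errors="replace").strip()
    return text or None


def read_board_id(path: Path = BOARD_ID_PATH) -> Optional[str]:
    """Read the sysfs attribute; None means blank, missing, or unreadable."""
    try:
        return parse_board_id(path.read_bytes())
    except OSError:
        return None


if __name__ == "__main__":
    board_id = read_board_id()
    if board_id is None:
        print("board_id is blank or unreadable")
    else:
        print("board_id:", board_id)
```

A None result here matches the case described above, where the attribute exists but carries no board ID.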

Next I tried to find the board-id with the "mstflint" tool as 
described in the "installation cheat sheet", but when I tried to build it 
under 'gen2/trunk/src/userspace/mstflint' I got a number of compile 
errors. Evidently this system's installation of gcc 3.4.3 is missing 
several 'c++' include files.  I will have to investigate that further.

So I next built the "tvflash" utility under 
'gen2/trunk/src/userspace/tvflash'.  I was able to use this to display 
some information about the board, but got an error:

[root at koa tvflash]# /usr/local/sbin/tvflash -i
HCA #0: Found MT25208 (MT23108 mode), Lion Cub, revision A0 (firmware autoupgrade)
  Primary image is v4.06.0000 build 3.0.0.160, with label 'HCA.LionCub.A0'
  Secondary image is v4.05.0000 build 2.0.0.572, with label 'HCA.LionCub.A0'


Error. String Tag not present (found tag 4c instead)
  Vital Product Data
[root at koa tvflash]#

We have another identical system with the same type of HCA, except that it is 
running the Cisco/Topspin 3.2.0 release stack instead of OpenIB.  The 
"tvflash" utility on that system gives:

[jatoba] (root) root> /usr/local/topspin/sbin/tvflash -i
HCA #0: MT25208 Tavor Compat, Lion Cub, revision A0
  Primary image is v4.7.400 build 3.2.0.67, with label 'HCA.LionCub.A0'
  Secondary image is v4.6.0 build 3.0.0.160, with label 'HCA.LionCub.A0'

  Vital Product Data
    Product Name: Lion cub
    P/N: 99-00026-01
    E/C: Rev: B00
    S/N: TS0448F00407
    Freq/Power: PW=10W;PCIe 8X
    Date Code: 0448
    Checksum: Ok
[jatoba] (root) root>
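
The two tvflash builds print firmware versions with different zero padding (v4.06.0000 on one system, v4.6.0 on the other), which makes them awkward to compare by eye. A minimal sketch of normalizing them, assuming the "Primary image is v... build ..." line format shown in the two outputs above; the function name image_versions is hypothetical and is not part of tvflash.

```python
import re
from typing import Dict, Tuple

# Matches lines such as:
#   Primary image is v4.06.0000 build 3.0.0.160, with label '...'
IMAGE_RE = re.compile(
    r"^\s*(Primary|Secondary) image is v(\d+(?:\.\d+)*) build",
    re.MULTILINE,
)


def image_versions(output: str) -> Dict[str, Tuple[int, ...]]:
    """Map 'primary'/'secondary' to a version tuple of integers."""
    versions = {}
    for match in IMAGE_RE.finditer(output):
        parts = tuple(int(p) for p in match.group(2).split("."))
        versions[match.group(1).lower()] = parts
    return versions


KOA = """\
HCA #0: Found MT25208 (MT23108 mode), Lion Cub, revision A0
  Primary image is v4.06.0000 build 3.0.0.160, with label 'HCA.LionCub.A0'
  Secondary image is v4.05.0000 build 2.0.0.572, with label 'HCA.LionCub.A0'
"""

JATOBA = """\
HCA #0: MT25208 Tavor Compat, Lion Cub, revision A0
  Primary image is v4.7.400 build 3.2.0.67, with label 'HCA.LionCub.A0'
  Secondary image is v4.6.0 build 3.0.0.160, with label 'HCA.LionCub.A0'
"""

if __name__ == "__main__":
    koa, jatoba = image_versions(KOA), image_versions(JATOBA)
    print("koa primary:   ", koa["primary"])
    print("jatoba primary:", jatoba["primary"])
    print("koa is older:  ", koa["primary"] < jatoba["primary"])
```

Converting each component to an integer drops the padding, so the primary image on koa and the secondary image on jatoba compare as the same version.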

Lacking the Board-ID (PSID), I went to the Mellanox firmware support site 
and tried to determine what firmware I needed based on the information I 
did have (i.e. Lion Cub, PCI Express, 25208) and I think that I need to 
load the following firmware:

fw-25208-4_7_600-MHEL-CF128-T.bin.gz

Michael, can you confirm this, based on what I have described above?

Given the error I encountered trying to read the Vital Product Data above, 
will I be able to use the tvflash utility I built from the gen2 sources to 
successfully update the board?

Does it look like my original problem is really the firmware?

-Don Albert-
Bull HN Information Systems