[openib-general] ib_mthca "NOP command failed to generate interrupt (IRQ 169), aborting."

Michael S. Tsirkin mst at mellanox.co.il
Tue Mar 14 14:09:56 PST 2006


Quoting r. Don.Albert at bull.com <Don.Albert at bull.com>:
> Subject: Re: [openib-general] ib_mthca "NOP command failed to generate interrupt (IRQ 169), aborting."
> 
> 
> Roland and Michael,
> 
> Roland Dreier <rdreier at cisco.com> wrote on 03/14/2006 10:11:26 AM:
> 
> >  > >  - Try passing the option "fw_cmd_doorbell=0" when loading ib_mthca.
> >  > >    Again, it shouldn't matter but maybe it does.
> >  >
> >  > I tried passing the option first, since it was the simplest and quickest
> >  > thing to try.  It worked.   The module now loads successfully.
> >
> > Hmm, that's interesting.  I wonder why that caused problems with the
> > NOP command interrupt test.
> >
> > Michael, the situation here is that Arbel with FW 4.6.0 (I know it's
> > old) fails with the latest mthca because the NOP test doesn't generate
> > an event.  However, disabling fw_cmd_doorbell makes it work again.
> > Do you know if this is a FW issue?
> >
> > Don, it would be nice if you could try a newer firmware to see if that
> > works without the fw_cmd_doorbell option.
> >
> >  - R.
> 
> I went looking for new firmware for this HCA. The OpenIB Wiki entry on updating the Mellanox firmware says to check the board revision by doing
> 
>     "cat /sys/class/infiniband/mthca0/board_id"
> 
> which output nothing.  An "od" on the file shows only '0000000a' hex.  

That's just a newline, isn't it? Maybe the card was burned before we came up
with the PSID thing.
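
A quick way to tell an empty board_id from a real PSID is to strip whitespace
before looking at the contents. The following is only a sketch: the helper
name read_board_id is made up for illustration, and the caller supplies the
path (for example the sysfs file quoted above).

```python
from pathlib import Path

def read_board_id(path):
    """Return the board ID string, or None if the file holds only whitespace.

    A file whose only byte is 0x0a (what "od" reported) is a bare newline,
    which means no board ID was recorded.
    """
    raw = Path(path).read_bytes()
    text = raw.decode("ascii", errors="replace").strip()
    return text or None
```

Calling read_board_id("/sys/class/infiniband/mthca0/board_id") on the card
described here would be expected to return None, given the "od" output.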

> 
> Next I tried to find the board-id with the "mstflint" tool as described in the "installation cheat sheet", but when I tried to build it under 'gen2/trunk/src/userspace/mstflint' I got a number of compile errors. Evidently this system's installation of gcc 3.4.3 is missing several 'c++' include files.  I will have to investigate that further.

Try building on another linux and moving it over: the default build
links libstdc++ in statically, so the binary is very portable.

> So I next built the "tvflash" utility under 'gen2/trunk/src/userspace/tvflash'.  I was able to use this to display some information about the board, but got an error:
> 
> [root@koa tvflash]# /usr/local/sbin/tvflash -i
> HCA #0: Found MT25208 (MT23108 mode), Lion Cub, revision A0 (firmware autoupgrade)
>   Primary image is v4.06.0000 build 3.0.0.160, with label 'HCA.LionCub.A0'
>   Secondary image is v4.05.0000 build 2.0.0.572, with label 'HCA.LionCub.A0'
> 
> 
> Error. String Tag not present (found tag 4c instead)
>   Vital Product Data
> [root@koa tvflash]#
> 
> We have another identical system with the same type HCA, except that it is running the Cisco/Topspin 3.2.0 release stack instead of OpenIB.  The "tvflash" utility on that system gives:
>
> [jatoba] (root) root> /usr/local/topspin/sbin/tvflash -i
> HCA #0: MT25208 Tavor Compat, Lion Cub, revision A0
>   Primary image is v4.7.400 build 3.2.0.67, with label 'HCA.LionCub.A0'
>   Secondary image is v4.6.0 build 3.0.0.160, with label 'HCA.LionCub.A0'
> 
>   Vital Product Data
>     Product Name: Lion cub
>     P/N: 99-00026-01
>     E/C: Rev: B00
>     S/N: TS0448F00407
>     Freq/Power: PW=10W;PCIe 8X
>     Date Code: 0448
>     Checksum: Ok
> [jatoba] (root) root>

So the primary image on that card is v4.7.400, which means that is the
firmware it is running.
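
To compare the two cards, the image versions can be pulled out of the raw
"tvflash -i" output. This is a rough sketch that assumes the output keeps the
"Primary image is v... build ..., with label '...'" layout quoted above; the
names IMAGE_RE and parse_images are invented for the example, and other
tvflash versions may print something different.

```python
import re

# Matches lines such as:
#   Primary image is v4.7.400 build 3.2.0.67, with label 'HCA.LionCub.A0'
IMAGE_RE = re.compile(
    r"^\s*(Primary|Secondary) image is v(\S+) build (\S+), with label '([^']*)'",
    re.MULTILINE,
)

def parse_images(tvflash_output):
    """Map 'Primary' / 'Secondary' to (version, build, label) tuples."""
    return {
        slot: (version, build, label)
        for slot, version, build, label in IMAGE_RE.findall(tvflash_output)
    }
```

Feeding it the output from each machine and comparing the "Primary" entries
shows at a glance that the two cards are on different firmware.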

> Lacking the Board-ID (PSID), I went to the Mellanox firmware support site and tried to determine what firmware I needed based on the information I did have (i.e. Lion Cub, PCI Express, 25208) and I think that I need to load the following firmware:
> 
> fw-25208-4_7_600-MHEL-CF128-T.bin.gz
> 
> Michael, can you confirm this, based on what I have described above?
> 
> Given the error I encountered trying to read the Vital Product Data above, will I be able to use the tvflash utility I built from the gen2 sources to successfully update the board?
> 
> Does it look like my original problem is really the firmware?
> 
> -Don Albert-
> Bull HN Information Systems

I'll try to find out tomorrow.

Meanwhile, you can read the image off the working card with flint "ri" and
then burn it back, or onto another card.
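
As a sketch of that read-then-burn sequence, the helpers below only build
the command lines and print them unless told to run. Only "ri" comes from
this message; the "-d", "-i" and "burn" arguments are the flint syntax as
generally documented, so check them against "flint -h" for your version.
The device name and image file are placeholders supplied by the caller, and
the function names are made up for the example.

```python
import subprocess

def flint_read_image_cmd(device, image_path):
    """argv for reading the flash image from `device` into `image_path`."""
    return ["flint", "-d", device, "ri", image_path]

def flint_burn_image_cmd(device, image_path):
    """argv for burning `image_path` onto `device` (assumed flint syntax)."""
    return ["flint", "-d", device, "-i", image_path, "burn"]

def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
```

Leaving dry_run at its default lets you review the exact commands before
anything touches the flash.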

-- 
Michael S. Tsirkin
Staff Engineer, Mellanox Technologies


