[openib-general] Causes of interrupt problems?

Troy Benjegerdes hozer at hozed.org
Sat Mar 19 14:31:33 PST 2005


On Fri, Mar 18, 2005 at 08:23:12PM -0800, Roland Dreier wrote:
>  > What would cause the following?
> 
>  > ib_mthca: Mellanox InfiniBand HCA driver v0.06-pre (November 8, 2004)
>  > ib_mthca: Initializing Mellanox Technology MT23108 InfiniHost (0000:04:00.0)
>  > ib_mthca 0000:04:00.0: NOP command failed to generate interrupt, aborting.
>  > ib_mthca 0000:04:00.0: BIOS or ACPI interrupt routing problem?
> 
>  > I've seen this on two Opteron systems, one Tyan board, one Rioworks
>  > HDAMA. Is there some bios setting I should look for? Things are working
>  > fine on another Rioworks HDAMA board.
> 
> It seems that the fact that the HCA appears as a PCI device with a
> huge BAR behind a PCI bridge confuses some BIOS/ACPI implementations.
> 
> Looking at that error message I realize it might be nice to be able to
> see what IRQ the driver is trying.  If you change the line in
> mthca_main.c that prints the error to something like
> 
> 		mthca_err(dev, "NOP command failed to generate interrupt (IRQ %d), aborting.\n",
> 			  dev->mthca_flags & MTHCA_FLAG_MSI_X ?
> 			  dev->eq_table.eq[MTHCA_EQ_CMD].msi_x_vector :
> 			  dev->pdev->irq);

Can you add this, as well as a check for recent firmware and/or card
revision?

I have some cards with ancient firmware revisions that don't seem to
implement NOP. The BIOS was actually fine on this machine, and
everything worked once I put in a card with newer firmware.

FYI, I've now got NFS over IPoIB running, and I'm getting about 110-120
MB/sec read throughput from NFS using 'dd if=nfsfile of=/dev/null'.


