[openib-general] Causes of interrupt problems?
Troy Benjegerdes
hozer at hozed.org
Sat Mar 19 14:31:33 PST 2005
On Fri, Mar 18, 2005 at 08:23:12PM -0800, Roland Dreier wrote:
> > What would cause the following?
>
> > ib_mthca: Mellanox InfiniBand HCA driver v0.06-pre (November 8, 2004)
> > ib_mthca: Initializing Mellanox Technology MT23108 InfiniHost (0000:04:00.0)
> > ib_mthca 0000:04:00.0: NOP command failed to generate interrupt, aborting.
> > ib_mthca 0000:04:00.0: BIOS or ACPI interrupt routing problem?
>
> > I've seen this on two Opteron systems, one Tyan board, one Rioworks
> > HDAMA. Is there some bios setting I should look for? Things are working
> > fine on another Rioworks HDAMA board.
>
> It seems that the fact that the HCA appears as a PCI device with a
> huge BAR behind a PCI bridge confuses some BIOS/ACPI implementations.
>
> Looking at that error message I realize it might be nice to be able to
> see what IRQ the driver is trying. If you change the line in
> mthca_main.c that prints the error to something like
>
>     mthca_err(dev, "NOP command failed to generate interrupt (IRQ %d), aborting.\n",
>               dev->mthca_flags & MTHCA_FLAG_MSI_X ?
>               dev->eq_table.eq[MTHCA_EQ_CMD].msi_x_vector :
>               dev->pdev->irq);

Can you add this, as well as a check for recent firmware and/or card
revision?

I have some cards with ancient firmware revisions that don't seem to
implement NOP. The BIOS was actually fine on this machine, and
everything was happy once I put in a card with newer firmware.
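
As a rough sketch of the kind of firmware check I mean (standalone C, not
driver code): the packed major.minor.subminor layout, the helper names
fw_pack(), fw_too_old() and fw_format(), and whatever minimum version gets
passed in are all assumptions for illustration. I don't know which firmware
revision first implements NOP, so the threshold is left as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Pack major.minor.subminor into one integer so that two versions can be
 * compared with a plain '<'. This layout is an assumption for the sketch.
 */
static uint64_t fw_pack(uint16_t major, uint16_t minor, uint16_t subminor)
{
    return ((uint64_t)major << 32) | ((uint64_t)minor << 16) | subminor;
}

/*
 * Return nonzero if fw_ver is older than min_ver. min_ver is a hypothetical
 * "first firmware known to implement NOP" value supplied by the caller.
 */
static int fw_too_old(uint64_t fw_ver, uint64_t min_ver)
{
    return fw_ver < min_ver;
}

/* Format a packed version as "major.minor.subminor" for a log message. */
static int fw_format(char *buf, size_t len, uint64_t fw_ver)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u.%u.%u",
                    (unsigned)(fw_ver >> 32),
                    (unsigned)((fw_ver >> 16) & 0xffff),
                    (unsigned)(fw_ver & 0xffff));
}
```

With something like this, the NOP failure path could suggest a firmware
upgrade when fw_too_old() is true, instead of only blaming the BIOS.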

FYI, I've now got NFS over IPoIB running, and I'm getting about 110-120
MB/sec read throughput over NFS using 'dd if=nfsfile of=/dev/null'.