[openib-general] mthca startup problem

Roland Dreier roland at topspin.com
Tue Sep 21 11:59:59 PDT 2004


    hal> I have been running with acpi=off on my boot line (and also
    hal> disabled ACPI in the BIOS). Is that what the BIOS update
    hal> would be for ?

If you've always had acpi=off on your boot line, then removing it and
turning ACPI back on may help (I'm not sure exactly what "have been
running" means).

The kernel needs to find out how the interrupt lines are wired from
the PCI slots to the interrupt controller.  ACPI information from the
BIOS is the most modern way to get this, but BIOSes frequently have
bugs in it.  The virtual PCI bridge in the HCA often confuses the BIOS.
That's why a BIOS update may be helpful.

    hal> In any case, I have tried numerous combinations and still
    hal> can't seem to get that first interrupt from the HCA. How do
    hal> I go about debugging this ?

If you've ever had an HCA driver working on this machine, you can look
at /proc/interrupts and see what IRQ the HCA is assigned.  Then add a
printk to mthca_eq.c before request_irq (the second, non-MSIX
request_irq) and see what the value of dev->pdev->irq is.  If they're
different then that's the problem.
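
To make the /proc/interrupts side of that comparison less error-prone,
here is a minimal user-space sketch that pulls the IRQ number out of
/proc/interrupts-style text.  The function names are made up for
illustration, and the driver name you search for ("ib_mthca" below) is
an assumption; use whatever name your working driver registered.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan /proc/interrupts-style text for the first line that mentions
 * `name` and return the IRQ number at the start of that line.
 * Returns -1 if no numbered line mentions `name`.
 */
int irq_for_driver(const char *text, const char *name)
{
    const char *line = text;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        char buf[512];

        /* Copy one line into a bounded, NUL-terminated buffer. */
        if (len >= sizeof(buf))
            len = sizeof(buf) - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        if (strstr(buf, name)) {
            char *stop;
            long irq = strtol(buf, &stop, 10);

            /* Only accept lines of the form "<number>: ...". */
            if (stop != buf && *stop == ':')
                return (int)irq;
        }

        if (!end)
            break;
        line = end + 1;
    }
    return -1;
}

/* Hypothetical helper: read the live /proc/interrupts and search it. */
int irq_from_proc(const char *name)
{
    static char text[65536];
    FILE *f = fopen("/proc/interrupts", "r");
    size_t n;

    if (!f)
        return -1;
    n = fread(text, 1, sizeof(text) - 1, f);
    fclose(f);
    text[n] = '\0';
    return irq_for_driver(text, name);
}
```

Calling irq_from_proc("ib_mthca") gives the number to compare against
the printk output.  The printk itself could be as simple as the
following (the message text is only a suggestion):

    printk(KERN_INFO "mthca: pdev->irq = %u\n", dev->pdev->irq);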

If you've never had the HCA working then you can put a different PCI
card in the same slot as the HCA and see what IRQ it gets, and compare
that to the IRQ assigned to the HCA.
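
On kernels that expose an "irq" attribute for each PCI device in sysfs,
you can also read the assigned IRQ for a slot without waiting for a
driver to claim it.  A rough sketch, assuming sysfs is mounted at /sys;
the function names and the example PCI address are hypothetical:

```c
#include <stdio.h>

/* Read a single decimal IRQ number from a file; -1 on any failure. */
int read_irq_file(const char *path)
{
    FILE *f = fopen(path, "r");
    int irq = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &irq) != 1)
        irq = -1;
    fclose(f);
    return irq;
}

/*
 * Build the sysfs path for a PCI address such as "0000:02:00.0"
 * (an example address, not taken from this thread) and read its IRQ.
 */
int pci_device_irq(const char *pci_addr)
{
    char path[256];
    int n = snprintf(path, sizeof(path),
                     "/sys/bus/pci/devices/%s/irq", pci_addr);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return read_irq_file(path);
}
```

Run it once with the other card in the slot and once with the HCA, using
the address lspci reports for each, and compare the two values.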

If the wrong IRQ is being assigned to the HCA, you need to work with
your BIOS vendor and/or the Linux ACPI maintainers to fix things up.

There's also a slim chance that turning on CONFIG_PCI_MSI may help.
(However, make sure you don't enable msi or msi_x for ib_mthca, since
MSI/MSI-X doesn't work on any current Opteron systems.)

 - R.


