[openib-general] ip over ib throughput

Grant Grundler iod00d at hp.com
Thu Jan 13 14:29:31 PST 2005


On Thu, Jan 13, 2005 at 11:28:08AM -0800, Roland Dreier wrote:
>     Andras> # dmesg -c > /dev/null && modprobe ib_mthca
>     Andras> [ some minutes pass ]
>     Andras> # dmesg
> 
> Hmm, based on this, it seems as if the HCA's interrupt is being
> misrouted -- it looks like the driver may be failing as soon as it
> tries an interrupt-driven FW command.

That's the impression I had but wasn't sure - thanks for confirming.

>  Grant, any idea on what could be going wrong here?

Works For Me (tm). I've only tested with later firmware versions
(3.1.0 and 3.2.0), so upgrading the firmware would be my first
step.

In general, I've tested 2.6.10 quite a bit and haven't hit
any problems with line-based IRQs. Trust me, we would have heard
about it pretty quickly on the ia64-linux mailing list.

> Is the fact that vector == irq # OK?

Erm, I think that's a brainfart.

ib_mthca: Initializing Mellanox Technology MT23108 InfiniHost (0000:41:00.0)
GSI 38 (level, low) -> CPU 1 (0x0100) vector 58
ACPI: PCI interrupt 0000:41:00.0[A] -> GSI 38 (level, low) -> IRQ 58

This looks normal to me. My interpretation (I might have this wrong):
o IRQ routing from INTA to IO SAPIC is hard coded.
o IO SAPIC IRTE is programmed with GSI 38 (doesn't say which CPU)
o We call it "IRQ 58" (different IRQs, used by different interrupt
  sources, might share one GSI vector)
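
For reference, the relationship between the log lines above can be
checked mechanically. This is only a sketch in Python, not anything
from the kernel: `parse_routing` and both regular expressions are made
up for illustration, and they assume the exact message format quoted
above.

```python
import re

# The three lines quoted above, verbatim.
LOG = """\
ib_mthca: Initializing Mellanox Technology MT23108 InfiniHost (0000:41:00.0)
GSI 38 (level, low) -> CPU 1 (0x0100) vector 58
ACPI: PCI interrupt 0000:41:00.0[A] -> GSI 38 (level, low) -> IRQ 58
"""

# Hypothetical patterns; they only cover the format shown in this post.
GSI_LINE = re.compile(
    r"GSI (\d+) \(\w+, \w+\) -> CPU \d+ \(0x[0-9a-fA-F]+\) vector (\d+)$")
PCI_LINE = re.compile(
    r"ACPI: PCI interrupt ([0-9a-fA-F:.]+)\[([A-D])\] -> "
    r"GSI (\d+) \(\w+, \w+\) -> IRQ (\d+)$")


def parse_routing(log):
    """Collect GSI, vector, PCI device, INTx pin and IRQ from the log text."""
    info = {}
    for line in log.splitlines():
        line = line.strip()
        m = GSI_LINE.match(line)
        if m:
            info["gsi"] = int(m.group(1))
            info["vector"] = int(m.group(2))
            continue
        m = PCI_LINE.match(line)
        if m:
            info["device"] = m.group(1)
            info["pin"] = m.group(2)
            info["pci_gsi"] = int(m.group(3))
            info["irq"] = int(m.group(4))
    return info


routing = parse_routing(LOG)
print(routing)
```

On the log above, both lines agree on GSI 38, and the vector and the
IRQ number are both 58.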

> Then it seems there is a bug in the IOSAPIC code causing the stack
> dump when mthca tries to disable the device in its failure path.

It's not yet clear that the WARN_ON indicates a bug in the IOSAPIC code.
I'm waiting for the ACPI folks to sort that out.

> Based on all the trouble I've seen with interrupt routing on various
> systems, I'm thinking it's worth adding a test to the driver to make
> sure that it can receive interrupts...

Hrm...only if the test completes quickly on success.

TBH, I would expect IRQ lines to work on *ALL* platforms.
That is a basic part of PCI compliance.

It's clear that not all platforms support MSI, and the failure mode
may vary. So a test for MSI or MSI-X would be good.
Then the code can warn "MSI[-X] Failed - falling back to IRQ Line".
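
To make the idea concrete, here is a rough sketch of that probe-and-
fall-back logic, simulated in Python rather than written as driver
code. Everything in it is hypothetical: `interrupt_self_test`,
`choose_interrupt_mode`, the "INTx" label for the IRQ line, and the
trigger callables stand in for whatever the driver would really do to
make the HCA raise an interrupt. The one property it tries to show is
that the test returns as soon as the interrupt arrives, so the success
path stays quick and only a failure waits out the timeout.

```python
import threading


def interrupt_self_test(trigger, timeout_s=0.05):
    """Return True if an interrupt is seen within timeout_s.

    `trigger` is a hypothetical callable that asks the device to raise an
    interrupt and is handed the handler to run when it arrives.
    """
    seen = threading.Event()
    trigger(seen.set)
    # wait() returns as soon as the event is set, so success is fast;
    # only the failure case pays for the full timeout.
    return seen.wait(timeout_s)


def choose_interrupt_mode(triggers, timeout_s=0.05):
    """Try MSI-X, then MSI, then the IRQ line; return (mode, warnings)."""
    warnings = []
    for mode in ("MSI-X", "MSI"):
        if interrupt_self_test(triggers[mode], timeout_s):
            return mode, warnings
    warnings.append("MSI[-X] Failed - falling back to IRQ Line")
    if interrupt_self_test(triggers["INTx"], timeout_s):
        return "INTx", warnings
    warnings.append("no interrupt received on IRQ line")
    return None, warnings


# Simulated platforms: an interrupt is either delivered or silently lost.
def delivered(handler):
    handler()


def lost(handler):
    pass


mode, warnings = choose_interrupt_mode(
    {"MSI-X": lost, "MSI": lost, "INTx": delivered})
print(mode, warnings)
```

On a platform where MSI-X works, the first test succeeds immediately
and no warning is produced.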

grant


