[openib-general] Help with CONFIG_PCI_MSI in the kernel

Grant Grundler iod00d at hp.com
Tue Apr 4 16:52:32 PDT 2006


On Tue, Apr 04, 2006 at 12:07:23PM -0600, Jason Gunthorpe wrote:
> On Tue, Apr 04, 2006 at 09:45:09AM -0700, Grant Grundler wrote:
> > > MSI requires end device support and something in a bridge
> > > to transform the MSI message into an APIC message, but the kernel
> > > currently only looks for end device support.
> > 
> > APIC Message?
> > MSI is just a DMA-write from the card point of view.
> > So if PCI is working and DMA is working, MSI should work too.
> > The difference is routing of the transaction and the fact that
> > it's not targeting Host memory but some other part of the chipset.
> 
> You still need chipset support to get from the memory write to an
> interrupt message transaction on the FSB to the processor APIC.
> http://www.intel.com/design/chipsets/datashts/30146403.pdf
> Pages 19 and 165

Yes, that's what I meant with "routing of the transaction".
"interrupt message transaction" is not "transformed" - just routed.
Page 19 uses the same language.

The "FSB Interrupt Memory Space" described on page 165 is also
known as the "PIB" (Processor Interrupt Block).
See:
	drivers/parisc/iosapic.c
	http://www.intel.com/design/itanium/downloads/251350.htm
	http://en.wikipedia.org/wiki/IO-APIC


The PIB can be anywhere, and it's up to firmware to tell the OS where
it is unless the address is hardcoded in the kernel, as it is today.
Hardcoding of the PIB is one of the problems that
Mark Maule is attempting to fix since it just doesn't
work for large SGI systems.
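
To make the "MSI is just a DMA write" point concrete, here is a rough
user-space sketch of how an x86-style MSI message could be composed with
the interrupt block base passed in rather than hardcoded. The names
msi_msg, msi_compose and MSI_ADDR_BASE_X86 are made up for illustration
and are not kernel symbols. The layout assumed is the conventional x86
one (destination APIC ID in address bits 19:12, vector in data bits 7:0,
fixed delivery, edge triggered); other platforms place the block
elsewhere and encode the target differently.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: conventional x86 interrupt block base. Other platforms
 * put the block elsewhere, which is why it is only a default here. */
#define MSI_ADDR_BASE_X86 0xFEE00000u

/* Hypothetical container for the two values programmed into a device's
 * MSI capability: the address it writes to and the data it writes. */
struct msi_msg {
    uint64_t address;
    uint16_t data;
};

/* Hypothetical helper: build a fixed-delivery, edge-triggered message
 * aimed at one CPU. pib_base is a parameter so that firmware (or the
 * arch code) can supply it instead of the generic code hardcoding it. */
static struct msi_msg msi_compose(uint64_t pib_base, uint8_t dest_apic_id,
                                  uint8_t vector)
{
    struct msi_msg m;

    /* The low 20 bits of the base must be clear; they carry the target. */
    assert((pib_base & 0xFFFFFu) == 0);

    /* Destination APIC ID goes in address bits 19:12. */
    m.address = pib_base | ((uint64_t)dest_apic_id << 12);

    /* Vector in bits 7:0; delivery mode 000 (fixed) and edge trigger
     * leave the remaining data bits at zero. */
    m.data = vector;
    return m;
}
```

From the card's point of view, raising the interrupt is then nothing
more than a 32-bit write of m.data to m.address. Whether that write ends
up as an interrupt depends on the chipset routing it to the PIB.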

> I doubt all intel chipsets ever produced have this transformation.

I expect any Intel box that uses a LOCAL xAPIC _must_ support a PIB
(and therefore routing of MSI).

> I also know that a lot of embedded systems don't have host bridges that
> support this.

Ok. Could that be because embedded systems aren't x86 or PCI-e based?
Is Fedora targeting embedded systems?
(I think not)

> The list of host bridges that work is definitely smaller
> than the list that don't. :<

This thread started with x86-64 arch disabling MSI and I'm pretty
sure _all_ of those from Intel support this in HW.
It's clear one variant of x86-64 from AMD (8131) is broken and
others from Intel might be too. So for now, I'll disagree when
talking about x86-64 architecture.


> > Well, can't linux enable that block if it's present?
> > That isn't a reason to disable MSI for _all_ systems.
> 
> I have a patch that does that, but monkeying with the memory map always
> makes me nervous since you never really know what the BIOS has done.
> Intel style MSIs overlap the high memory BIOS area so there is
> a potential problem. HT MSI translation can be configured to use a
> high address, so it might be very safe to enable the translation and
> set something >4G as the base.

This fits nicely with the patches that Mark Maule (SGI) has submitted
to make the hardcoding of PIB an x86 thing. Sounds like we have
another interested party in making the MSI support less Intel
specific.

> > Sorry...I can't agree.
> > Line based interrupt routing is dependent on firmware to give
> > us the IRQ->APIC routing tables, enough info to identify CPUs (ID/EID
> > info for Intel implementations) and program IO-xAPIC entries.
> > Essentially, MSI only needs the CPU Info so MSI transactions get
> > routed correctly. Then MSI/-X entries on the devices can be
> > programmed (essentially the same way an IO-xAPIC gets programmed).
> 
> ?? That's what I was trying to say - on a system with only PCIe
> (granted, with working MSI in the devices..) there should be limited
> need for IOAPIC based routing.

ah ok. But I was commenting on the dependency on firmware. It's still
there since we don't know which devices need IRQ lines and which can
live without them. The concept of "IRQ Lines" doesn't go away
with PCI-e. PCI-e just virtualizes the concept into transactions
that a parent bridge can convert into MSI transactions.
The parent bridge in this case is essentially doing the same
thing that an IO xAPIC (or SAPIC) would do.

> > If it's "just a BIOS" issue, then can't AMD help linux turn MSI support on?
> > ie linux can bang some values into the chipset like we do for other types
> > of initialization when BIOS doesn't do it right.
> 
> Yes, simple patch attached..

Nice. Looks good to me.
I'll bounce your mail to linux-pci.
I've cc'd linux-pci on this reply.

thanks,
grant

> 
> Jason

> --- linux-2.6.15.4/drivers/pci/quirks.c	2006-02-16 12:08:59.000000000 -0700
> +++ lin/drivers/pci/quirks.c	2006-02-16 12:12:30.000000000 -0700
> @@ -1257,6 +1257,29 @@
>  }
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NCR, PCI_DEVICE_ID_NCR_53C810, fixup_rev1_53c810);
>  
> +#ifdef CONFIG_PCI_MSI
> +static void __devinit fixup_ht_msi(struct pci_dev* dev)
> +{
> +	/* Some BIOS's do not enable the hypertransport MSI mapping capability
> +	   on the chipset. This breaks MSI support.. */
> +	int pos = pci_find_capability(dev,PCI_CAP_ID_HT);
> +	while (pos != 0)
> +	{
> +		u32 cap;
> +		pci_read_config_dword(dev,pos,&cap);
> +		if (((cap >> 16) & PCI_HT_CMD_TYP) == PCI_HT_CMD_TYP_MSIM) {
> +			if ((cap & PCI_HT_MSIM_ENABLE) == 0) {
> +				printk(KERN_WARNING "BIOS BUG: HyperTransport MSI mapping not enabled for %s, enabling.\n",pci_name(dev));
> +				cap |= PCI_HT_MSIM_ENABLE;
> +				pci_write_config_dword(dev,pos,cap);
> +			}
> +			break;
> +		}
> +		pos = pci_find_next_capability(dev, pos, PCI_CAP_ID_HT);
> +	}
> +}
> +DECLARE_PCI_FIXUP_FINAL(PCI_ANY_ID, PCI_ANY_ID, fixup_ht_msi);
> +#endif
>  
>  static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f, struct pci_fixup *end)
>  {
> --- linux-2.6.15.4/include/linux/pci_regs.h	2006-02-16 12:09:05.000000000 -0700
> +++ lin/include/linux/pci_regs.h	2006-02-16 12:12:30.000000000 -0700
> @@ -196,12 +196,14 @@
>  #define  PCI_CAP_ID_MSI		0x05	/* Message Signalled Interrupts */
>  #define  PCI_CAP_ID_CHSWP	0x06	/* CompactPCI HotSwap */
>  #define  PCI_CAP_ID_PCIX	0x07	/* PCI-X */
> +#define  PCI_CAP_ID_HT          0x08    /* HyperTransport */
>  #define  PCI_CAP_ID_SHPC 	0x0C	/* PCI Standard Hot-Plug Controller */
>  #define  PCI_CAP_ID_EXP 	0x10	/* PCI Express */
>  #define  PCI_CAP_ID_MSIX	0x11	/* MSI-X */
>  #define PCI_CAP_LIST_NEXT	1	/* Next capability in the list */
>  #define PCI_CAP_FLAGS		2	/* Capability defined flags (16 bits) */
>  #define PCI_CAP_SIZEOF		4
> +#define PCI_HT_CMD_TYP          0xf800  /* Hypertransport capability type mask */
>  
>  /* Power Management Registers */
>  
> @@ -285,6 +287,10 @@
>  #define PCI_MSI_DATA_64		12	/* 16 bits of data for 64-bit devices */
>  #define PCI_MSI_MASK_BIT	16	/* Mask bits register */
>  
> +/* HyperTransport MSI Mapping registers */
> +#define PCI_HT_CMD_TYP_MSIM 0xa800      /* MSI Mapping type */
> +#define PCI_HT_MSIM_ENABLE  (1<<16)
> +
>  /* CompactPCI Hotswap Register */
>  
>  #define PCI_CHSWP_CSR		2	/* Control and Status Register */
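
For anyone who wants to exercise the logic of the quoted quirk outside
the kernel, below is a minimal user-space sketch that runs the same
check against a 256-byte array standing in for a device's config space.
The PCI_HT_* values are copied from the patch above. cfg_read32,
cfg_write32 and ht_msi_mapping_enable are hypothetical helpers, not
kernel functions. The sketch also skips the status-register check that
real code performs before trusting the capability pointer.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CAPABILITY_LIST 0x34     /* offset of first capability pointer */
#define PCI_CAP_ID_HT       0x08     /* HyperTransport */
#define PCI_HT_CMD_TYP      0xf800   /* capability type mask in the command word */
#define PCI_HT_CMD_TYP_MSIM 0xa800   /* MSI Mapping type */
#define PCI_HT_MSIM_ENABLE  (1u << 16)

/* Little-endian 32-bit read from the simulated config space. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned pos)
{
    return (uint32_t)cfg[pos] |
           ((uint32_t)cfg[pos + 1] << 8) |
           ((uint32_t)cfg[pos + 2] << 16) |
           ((uint32_t)cfg[pos + 3] << 24);
}

/* Little-endian 32-bit write to the simulated config space. */
static void cfg_write32(uint8_t *cfg, unsigned pos, uint32_t v)
{
    cfg[pos]     = (uint8_t)(v & 0xff);
    cfg[pos + 1] = (uint8_t)((v >> 8) & 0xff);
    cfg[pos + 2] = (uint8_t)((v >> 16) & 0xff);
    cfg[pos + 3] = (uint8_t)((v >> 24) & 0xff);
}

/* Walk the capability list, find the HT MSI mapping capability and set
 * its enable bit if the BIOS left it clear. Returns the capability's
 * offset, or 0 if the device has no such capability. */
static unsigned ht_msi_mapping_enable(uint8_t *cfg)
{
    int ttl = 48;  /* bound the walk in case the list is corrupt */
    unsigned pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;

    while (pos >= 0x40 && ttl-- > 0) {
        uint32_t cap = cfg_read32(cfg, pos);

        if ((cap & 0xff) == PCI_CAP_ID_HT &&
            ((cap >> 16) & PCI_HT_CMD_TYP) == PCI_HT_CMD_TYP_MSIM) {
            if ((cap & PCI_HT_MSIM_ENABLE) == 0)
                cfg_write32(cfg, pos, cap | PCI_HT_MSIM_ENABLE);
            return pos;
        }
        pos = (cap >> 8) & 0xfc;  /* next pointer lives in bits 15:8 */
    }
    return 0;
}
```

Calling it a second time on the same array is harmless: the enable bit
is already set, so nothing is written.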



