[ofa-general] Folks is this a known problem / already fixed ?
Eli Cohen
eli at dev.mellanox.co.il
Sat May 17 11:27:44 PDT 2008
On Fri, 2008-05-16 at 14:28 -0400, Richard Frank wrote:
> We see the following failure on our ConnectX HCAs with the 1.3.1 daily
> build 20080512 on vanilla OEL5U1.
>
> They are failing to initialize with the following:
>
> mlx4_core: Mellanox ConnectX core driver v1.0 (February 28, 2008)
> mlx4_core: Initializing 0000:05:00.0
> mlx4_core 0000:05:00.0: Failed to initialize queue pair table, aborting.
> mlx4_core 0000:05:00.0: Failed to initialize queue pair table, aborting.
> mlx4_core: probe of 0000:05:00.0 failed with error -16
>
> And lspci shows:
>
> 05:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR] (rev
> a0)
> Subsystem: Mellanox Technologies MT25418 [ConnectX IB DDR]
> Flags: fast devsel, IRQ 169
> Memory at fcc00000 (64-bit, non-prefetchable) [disabled] [size=1M]
> Memory at fff000000 (64-bit, prefetchable) [disabled] [size=8M]
> Memory at fcbfe000 (64-bit, non-prefetchable) [disabled] [size=8K]
> Capabilities: [40] Power Management version 3
> Capabilities: [48] Vital Product Data
> Capabilities: [9c] MSI-X: Enable- Mask- TabSize=256
> Capabilities: [60] Express Endpoint IRQ 0
>
Can you send the lspci output for the bridge that connects the ConnectX
to the upstream PCI bus? My guess is that the bridge is blocking memory
writes to the ConnectX UAR area, so the driver fails to arm the EQ and
eventually fails to load. It could also be that the kernel failed to
configure the bridge properly. Could you try the latest kernel?
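
In case it helps with locating the bridge and checking its windows, here
is a rough sketch, not a tested diagnostic. It relies on the fact that a
PCI device's resolved sysfs path is nested under its parent bridge. The
helper names, the bridge address 0000:00:03.0 and the window values in
the comments are made up for illustration, and treating the 8M
prefetchable BAR as the UAR region is my assumption.

```python
import os

def upstream_bridge(resolved_sysfs_path):
    """Return the PCI address of the parent bridge, or None on a root bus.

    Expects an already resolved path, for example (hypothetical topology):
    /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
    """
    parent_dir = os.path.dirname(resolved_sysfs_path.rstrip("/"))
    parent = os.path.basename(parent_dir)
    # Devices on a root bus sit directly under "pciDDDD:BB", which is a
    # host bridge directory and not a bridge function address.
    return None if parent.startswith("pci") else parent

def window_covers(win_base, win_limit, bar_base, bar_size):
    """True if the BAR lies fully inside the bridge's inclusive window."""
    bar_last = bar_base + bar_size - 1
    return win_base <= bar_base and bar_last <= win_limit

# Example with the 8M prefetchable BAR from the report (0xfff000000) and a
# hypothetical bridge prefetchable window of 0xff0000000-0xfffffffff:
#   window_covers(0xff0000000, 0xfffffffff, 0xfff000000, 8 << 20)
```

On the affected machine, passing
os.path.realpath("/sys/bus/pci/devices/0000:05:00.0") to upstream_bridge()
should give the address to use with "lspci -vv -s <address>". The memory
windows printed for the bridge can then be compared against the BARs
listed above.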