[openib-general] Re: Continue to experience problems in installing Gen2 on IA-32
Weikuan Yu
yuw at cse.ohio-state.edu
Thu Aug 11 15:07:18 PDT 2005
Hi,
Thanks for your suggestions and help.
At the end of this email I have included the output from our system
with CONFIG_INFINIBAND_MTHCA_DEBUG=y enabled. Note the four additional
warning lines during device initialization. They come from the
init_port() function, because the INIT_IB firmware command returns an
error status.
We suspected that some of the INIT_IB flags or other parameters were
wrong, or did not match between our firmware and the gen2 code. So I
went ahead and experimented with some of the INIT_IB parameters. In the
end, it turned out that the following patch solves the problem on our
system.
[yuw at p3 hw]$ svn diff mthca/
Index: mthca/mthca_qp.c
===================================================================
--- mthca/mthca_qp.c (revision 2986)
+++ mthca/mthca_qp.c (working copy)
@@ -575,7 +575,7 @@
memset(&param, 0, sizeof param);
- param.enable_1x = 1;
+ param.enable_1x = 0;
param.enable_4x = 1;
param.vl_cap = dev->limits.vl_cap;
param.mtu_cap = dev->limits.mtu_cap;
This suggests that the current code tries to enable both 1x and 4x
link widths on the device, which is not compatible with the firmware
parameters we chose. In any case, the change solves our problem: the
gen2 code now runs fine with the provided test programs, e.g.,
ibv_rc_pingpong. We will be happy to provide additional information if
needed. BTW, we are using firmware 3.3.2 on Tavor cards.
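To make the effect of the one-line change concrete, here is a minimal
sketch of how two width-enable fields could be packed into a flags word
before a command is issued. The struct, the function name, and the bit
positions are all hypothetical and chosen only for illustration; they
are not the driver's or the firmware's actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the INIT_IB parameter block; only the two
 * width fields touched by the patch are modeled here. */
struct sketch_init_ib_param {
	int enable_1x;
	int enable_4x;
};

/* Illustrative bit positions only, NOT the real firmware flag layout. */
#define SKETCH_FLAG_1X (UINT32_C(1) << 0)
#define SKETCH_FLAG_4X (UINT32_C(1) << 1)

/* Pack the requested link widths into a single flags word. */
static uint32_t sketch_width_flags(const struct sketch_init_ib_param *p)
{
	uint32_t flags = 0;

	assert(p != NULL);
	if (p->enable_1x)
		flags |= SKETCH_FLAG_1X;
	if (p->enable_4x)
		flags |= SKETCH_FLAG_4X;
	return flags;
}
```

With enable_1x = 1 and enable_4x = 1 the sketch requests both widths;
with the patched values (0 and 1) only the 4x bit is set, which is the
combination that worked on our system.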
As always, your suggestions and help are greatly appreciated.
--Weikuan
+++++++++ dmesg output ++++++++++++++
ib_mthca: Mellanox InfiniBand HCA driver v0.06 (June 23, 2005)
ib_mthca: Initializing (0000:02:00.0)
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 26 (level, low) -> IRQ 185
ib_mthca 0000:02:00.0: Found bridge: (0000:01:02.0)
ib_mthca 0000:02:00.0: FW version 000300030002, max commands 64
ib_mthca 0000:02:00.0: FW size 6143 KB (start bfa00000, end bfffffff)
ib_mthca 0000:02:00.0: HCA memory size 131071 KB (start b8000000, end bfffffff)
ib_mthca 0000:02:00.0: Max QPs: 16777216, reserved QPs: 1024, entry size: 256
ib_mthca 0000:02:00.0: Max SRQs: 1024, reserved SRQs: 16, entry size: 32
ib_mthca 0000:02:00.0: Max CQs: 16777216, reserved CQs: 128, entry size: 64
ib_mthca 0000:02:00.0: Max EQs: 64, reserved EQs: 1, entry size: 64
ib_mthca 0000:02:00.0: reserved MPTs: 16, reserved MTTs: 16
ib_mthca 0000:02:00.0: Max PDs: 16777216, reserved PDs: 0, reserved UARs: 1
ib_mthca 0000:02:00.0: Max QP/MCG: 16777216, reserved MGMs: 0
ib_mthca 0000:02:00.0: Flags: 00370347
ib_mthca 0000:02:00.0: profile[ 0]--10/20 @ 0x b8000000 (size 0x 4000000)
ib_mthca 0000:02:00.0: profile[ 1]-- 0/16 @ 0x bc000000 (size 0x 1000000)
ib_mthca 0000:02:00.0: profile[ 2]-- 7/18 @ 0x bd000000 (size 0x 800000)
ib_mthca 0000:02:00.0: profile[ 3]-- 9/17 @ 0x bd800000 (size 0x 800000)
ib_mthca 0000:02:00.0: profile[ 4]-- 3/16 @ 0x be000000 (size 0x 400000)
ib_mthca 0000:02:00.0: profile[ 5]-- 4/16 @ 0x be400000 (size 0x 200000)
ib_mthca 0000:02:00.0: profile[ 6]--12/15 @ 0x be600000 (size 0x 100000)
ib_mthca 0000:02:00.0: profile[ 7]-- 8/13 @ 0x be700000 (size 0x 80000)
ib_mthca 0000:02:00.0: profile[ 8]--11/11 @ 0x be780000 (size 0x 10000)
ib_mthca 0000:02:00.0: profile[ 9]-- 2/10 @ 0x be790000 (size 0x 8000)
ib_mthca 0000:02:00.0: profile[10]-- 6/ 5 @ 0x be798000 (size 0x 800)
ib_mthca 0000:02:00.0: HCA memory: allocated 106082 KB/124928 KB (18846 KB free)
ib_mthca 0000:02:00.0: Allocated EQ 1 with 65536 entries
ib_mthca 0000:02:00.0: Allocated EQ 2 with 128 entries
ib_mthca 0000:02:00.0: Allocated EQ 3 with 128 entries
ib_mthca 0000:02:00.0: Setting mask 00000000000f43fe for eqn 2
ib_mthca 0000:02:00.0: Setting mask 0000000000000400 for eqn 3
ib_mthca 0000:02:00.0: NOP command IRQ test passed
ib_mthca 0000:02:00.0: Command 09 completed with status 03
ib_mthca 0000:02:00.0: INIT_IB returned status 03.
ib_mthca 0000:02:00.0: Command 09 completed with status 03
ib_mthca 0000:02:00.0: INIT_IB returned status 03.
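For anyone cross-checking the numbers in the log, a small sketch of the
arithmetic follows. It assumes the "FW version" value packs three
16-bit fields (major, minor, subminor), which is consistent with
000300030002 and the 3.3.2 firmware mentioned above, and that the KB
figures are (end - start) shifted right by 10. The helper names are
made up for this example and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_fw_version {
	unsigned major;
	unsigned minor;
	unsigned subminor;
};

/* Split a raw version word into three 16-bit fields (assumed layout). */
static struct sketch_fw_version sketch_decode_fw(uint64_t raw)
{
	struct sketch_fw_version v;

	v.major    = (unsigned)((raw >> 32) & 0xffff);
	v.minor    = (unsigned)((raw >> 16) & 0xffff);
	v.subminor = (unsigned)(raw & 0xffff);
	return v;
}

/* Size in KB of the range [start, end], computed as (end - start) >> 10.
 * Because end is the last byte, this rounds down by one KB. */
static uint64_t sketch_range_kb(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return (end - start) >> 10;
}
```

Decoding 0x000300030002 this way gives 3.3.2. The ranges
bfa00000..bfffffff and b8000000..bfffffff give 6143 KB and 131071 KB,
matching the "FW size" and "HCA memory size" lines.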
On Aug 11, 2005, at 3:32 PM, Dhabaleswar Panda wrote:
> Hal, Roland and James,
>
> Many thanks for your prompt replies!!
>
> We tried with the debug option. Thanks for this suggestion.
>
> It looks like one of the parameters (1X/4X) for the card is
> not being set properly on the IA-32 system, which is leading to the
> `disable' state for the card. By manually changing this parameter to
> 4X, one of the nodes is able to detect the card. We are trying this on
> other nodes. Not sure whether this is coming out because of the driver
> or the firmware in the card. We are looking into this further. One of
> my students will soon post all the details.
>
> Thanks again for all your help!!
>
> DK
>
>> Dhabaleswar> Opteron systems and carry out experiments. There is
>> Dhabaleswar> no problem. The problem is coming only for IA-32
>> Dhabaleswar> systems. Even on EM64T systems, this problem comes
>> Dhabaleswar> when operating it in IA-32 mode.
>>
>> Out of curiosity, do PCIe cards work with 32-bit kernels?
>>
>> As Hal said, please post the kernel log you get when loading drivers
>> built with CONFIG_INFINIBAND_MTHCA_DEBUG=y.
>>
>> Thanks,
>> Roland
>>