From mrobbert at mines.edu Wed Apr 14 11:42:01 2021
From: mrobbert at mines.edu (Michael Robbert)
Date: Wed, 14 Apr 2021 18:42:01 +0000
Subject: [Users] Troubleshoot low LinkSpeed
Message-ID:

I just upgraded a couple of nodes in one of our clusters from CentOS 6 to CentOS 7, and after the upgrade the InfiniBand connection dropped from QDR rates to DDR rates. I'm trying to figure out how to troubleshoot or fix this. If anybody has seen this and knows what is going on, that would be great too.

The 2 hosts in question have QLogic IBA7322 HCAs, which use the ib_qib driver. There are other hosts connected to the same switch with the same HCA that haven't been upgraded from CentOS 6 to 7 yet, and they are connecting at full QDR speeds.

ibstatus shows:

Infiniband device 'qib0' port 1 status:
        default gid:     fe80:0000:0000:0000:0011:7500:0070:2f82
        base lid:        0xb4
        sm lid:          0x115
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            20 Gb/sec (4X DDR)
        link_layer:      InfiniBand

ibportstate shows an active link speed of 5.0 Gbps, but also shows that 10.0 Gbps is supported:

[root@compute128 ~]# ibportstate 180 1 query
CA/RT PortInfo:
# Port info: Lid 180 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................180
SMLid:...........................277
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 10.0 Gbps
LinkSpeedActive:.................5.0 Gbps
Mkey:............................
MkeyLeasePeriod:.................0
ProtectBits:.....................0

I have tried changing the speed with the ibportstate command, but it fails:

[root@compute128 ~]# ibportstate 180 1 speed 10
Initial CA/RT PortInfo:
# Port info: Lid 180 port 1
LinkState:.......................Active
PhysLinkState:...................LinkUp
Lid:.............................180
SMLid:...........................277
LMC:.............................0
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 10.0 Gbps
LinkSpeedActive:.................5.0 Gbps
Mkey:............................
MkeyLeasePeriod:.................0
ProtectBits:.....................0
ibportstate: iberror: failed: smp set portinfo failed

Any thoughts on how to troubleshoot or fix this would be appreciated.

Thanks,
Mike Robbert
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rsdance at soft-forge.com Wed Apr 14 12:01:54 2021
From: rsdance at soft-forge.com (Rupert Dance - SFI)
Date: Wed, 14 Apr 2021 15:01:54 -0400
Subject: [Users] Troubleshoot low LinkSpeed
In-Reply-To:
References:
Message-ID: <011401d73160$a2ce75e0$e86b61a0$@soft-forge.com>

Firmware updates on the HCAs are a great place to start; they solve 90% of these problems.

Thanks

Rupert

From: Users On Behalf Of Michael Robbert
Sent: Wednesday, April 14, 2021 2:42 PM
To: users at lists.openfabrics.org
Subject: [Users] Troubleshoot low LinkSpeed

> I just upgraded a couple of nodes in one of our clusters from CentOS 6 to
> CentOS 7 and after the upgrade the InfiniBand connection dropped from QDR
> rates to DDR rates. [...]
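As a sketch of how the downtraining shown in the ibportstate output above can be flagged mechanically, the following helper script is my own illustration (not a tool mentioned in the thread). It compares LinkSpeedActive against the highest value in LinkSpeedSupported; the embedded sample is copied from the output in the thread.

```shell
#!/bin/sh
# Sketch: detect a port whose active link speed is below its maximum
# supported speed. The sample below is the ibportstate output from the
# thread; on a live node you would pipe `ibportstate <lid> <port> query`
# in instead of using this canned text.

sample_output=$(cat <<'EOF'
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 10.0 Gbps
LinkSpeedActive:.................5.0 Gbps
EOF
)

# Highest speed the port claims to support (numeric sort of all values).
max_supported=$(printf '%s\n' "$sample_output" \
  | grep 'LinkSpeedSupported' \
  | grep -o '[0-9][0-9.]* Gbps' | sed 's/ Gbps//' \
  | sort -n | tail -1)

# Speed the link actually trained at.
active=$(printf '%s\n' "$sample_output" \
  | grep 'LinkSpeedActive' \
  | grep -o '[0-9][0-9.]* Gbps' | sed 's/ Gbps//')

# Numeric comparison via awk (forcing numeric context with +0).
if awk -v a="$active" -v m="$max_supported" 'BEGIN{exit !(a+0 < m+0)}'; then
  echo "WARN: port trained at ${active} Gbps but supports ${max_supported} Gbps"
else
  echo "OK: port running at its maximum supported speed (${active} Gbps)"
fi
```

For the sample data in the thread this prints the WARN line (5.0 Gbps active vs. 10.0 Gbps supported), which is exactly the DDR-instead-of-QDR condition being reported.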