[Users] Very slow to establish link

Orion Poplawski orion at nwra.com
Wed Aug 5 14:42:27 PDT 2020


Hello -

   I hope this is the right place to ask this.  I have an HP system with a 
Mellanox ConnectX-3 Pro dual-port card:

04:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
         Subsystem: Hewlett-Packard Company InfiniBand FDR/Ethernet 10Gb/40Gb 2-port 544+FLR-QSFP Adapter

It takes a very long time to establish an IPoIB link, about 27 seconds 
after ib0 is brought up:

[   10.511899] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[   10.512979] <mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0
[   10.512983] <mlx4_ib> mlx4_ib_add: counter index 2 for port 2 allocated 1
[   27.551020] ib0: enabling connected mode will cause multicast packet drops
[   27.551094] ib0: mtu > 4092 will cause multicast packet drops.
[   27.578065] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[   54.659419] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
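
For reference, the delay can be read straight off the kernel timestamps. 
The following is a minimal Python sketch, not part of the original 
report, that computes the gap between the "link is not ready" and "link 
becomes ready" messages.  The helper names (find_stamp, link_delay) are 
made up for illustration.

```python
import re

# Two lines copied from the dmesg output above.
DMESG = """\
[   27.578065] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[   54.659419] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
"""

# Matches the "[  seconds.micros]" prefix the kernel puts on each message.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def find_stamp(text, needle):
    """Return the timestamp (seconds since boot) of the first line containing needle."""
    for line in text.splitlines():
        m = STAMP.match(line)
        if m and needle in m.group(2):
            return float(m.group(1))
    raise ValueError(f"no line containing {needle!r}")


def link_delay(text, iface="ib0"):
    """Seconds between the interface being brought up and its link becoming ready."""
    up = find_stamp(text, f"{iface}: link is not ready")
    ready = find_stamp(text, f"{iface}: link becomes ready")
    return ready - up


if __name__ == "__main__":
    print(f"ib0 link delay: {link_delay(DMESG):.3f} s")
```

The same function can be pointed at a full dmesg capture, provided the 
wrapped lines have been rejoined first.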


This causes various problems when trying to have the system wait until 
the network is fully ready.  Is there any way to speed things up?  I've 
already set the link type to IB instead of VPI, which saved 3-5 seconds.
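
As an illustration of the "wait until the network is ready" part only 
(it does nothing to speed up the link itself), a boot-time helper could 
poll the interface state in sysfs.  This is a rough Python sketch under 
stated assumptions: that /sys/class/net/ib0/operstate reports "up" once 
the IPoIB link is usable, and that a 60-second timeout and 0.5-second 
poll interval are acceptable.  The function names are hypothetical.

```python
import time
from pathlib import Path


def read_operstate(iface):
    """Read the kernel's operational state for iface, e.g. 'up' or 'down'."""
    return Path(f"/sys/class/net/{iface}/operstate").read_text().strip()


def wait_for_link(iface="ib0", timeout=60.0, interval=0.5,
                  read_state=read_operstate, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll until iface reports 'up'; return True on success, False on timeout.

    timeout and interval are illustrative defaults, not recommended values.
    read_state, sleep and clock are injectable so the loop can be tested
    without real hardware.
    """
    deadline = clock() + timeout
    while True:
        try:
            if read_state(iface) == "up":
                return True
        except OSError:
            pass  # interface does not exist yet (driver may still be loading)
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    ok = wait_for_link("ib0")
    print("ib0 is up" if ok else "timed out waiting for ib0")
```

The state reader is passed in as a parameter so the same loop could 
check a different source, such as the port state under 
/sys/class/infiniband, without changing the polling logic.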

opensm.log from around the time of reboot:

Aug 05 15:01:58 730288 [8FB89700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136000000080
Aug 05 15:01:58 730516 [8FB89700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:01:58 730525 [8FB89700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:01:58 747877 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:01:58 758108 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:01:58 758119 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: MT25408 ConnectX Mellanox Technologies
Aug 05 15:01:58 758702 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:02 730114 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136100000080
Aug 05 15:02:02 730173 [93390700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:02 730185 [93390700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:02 746043 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:02 746058 [8E386700] 0x02 -> drop_mgr_remove_port: Removed port with GUID:0x5065f3ffff899db1 LID range [23, 23] of node:MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:02 746564 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:02 756371 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:06 729991 [91B8D700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136200000080
Aug 05 15:02:06 730093 [91B8D700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:06 730105 [91B8D700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:06 746537 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:06 756913 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:06 756922 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:06 757297 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:29 729126 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136300000080
Aug 05 15:02:29 729310 [93390700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:29 729319 [93390700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:29 744846 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:29 744865 [8E386700] 0x02 -> drop_mgr_remove_port: Removed port with GUID:0x5065f3ffff899db1 LID range [23, 23] of node:MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:29 745424 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:29 755461 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:03:51 726042 [93B91700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136400000080
Aug 05 15:03:51 726187 [93B91700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:03:51 726195 [93B91700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:03:51 744988 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:03:51 757540 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:03:51 757551 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: hp25v3comp1 mlx4_0
Aug 05 15:03:51 758119 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:03:51 806275 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:23 TID:0x0000000000000001
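
To make the pattern in this log easier to see, here is a small Python 
sketch that extracts the "GID in service" / "GID out of service" 
notices and prints how long the port stayed in each state.  It assumes 
the six-digit field after the time is microseconds and that the year is 
2020 (the log itself carries no year).  The names gid_events and 
intervals are hypothetical.

```python
import re
from datetime import datetime

# The GID notices from the opensm.log excerpt above, wrapped lines rejoined.
LOG = """\
Aug 05 15:01:58 758108 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:02 746043 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:06 756913 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:29 744846 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:03:51 757540 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
"""

# Date/time, microseconds (assumed), then the quoted notice and the GID.
EVENT = re.compile(
    r'^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) (\d{6}) .*"GID (in|out of) service", GID:(\S+)$'
)


def gid_events(text, year=2020):
    """Return a list of (timestamp, in_service, gid) for each GID notice."""
    events = []
    for line in text.splitlines():
        m = EVENT.match(line)
        if not m:
            continue  # not a GID notice
        stamp = datetime.strptime(
            f"{year} {m.group(1)} {m.group(2)}", "%Y %b %d %H:%M:%S %f"
        )
        events.append((stamp, m.group(3) == "in", m.group(4)))
    return events


def intervals(events):
    """Return (state, seconds) for each stretch between consecutive events."""
    out = []
    for (t0, up, _), (t1, _, _) in zip(events, events[1:]):
        state = "in service" if up else "out of service"
        out.append((state, (t1 - t0).total_seconds()))
    return out


if __name__ == "__main__":
    for state, seconds in intervals(gid_events(LOG)):
        print(f"{state:>14}: {seconds:7.3f} s")
```

On the excerpt above, the port goes in and out of service twice before 
the final "GID in service" at 15:03:51, and the last out-of-service 
stretch is about 82 seconds.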

TIA,
   Orion

-- 
Orion Poplawski
Manager of NWRA Technical Systems          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                 https://www.nwra.com/
