From orion at nwra.com Wed Aug 5 14:42:27 2020
From: orion at nwra.com (Orion Poplawski)
Date: Wed, 5 Aug 2020 15:42:27 -0600
Subject: [Users] Very slow to establish link
Message-ID:

Hello -

I hope this is the right place to ask this. I have an HP with a Mellanox Connect-X 3 Pro dual port card:

04:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
        Subsystem: Hewlett-Packard Company InfiniBand FDR/Ethernet 10Gb/40Gb 2-port 544+FLR-QSFP Adapter

It takes a very long time to establish an IPoIB link:

[ 10.511899] mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[ 10.512979] mlx4_ib_add: counter index 0 for port 1 allocated 0
[ 10.512983] mlx4_ib_add: counter index 2 for port 2 allocated 1
[ 27.551020] ib0: enabling connected mode will cause multicast packet drops
[ 27.551094] ib0: mtu > 4092 will cause multicast packet drops.
[ 27.578065] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 54.659419] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready

This causes various issues with trying to have the system wait until the network is fully ready. Is there any way to speed things up? I've already set the link type to IB instead of VPI (saved 3-5 seconds).

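To put numbers on the delay, here is a minimal Python sketch that pulls the timestamps out of the kernel messages quoted above and prints the gaps between them. The helper names (`parse_events`, `first_time`) are illustrative only, and the match strings assume the exact log wording shown above; other kernel versions may word these messages differently.

```python
import re

# Kernel log lines copied from the excerpt above (subset relevant to timing).
DMESG = """\
[ 10.511899] mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[ 27.551020] ib0: enabling connected mode will cause multicast packet drops
[ 27.578065] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 54.659419] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
"""

# "[ <seconds since boot>] <message>"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_events(text):
    """Return a list of (timestamp_seconds, message) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def first_time(events, needle):
    """Timestamp of the first message containing `needle`."""
    for timestamp, message in events:
        if needle in message:
            return timestamp
    raise ValueError(f"no log line contains {needle!r}")


events = parse_events(DMESG)
driver_loaded = first_time(events, "Mellanox ConnectX InfiniBand driver")
iface_up = first_time(events, "link is not ready")
link_ready = first_time(events, "link becomes ready")

print(f"driver load  -> link ready: {link_ready - driver_loaded:.1f} s")
print(f"interface up -> link ready: {link_ready - iface_up:.1f} s")
```

With the excerpt above, this reports roughly 44 seconds from driver load to a usable link, about 27 of which pass after the interface is brought up.
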
opensm.log from around the time of reboot:

Aug 05 15:01:58 730288 [8FB89700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136000000080
Aug 05 15:01:58 730516 [8FB89700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:01:58 730525 [8FB89700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:01:58 747877 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:01:58 758108 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:01:58 758119 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: MT25408 ConnectX Mellanox Technologies
Aug 05 15:01:58 758702 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:02 730114 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136100000080
Aug 05 15:02:02 730173 [93390700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:02 730185 [93390700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:02 746043 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:02 746058 [8E386700] 0x02 -> drop_mgr_remove_port: Removed port with GUID:0x5065f3ffff899db1 LID range [23, 23] of node:MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:02 746564 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:02 756371 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:06 729991 [91B8D700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136200000080
Aug 05 15:02:06 730093 [91B8D700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:06 730105 [91B8D700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:06 746537 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:06 756913 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:06 756922 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:06 757297 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:02:29 729126 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136300000080
Aug 05 15:02:29 729310 [93390700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:02:29 729319 [93390700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:02:29 744846 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID out of service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:02:29 744865 [8E386700] 0x02 -> drop_mgr_remove_port: Removed port with GUID:0x5065f3ffff899db1 LID range [23, 23] of node:MT25408 ConnectX Mellanox Technologies
Aug 05 15:02:29 745424 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:02:29 755461 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:03:51 726042 [93B91700] 0x01 -> log_trap_info: Received Generic Notice type:1 num:128 (Link state change) Producer:2 (Switch) from LID:27 TID:0x0000136400000080
Aug 05 15:03:51 726187 [93B91700] 0x02 -> SM class trap 128: Directed Path Dump of 1 hop path: Path = 0,1
Aug 05 15:03:51 726195 [93B91700] 0x02 -> log_notice: Reporting Urgent Notice "Link state change" from switch LID 27, GUID 0xe41d2d0300064720
Aug 05 15:03:51 744988 [8E386700] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Aug 05 15:03:51 757540 [8E386700] 0x02 -> log_notice: Reporting Informational Notice "GID in service", GID:fe80::5065:f3ff:ff89:9db1
Aug 05 15:03:51 757551 [8E386700] 0x02 -> state_mgr_report_new_ports: Discovered new port with GUID:0x5065f3ffff899db1 LID range [23,23] of node: hp25v3comp1 mlx4_0
Aug 05 15:03:51 758119 [8E386700] 0x02 -> SUBNET UP
Aug 05 15:03:51 806275 [93390700] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:23 TID:0x0000000000000001

TIA,
Orion

--
Orion Poplawski
Manager of NWRA Technical Systems          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       orion at nwra.com
Boulder, CO 80301                 https://www.nwra.com/