[ewg] Lustre + MPI traffic congestion free

Atul Yadav atulyadavtech at gmail.com
Mon Apr 14 09:52:28 PDT 2014


Dear All,

We are trying to run Lustre and MPI traffic over a common InfiniBand fabric.
I am sharing the full details of the cluster for this purpose.

Lustre

   - mds1
   - mds2
   - oss1
   - oss2

Compute nodes

   - Nalanda
   - compute-0-1 to compute-0-34

Topology

Ftree routing is configured, with your help, across 5 switches.

So Lustre and MPI traffic share the same InfiniBand links.

Can I make sure that my Lustre traffic runs without any congestion?

Please guide us.

Thanks in advance
Atul Yadav
-------------- next part --------------
-------------------------------------------------
OpenSM 3.3.5
 Reading Cached Option File: /etc/rdma/opensm.conf
 Loading Cached Option:guid = 0x0002c9030042e421
 Loading Cached Option:sweep_interval = 120
 Loading Cached Option:routing_engine = ftree
 Loading Cached Option:use_ucast_cache = TRUE
 Loading Cached Option:root_guid_file = /etc/rdma/guid
Command Line Arguments:
 Daemon mode
 Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 3.3.5

Apr 12 20:44:15 982804 [67F8C700] 0x80 -> OpenSM 3.3.5
Entering DISCOVERING state

Apr 12 20:44:15 984514 [67F8C700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Apr 12 20:44:15 984702 [67F8C700] 0x80 -> Entering DISCOVERING state
Entering MASTER state

Apr 12 20:44:15 984761 [67F8C700] 0x02 -> osm_vendor_bind: Binding to port 0x2c9030042e421
Apr 12 20:44:16 027506 [67F8C700] 0x02 -> osm_vendor_bind: Binding to port 0x2c9030042e421
Apr 12 20:44:16 027558 [67F8C700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9030042e421
Apr 12 20:44:16 069014 [5CB78700] 0x80 -> Entering MASTER state
SUBNET UP

Apr 12 20:44:16 075363 [5CB78700] 0x02 -> fabric_dump_general_info: General fabric topology info
Apr 12 20:44:16 075368 [5CB78700] 0x02 -> fabric_dump_general_info: ============================
Apr 12 20:44:16 075371 [5CB78700] 0x02 -> fabric_dump_general_info:   - FatTree rank (roots to leaf switches): 2
Apr 12 20:44:16 075372 [5CB78700] 0x02 -> fabric_dump_general_info:   - FatTree max switch rank: 1
Apr 12 20:44:16 075374 [5CB78700] 0x02 -> fabric_dump_general_info:   - Fabric has 39 CAs, 39 CA ports (39 of them CNs), 5 switches
Apr 12 20:44:16 075376 [5CB78700] 0x02 -> fabric_dump_general_info:   - Fabric has 2 switches at rank 0 (roots)
Apr 12 20:44:16 075378 [5CB78700] 0x02 -> fabric_dump_general_info:   - Fabric has 3 switches at rank 1 (3 of them leafs)
Apr 12 20:44:16 075511 [5CB78700] 0x02 -> osm_ucast_mgr_process: ftree tables configured on all switches
Apr 12 20:44:16 098151 [5CB78700] 0x80 -> SUBNET UP
Apr 12 20:44:16 277047 [6077E700] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:1 TID:0x000000000000003c
Apr 12 20:44:16 277090 [6077E700] 0x02 -> trap_rcv_process_request: Trap 144 Node description update
Apr 12 20:44:16 277105 [6077E700] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:1 GID:fe80::2:c903:42:e421
Apr 12 20:44:16 298708 [5CB78700] 0x02 -> osm_ucast_cache_process: Configuring switch tables using cached routing
Apr 12 20:44:16 299811 [5CB78700] 0x02 -> SUBNET UP
Apr 12 20:44:18 072914 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::ffff:ffff
Apr 12 20:44:18 073968 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:6d01
Apr 12 20:44:18 074165 [5F37C700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e421
Apr 12 20:44:18 074791 [64384700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::1
Apr 12 20:44:18 074855 [67589700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc21
Apr 12 20:44:18 074905 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc31
Apr 12 20:44:18 074940 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:498f
Apr 12 20:44:18 075018 [5DF7A700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:4983
Apr 12 20:44:18 075062 [62F82700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::fb
Apr 12 20:44:18 075126 [5DF7A700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff41:f801
Apr 12 20:44:18 075561 [5FD7D700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:4993
Apr 12 20:44:18 075653 [62581700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1
Apr 12 20:44:18 076163 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:3a11
Apr 12 20:44:18 076192 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:499b
Apr 12 20:44:18 076299 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e301
Apr 12 20:44:18 076331 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::202
Apr 12 20:44:18 076354 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff41:f841
Apr 12 20:44:18 076419 [65786700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e321
Apr 12 20:44:18 076851 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e331
Apr 12 20:44:18 076932 [62F82700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc61
Apr 12 20:44:18 077120 [62581700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e441
Apr 12 20:44:18 077399 [66B88700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e511
Apr 12 20:44:18 077635 [64D85700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:6
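
The startup banner in the log above lists the options OpenSM picked up from /etc/rdma/opensm.conf. As a minimal sketch (not part of the original post), the Python below pulls those option lines out of a log excerpt so the effective settings can be reviewed at a glance. The names LOG, OPTION_RE and cached_options are illustrative assumptions; the sample lines are copied from the log above.

```python
import re

# Sample lines copied from the OpenSM startup banner in this post.
LOG = """\
 Reading Cached Option File: /etc/rdma/opensm.conf
 Loading Cached Option:guid = 0x0002c9030042e421
 Loading Cached Option:sweep_interval = 120
 Loading Cached Option:routing_engine = ftree
 Loading Cached Option:use_ucast_cache = TRUE
 Loading Cached Option:root_guid_file = /etc/rdma/guid
"""

# Matches "Loading Cached Option:<name> = <value>" anywhere in a line.
OPTION_RE = re.compile(r"Loading Cached Option:\s*(\S+)\s*=\s*(.+)")


def cached_options(text):
    """Return {option: value} for every 'Loading Cached Option' line."""
    opts = {}
    for line in text.splitlines():
        m = OPTION_RE.search(line)
        if m:
            opts[m.group(1)] = m.group(2).strip()
    return opts


options = cached_options(LOG)
for name, value in options.items():
    print(f"{name} = {value}")
```

To run it against the real log, the text of /var/log/opensm.log could be passed to cached_options in place of LOG.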


-------------- next part --------------
A non-text attachment was scrubbed...
Name: Cluster IB.jpg
Type: image/jpeg
Size: 89451 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/ewg/attachments/20140414/f7b9febe/attachment.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: opensm.conf
Type: application/octet-stream
Size: 8436 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/ewg/attachments/20140414/f7b9febe/attachment.obj>
-------------- next part --------------
[root@nalanda mvapich2-1.9]# ibnetdiscover
#
# Topology file: generated on Sat Apr 12 16:22:11 2014
#
# Initiated from node 0002c9030042e420 port 0002c9030042e421

vendid=0x2c9
devid=0xc738
sysimgguid=0x2c903008fd500
switchguid=0x2c903008fd500(2c903008fd500)
Switch  36 "S-0002c903008fd500"         # "SwitchX -  Mellanox Technologies" base port 0 lid 4 lmc 0
[1]     "S-0002c90300901480"[10]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[2]     "S-0002c90300901480"[11]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[3]     "S-0002c90300901480"[12]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[4]     "S-0002c90300901480"[13]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[5]     "S-0002c90300901480"[14]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[6]     "S-0002c90300901480"[15]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[7]     "S-0002c90300901480"[16]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[8]     "S-0002c90300901480"[17]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[9]     "S-0002c90300901480"[18]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[10]    "S-0002c90300902c00"[10]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[11]    "S-0002c90300902c00"[11]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[12]    "S-0002c90300902c00"[12]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[13]    "S-0002c90300902c00"[13]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[14]    "S-0002c90300902c00"[14]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[15]    "S-0002c90300902c00"[15]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[16]    "S-0002c90300902c00"[16]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[17]    "S-0002c90300902c00"[17]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[18]    "S-0002c90300902c00"[18]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[19]    "H-0002c9030042e440"[1](2c9030042e441)          # "compute-0-11 HCA-1" lid 19 4xQDR
[20]    "H-0002c9030042e300"[1](2c9030042e301)          # "compute-0-10 HCA-1" lid 6 4xQDR
[21]    "H-0002c9030042e390"[1](2c9030042e391)          # "compute-0-7 HCA-1" lid 32 4xQDR
[22]    "H-0002c9030042e450"[1](2c9030042e451)          # "compute-0-6 HCA-1" lid 22 4xQDR
[23]    "H-0002c9030042e380"[1](2c9030042e381)          # "compute-0-13 HCA-1" lid 28 4xQDR
[24]    "H-0002c9030042e360"[1](2c9030042e361)          # "compute-0-14 HCA-1" lid 24 4xQDR
[25]    "H-0002c9030042e320"[1](2c9030042e321)          # "compute-0-9 HCA-1" lid 14 4xQDR
[26]    "H-0002c9030042e3e0"[1](2c9030042e3e1)          # "compute-0-8 HCA-1" lid 38 4xQDR
[27]    "H-0002c9030042e330"[1](2c9030042e331)          # "compute-0-2 HCA-1" lid 17 4xQDR
[28]    "H-24be05ffff856df0"[1](24be05ffff856df1)               # "compute-0-29 HCA-1" lid 39 4xQDR
[29]    "H-0002c90300423a10"[1](2c90300423a11)          # "compute-0-32 HCA-1" lid 9 4xQDR
[30]    "H-24be05ffff85bc90"[1](24be05ffff85bc91)               # "compute-0-23 HCA-1" lid 31 4xQDR
[31]    "H-24be05ffff85bce0"[1](24be05ffff85bce1)               # "compute-0-24 HCA-1" lid 36 4xQDR

vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300903600
switchguid=0x2c90300903600(2c90300903600)
Switch  36 "S-0002c90300903600"         # "SwitchX -  Mellanox Technologies" base port 0 lid 3 lmc 0
[1]     "S-0002c90300902c00"[1]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[2]     "S-0002c90300902c00"[2]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[3]     "S-0002c90300902c00"[3]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[4]     "S-0002c90300902c00"[4]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[5]     "S-0002c90300902c00"[5]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[6]     "S-0002c90300902c00"[6]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[7]     "S-0002c90300902c00"[7]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[8]     "S-0002c90300902c00"[8]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[9]     "S-0002c90300902c00"[9]         # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[10]    "S-0002c90300901480"[19]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[11]    "S-0002c90300901480"[20]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[12]    "S-0002c90300901480"[21]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[13]    "S-0002c90300901480"[22]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[14]    "S-0002c90300901480"[23]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[15]    "S-0002c90300901480"[24]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[16]    "S-0002c90300901480"[25]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[17]    "S-0002c90300901480"[26]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[18]    "S-0002c90300901480"[27]                # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[19]    "H-0002c90300524982"[1](2c90300524983)          # "oss2 mlx4_0" lid 29 4xQDR
[20]    "H-0002c90300524992"[1](2c90300524993)          # "mds2 mlx4_0" lid 33 4xQDR
[21]    "H-0002c9030052499a"[1](2c9030052499b)          # "mds1 mlx4_0" lid 34 4xQDR
[22]    "H-0002c9030052498e"[1](2c9030052498f)          # "oss1 mlx4_0" lid 30 4xQDR
[23]    "H-24be05ffff85cc20"[1](24be05ffff85cc21)               # "compute-0-28 HCA-1" lid 13 4xQDR
[24]    "H-24be05ffff85bcf0"[1](24be05ffff85bcf1)               # "compute-0-26 HCA-1" lid 40 4xQDR
[25]    "H-24be05ffff85cc30"[1](24be05ffff85cc31)               # "compute-0-30 HCA-1" lid 16 4xQDR
[26]    "H-0002c9030041f800"[1](2c9030041f801)          # "compute-0-31 HCA-1" lid 8 4xQDR
[27]    "H-24be05ffff856d00"[1](24be05ffff856d01)               # "compute-0-27 HCA-1" lid 5 4xQDR
[28]    "H-24be05ffff85cc60"[1](24be05ffff85cc61)               # "compute-0-25 HCA-1" lid 23 4xQDR
[29]    "H-0002c9030041f7f0"[1](2c9030041f7f1)          # "compute-0-34 HCA-1" lid 44 4xQDR
[30]    "H-0002c9030041f840"[1](2c9030041f841)          # "compute-0-33 HCA-1" lid 20 4xQDR

vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300902c00
switchguid=0x2c90300902c00(2c90300902c00)
Switch  36 "S-0002c90300902c00"         # "SwitchX -  Mellanox Technologies" base port 0 lid 2 lmc 0
[1]     "S-0002c90300903600"[1]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[2]     "S-0002c90300903600"[2]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[3]     "S-0002c90300903600"[3]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[4]     "S-0002c90300903600"[4]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[5]     "S-0002c90300903600"[5]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[6]     "S-0002c90300903600"[6]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[7]     "S-0002c90300903600"[7]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[8]     "S-0002c90300903600"[8]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[9]     "S-0002c90300903600"[9]         # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[10]    "S-0002c903008fd500"[10]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[11]    "S-0002c903008fd500"[11]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[12]    "S-0002c903008fd500"[12]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[13]    "S-0002c903008fd500"[13]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[14]    "S-0002c903008fd500"[14]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[15]    "S-0002c903008fd500"[15]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[16]    "S-0002c903008fd500"[16]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[17]    "S-0002c903008fd500"[17]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[18]    "S-0002c903008fd500"[18]                # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[19]    "S-0002c9030075c870"[10]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[20]    "S-0002c9030075c870"[11]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[21]    "S-0002c9030075c870"[12]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[23]    "S-0002c9030075c870"[14]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[24]    "S-0002c9030075c870"[15]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[25]    "S-0002c9030075c870"[16]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[26]    "S-0002c9030075c870"[17]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[27]    "S-0002c9030075c870"[18]                # "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300901480
switchguid=0x2c90300901480(2c90300901480)
Switch  36 "S-0002c90300901480"         # "SwitchX -  Mellanox Technologies" base port 0 lid 27 lmc 0
[1]     "S-0002c9030075c870"[1]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[2]     "S-0002c9030075c870"[2]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[3]     "S-0002c9030075c870"[3]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[4]     "S-0002c9030075c870"[4]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[5]     "S-0002c9030075c870"[5]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[6]     "S-0002c9030075c870"[6]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[7]     "S-0002c9030075c870"[7]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[8]     "S-0002c9030075c870"[8]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[9]     "S-0002c9030075c870"[9]         # "SwitchX -  Mellanox Technologies" lid 25 4xQDR
[10]    "S-0002c903008fd500"[1]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[11]    "S-0002c903008fd500"[2]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[12]    "S-0002c903008fd500"[3]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[13]    "S-0002c903008fd500"[4]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[14]    "S-0002c903008fd500"[5]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[15]    "S-0002c903008fd500"[6]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[16]    "S-0002c903008fd500"[7]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[17]    "S-0002c903008fd500"[8]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[18]    "S-0002c903008fd500"[9]         # "SwitchX -  Mellanox Technologies" lid 4 4xQDR
[19]    "S-0002c90300903600"[10]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[20]    "S-0002c90300903600"[11]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[21]    "S-0002c90300903600"[12]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[22]    "S-0002c90300903600"[13]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[23]    "S-0002c90300903600"[14]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[24]    "S-0002c90300903600"[15]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[25]    "S-0002c90300903600"[16]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[26]    "S-0002c90300903600"[17]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR
[27]    "S-0002c90300903600"[18]                # "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0xc738
sysimgguid=0x2c9030075c870
switchguid=0x2c9030075c870(2c9030075c870)
Switch  36 "S-0002c9030075c870"         # "SwitchX -  Mellanox Technologies" base port 0 lid 25 lmc 0
[1]     "S-0002c90300901480"[1]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[2]     "S-0002c90300901480"[2]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[3]     "S-0002c90300901480"[3]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[4]     "S-0002c90300901480"[4]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[5]     "S-0002c90300901480"[5]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[6]     "S-0002c90300901480"[6]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[7]     "S-0002c90300901480"[7]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[8]     "S-0002c90300901480"[8]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[9]     "S-0002c90300901480"[9]         # "SwitchX -  Mellanox Technologies" lid 27 4xQDR
[10]    "S-0002c90300902c00"[19]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[11]    "S-0002c90300902c00"[20]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[12]    "S-0002c90300902c00"[21]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[14]    "S-0002c90300902c00"[23]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[15]    "S-0002c90300902c00"[24]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[16]    "S-0002c90300902c00"[25]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[17]    "S-0002c90300902c00"[26]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[18]    "S-0002c90300902c00"[27]                # "SwitchX -  Mellanox Technologies" lid 2 4xQDR
[19]    "H-0002c9030042e250"[1](2c9030042e251)          # "compute-0-20 HCA-1" lid 21 4xQDR
[20]    "H-0002c9030042e310"[1](2c9030042e311)          # "compute-0-17 HCA-1" lid 10 4xQDR
[21]    "H-0002c9030042e4f0"[1](2c9030042e4f1)          # "compute-0-3 HCA-1" lid 43 4xQDR
[22]    "H-0002c9030042e2e0"[1](2c9030042e2e1)          # "compute-0-19 HCA-1" lid 37 4xQDR
[23]    "H-0002c9030042e520"[1](2c9030042e521)          # "compute-0-18 HCA-1" lid 15 4xQDR
[24]    "H-0002c9030042e3f0"[1](2c9030042e3f1)          # "compute-0-22 HCA-1" lid 42 4xQDR
[25]    "H-0002c9030042e400"[1](2c9030042e401)          # "compute-0-1 HCA-1" lid 7 4xQDR
[26]    "H-0002c9030042e430"[1](2c9030042e431)          # "compute-0-15 HCA-1" lid 18 4xQDR
[27]    "H-0002c9030042e420"[1](2c9030042e421)          # "nalanda HCA-1" lid 1 4xQDR
[28]    "H-0002c9030042e510"[1](2c9030042e511)          # "compute-0-21 HCA-1" lid 12 4xQDR
[29]    "H-0002c9030042e470"[1](2c9030042e471)          # "compute-0-16 HCA-1" lid 26 4xQDR
[30]    "H-0002c9030042e2f0"[1](2c9030042e2f1)          # "compute-0-12 HCA-1" lid 41 4xQDR
[31]    "H-0002c9030042e410"[1](2c9030042e411)          # "compute-0-5 HCA-1" lid 11 4xQDR
[32]    "H-0002c9030042e2d0"[1](2c9030042e2d1)          # "compute-0-4 HCA-1" lid 35 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f843
caguid=0x2c9030041f840
Ca      2 "H-0002c9030041f840"          # "compute-0-33 HCA-1"
[1](2c9030041f841)      "S-0002c90300903600"[30]                # lid 20 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f7f3
caguid=0x2c9030041f7f0
Ca      2 "H-0002c9030041f7f0"          # "compute-0-34 HCA-1"
[1](2c9030041f7f1)      "S-0002c90300903600"[29]                # lid 44 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc63
caguid=0x24be05ffff85cc60
Ca      2 "H-24be05ffff85cc60"          # "compute-0-25 HCA-1"
[1](24be05ffff85cc61)   "S-0002c90300903600"[28]                # lid 23 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff856d03
caguid=0x24be05ffff856d00
Ca      2 "H-24be05ffff856d00"          # "compute-0-27 HCA-1"
[1](24be05ffff856d01)   "S-0002c90300903600"[27]                # lid 5 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f803
caguid=0x2c9030041f800
Ca      2 "H-0002c9030041f800"          # "compute-0-31 HCA-1"
[1](2c9030041f801)      "S-0002c90300903600"[26]                # lid 8 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc33
caguid=0x24be05ffff85cc30
Ca      2 "H-24be05ffff85cc30"          # "compute-0-30 HCA-1"
[1](24be05ffff85cc31)   "S-0002c90300903600"[25]                # lid 16 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bcf3
caguid=0x24be05ffff85bcf0
Ca      2 "H-24be05ffff85bcf0"          # "compute-0-26 HCA-1"
[1](24be05ffff85bcf1)   "S-0002c90300903600"[24]                # lid 40 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc23
caguid=0x24be05ffff85cc20
Ca      2 "H-24be05ffff85cc20"          # "compute-0-28 HCA-1"
[1](24be05ffff85cc21)   "S-0002c90300903600"[23]                # lid 13 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524991
caguid=0x2c9030052498e
Ca      2 "H-0002c9030052498e"          # "oss1 mlx4_0"
[1](2c9030052498f)      "S-0002c90300903600"[22]                # lid 30 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x673c
sysimgguid=0x2c9030052499d
caguid=0x2c9030052499a
Ca      2 "H-0002c9030052499a"          # "mds1 mlx4_0"
[1](2c9030052499b)      "S-0002c90300903600"[21]                # lid 34 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524995
caguid=0x2c90300524992
Ca      2 "H-0002c90300524992"          # "mds2 mlx4_0"
[1](2c90300524993)      "S-0002c90300903600"[20]                # lid 33 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524985
caguid=0x2c90300524982
Ca      2 "H-0002c90300524982"          # "oss2 mlx4_0"
[1](2c90300524983)      "S-0002c90300903600"[19]                # lid 29 lmc 0 "SwitchX -  Mellanox Technologies" lid 3 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bce3
caguid=0x24be05ffff85bce0
Ca      2 "H-24be05ffff85bce0"          # "compute-0-24 HCA-1"
[1](24be05ffff85bce1)   "S-0002c903008fd500"[31]                # lid 36 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bc93
caguid=0x24be05ffff85bc90
Ca      2 "H-24be05ffff85bc90"          # "compute-0-23 HCA-1"
[1](24be05ffff85bc91)   "S-0002c903008fd500"[30]                # lid 31 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c90300423a13
caguid=0x2c90300423a10
Ca      2 "H-0002c90300423a10"          # "compute-0-32 HCA-1"
[1](2c90300423a11)      "S-0002c903008fd500"[29]                # lid 9 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff856df3
caguid=0x24be05ffff856df0
Ca      2 "H-24be05ffff856df0"          # "compute-0-29 HCA-1"
[1](24be05ffff856df1)   "S-0002c903008fd500"[28]                # lid 39 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e333
caguid=0x2c9030042e330
Ca      2 "H-0002c9030042e330"          # "compute-0-2 HCA-1"
[1](2c9030042e331)      "S-0002c903008fd500"[27]                # lid 17 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e3e3
caguid=0x2c9030042e3e0
Ca      2 "H-0002c9030042e3e0"          # "compute-0-8 HCA-1"
[1](2c9030042e3e1)      "S-0002c903008fd500"[26]                # lid 38 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e323
caguid=0x2c9030042e320
Ca      2 "H-0002c9030042e320"          # "compute-0-9 HCA-1"
[1](2c9030042e321)      "S-0002c903008fd500"[25]                # lid 14 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e363
caguid=0x2c9030042e360
Ca      2 "H-0002c9030042e360"          # "compute-0-14 HCA-1"
[1](2c9030042e361)      "S-0002c903008fd500"[24]                # lid 24 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e383
caguid=0x2c9030042e380
Ca      2 "H-0002c9030042e380"          # "compute-0-13 HCA-1"
[1](2c9030042e381)      "S-0002c903008fd500"[23]                # lid 28 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e453
caguid=0x2c9030042e450
Ca      2 "H-0002c9030042e450"          # "compute-0-6 HCA-1"
[1](2c9030042e451)      "S-0002c903008fd500"[22]                # lid 22 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e393
caguid=0x2c9030042e390
Ca      2 "H-0002c9030042e390"          # "compute-0-7 HCA-1"
[1](2c9030042e391)      "S-0002c903008fd500"[21]                # lid 32 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e303
caguid=0x2c9030042e300
Ca      2 "H-0002c9030042e300"          # "compute-0-10 HCA-1"
[1](2c9030042e301)      "S-0002c903008fd500"[20]                # lid 6 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e443
caguid=0x2c9030042e440
Ca      2 "H-0002c9030042e440"          # "compute-0-11 HCA-1"
[1](2c9030042e441)      "S-0002c903008fd500"[19]                # lid 19 lmc 0 "SwitchX -  Mellanox Technologies" lid 4 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2d3
caguid=0x2c9030042e2d0
Ca      2 "H-0002c9030042e2d0"          # "compute-0-4 HCA-1"
[1](2c9030042e2d1)      "S-0002c9030075c870"[32]                # lid 35 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e413
caguid=0x2c9030042e410
Ca      2 "H-0002c9030042e410"          # "compute-0-5 HCA-1"
[1](2c9030042e411)      "S-0002c9030075c870"[31]                # lid 11 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2f3
caguid=0x2c9030042e2f0
Ca      2 "H-0002c9030042e2f0"          # "compute-0-12 HCA-1"
[1](2c9030042e2f1)      "S-0002c9030075c870"[30]                # lid 41 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e473
caguid=0x2c9030042e470
Ca      2 "H-0002c9030042e470"          # "compute-0-16 HCA-1"
[1](2c9030042e471)      "S-0002c9030075c870"[29]                # lid 26 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e513
caguid=0x2c9030042e510
Ca      2 "H-0002c9030042e510"          # "compute-0-21 HCA-1"
[1](2c9030042e511)      "S-0002c9030075c870"[28]                # lid 12 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e433
caguid=0x2c9030042e430
Ca      2 "H-0002c9030042e430"          # "compute-0-15 HCA-1"
[1](2c9030042e431)      "S-0002c9030075c870"[26]                # lid 18 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e403
caguid=0x2c9030042e400
Ca      2 "H-0002c9030042e400"          # "compute-0-1 HCA-1"
[1](2c9030042e401)      "S-0002c9030075c870"[25]                # lid 7 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e3f3
caguid=0x2c9030042e3f0
Ca      2 "H-0002c9030042e3f0"          # "compute-0-22 HCA-1"
[1](2c9030042e3f1)      "S-0002c9030075c870"[24]                # lid 42 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e523
caguid=0x2c9030042e520
Ca      2 "H-0002c9030042e520"          # "compute-0-18 HCA-1"
[1](2c9030042e521)      "S-0002c9030075c870"[23]                # lid 15 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2e3
caguid=0x2c9030042e2e0
Ca      2 "H-0002c9030042e2e0"          # "compute-0-19 HCA-1"
[1](2c9030042e2e1)      "S-0002c9030075c870"[22]                # lid 37 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e4f3
caguid=0x2c9030042e4f0
Ca      2 "H-0002c9030042e4f0"          # "compute-0-3 HCA-1"
[1](2c9030042e4f1)      "S-0002c9030075c870"[21]                # lid 43 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e313
caguid=0x2c9030042e310
Ca      2 "H-0002c9030042e310"          # "compute-0-17 HCA-1"
[1](2c9030042e311)      "S-0002c9030075c870"[20]                # lid 10 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e253
caguid=0x2c9030042e250
Ca      2 "H-0002c9030042e250"          # "compute-0-20 HCA-1"
[1](2c9030042e251)      "S-0002c9030075c870"[19]                # lid 21 lmc 0 "SwitchX -  Mellanox Technologies" lid 25 4xQDR

vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e423
caguid=0x2c9030042e420
Ca      2 "H-0002c9030042e420"          # "nalanda HCA-1"
[1](2c9030042e421)      "S-0002c9030075c870"[27]   

[root@nalanda mvapich2-1.9]# ibhosts
Ca      : 0x0002c9030041f840 ports 2 "compute-0-33 HCA-1"
Ca      : 0x0002c9030041f7f0 ports 2 "compute-0-34 HCA-1"
Ca      : 0x24be05ffff85cc60 ports 2 "compute-0-25 HCA-1"
Ca      : 0x24be05ffff856d00 ports 2 "compute-0-27 HCA-1"
Ca      : 0x0002c9030041f800 ports 2 "compute-0-31 HCA-1"
Ca      : 0x24be05ffff85cc30 ports 2 "compute-0-30 HCA-1"
Ca      : 0x24be05ffff85bcf0 ports 2 "compute-0-26 HCA-1"
Ca      : 0x24be05ffff85cc20 ports 2 "compute-0-28 HCA-1"
Ca      : 0x0002c9030052498e ports 2 "oss1 mlx4_0"
Ca      : 0x0002c9030052499a ports 2 "mds1 mlx4_0"
Ca      : 0x0002c90300524992 ports 2 "mds2 mlx4_0"
Ca      : 0x0002c90300524982 ports 2 "oss2 mlx4_0"
Ca      : 0x24be05ffff85bce0 ports 2 "compute-0-24 HCA-1"
Ca      : 0x24be05ffff85bc90 ports 2 "compute-0-23 HCA-1"
Ca      : 0x0002c90300423a10 ports 2 "compute-0-32 HCA-1"
Ca      : 0x24be05ffff856df0 ports 2 "compute-0-29 HCA-1"
Ca      : 0x0002c9030042e330 ports 2 "compute-0-2 HCA-1"
Ca      : 0x0002c9030042e3e0 ports 2 "compute-0-8 HCA-1"
Ca      : 0x0002c9030042e320 ports 2 "compute-0-9 HCA-1"
Ca      : 0x0002c9030042e360 ports 2 "compute-0-14 HCA-1"
Ca      : 0x0002c9030042e380 ports 2 "compute-0-13 HCA-1"
Ca      : 0x0002c9030042e450 ports 2 "compute-0-6 HCA-1"
Ca      : 0x0002c9030042e390 ports 2 "compute-0-7 HCA-1"
Ca      : 0x0002c9030042e300 ports 2 "compute-0-10 HCA-1"
Ca      : 0x0002c9030042e440 ports 2 "compute-0-11 HCA-1"
Ca      : 0x0002c9030042e2d0 ports 2 "compute-0-4 HCA-1"
Ca      : 0x0002c9030042e410 ports 2 "compute-0-5 HCA-1"
Ca      : 0x0002c9030042e2f0 ports 2 "compute-0-12 HCA-1"
Ca      : 0x0002c9030042e470 ports 2 "compute-0-16 HCA-1"
Ca      : 0x0002c9030042e510 ports 2 "compute-0-21 HCA-1"
Ca      : 0x0002c9030042e430 ports 2 "compute-0-15 HCA-1"
Ca      : 0x0002c9030042e400 ports 2 "compute-0-1 HCA-1"
Ca      : 0x0002c9030042e3f0 ports 2 "compute-0-22 HCA-1"
Ca      : 0x0002c9030042e520 ports 2 "compute-0-18 HCA-1"
Ca      : 0x0002c9030042e2e0 ports 2 "compute-0-19 HCA-1"
Ca      : 0x0002c9030042e4f0 ports 2 "compute-0-3 HCA-1"
Ca      : 0x0002c9030042e310 ports 2 "compute-0-17 HCA-1"
Ca      : 0x0002c9030042e250 ports 2 "compute-0-20 HCA-1"
Ca      : 0x0002c9030042e420 ports 2 "nalanda HCA-1"
[root@nalanda mvapich2-1.9]#
[root@nalanda mvapich2-1.9]# ibswitches
Switch  : 0x0002c903008fd500 ports 36 "SwitchX -  Mellanox Technologies" base port 0 lid 4 lmc 0
Switch  : 0x0002c90300903600 ports 36 "SwitchX -  Mellanox Technologies" base port 0 lid 3 lmc 0
Switch  : 0x0002c90300902c00 ports 36 "SwitchX -  Mellanox Technologies" base port 0 lid 2 lmc 0
Switch  : 0x0002c90300901480 ports 36 "SwitchX -  Mellanox Technologies" base port 0 lid 27 lmc 0
Switch  : 0x0002c9030075c870 ports 36 "SwitchX -  Mellanox Technologies" base port 0 lid 25 lmc 0
[root@nalanda mvapich2-1.9]#