[ewg] Lustre + MPI traffic congestion free
Atul Yadav
atulyadavtech at gmail.com
Mon Apr 14 09:52:28 PDT 2014
Dear All,
We are trying to run Lustre and MPI traffic over a common InfiniBand fabric.
I am sharing the full details of the cluster for that purpose.
Lustre
- mds1
- mds2
- oss1
- oss2
Compute Node
- Nalanda
- compute-0-1 to compute-0-34
Topology
Fat-tree (ftree) routing was configured with your help. The fabric has 5 switches.
So we are using the same InfiniBand cabling for both Lustre and MPI traffic.
How can I make sure that my Lustre traffic runs without congestion?
Please guide us.
Thanks in advance
Atul Yadav
-------------- next part --------------
-------------------------------------------------
OpenSM 3.3.5
Reading Cached Option File: /etc/rdma/opensm.conf
Loading Cached Option:guid = 0x0002c9030042e421
Loading Cached Option:sweep_interval = 120
Loading Cached Option:routing_engine = ftree
Loading Cached Option:use_ucast_cache = TRUE
Loading Cached Option:root_guid_file = /etc/rdma/guid
Command Line Arguments:
Daemon mode
Log File: /var/log/opensm.log
-------------------------------------------------
OpenSM 3.3.5
Apr 12 20:44:15 982804 [67F8C700] 0x80 -> OpenSM 3.3.5
Entering DISCOVERING state
Apr 12 20:44:15 984514 [67F8C700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Apr 12 20:44:15 984702 [67F8C700] 0x80 -> Entering DISCOVERING state
Entering MASTER state
Apr 12 20:44:15 984761 [67F8C700] 0x02 -> osm_vendor_bind: Binding to port 0x2c9030042e421
Apr 12 20:44:16 027506 [67F8C700] 0x02 -> osm_vendor_bind: Binding to port 0x2c9030042e421
Apr 12 20:44:16 027558 [67F8C700] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9030042e421
Apr 12 20:44:16 069014 [5CB78700] 0x80 -> Entering MASTER state
SUBNET UP
Apr 12 20:44:16 075363 [5CB78700] 0x02 -> fabric_dump_general_info: General fabric topology info
Apr 12 20:44:16 075368 [5CB78700] 0x02 -> fabric_dump_general_info: ============================
Apr 12 20:44:16 075371 [5CB78700] 0x02 -> fabric_dump_general_info: - FatTree rank (roots to leaf switches): 2
Apr 12 20:44:16 075372 [5CB78700] 0x02 -> fabric_dump_general_info: - FatTree max switch rank: 1
Apr 12 20:44:16 075374 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 39 CAs, 39 CA ports (39 of them CNs), 5 switches
Apr 12 20:44:16 075376 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 2 switches at rank 0 (roots)
Apr 12 20:44:16 075378 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 3 switches at rank 1 (3 of them leafs)
Apr 12 20:44:16 075511 [5CB78700] 0x02 -> osm_ucast_mgr_process: ftree tables configured on all switches
Apr 12 20:44:16 098151 [5CB78700] 0x80 -> SUBNET UP
Apr 12 20:44:16 277047 [6077E700] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:1 TID:0x000000000000003c
Apr 12 20:44:16 277090 [6077E700] 0x02 -> trap_rcv_process_request: Trap 144 Node description update
Apr 12 20:44:16 277105 [6077E700] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:1 GID:fe80::2:c903:42:e421
Apr 12 20:44:16 298708 [5CB78700] 0x02 -> osm_ucast_cache_process: Configuring switch tables using cached routing
Apr 12 20:44:16 299811 [5CB78700] 0x02 -> SUBNET UP
Apr 12 20:44:18 072914 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::ffff:ffff
Apr 12 20:44:18 073968 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:6d01
Apr 12 20:44:18 074165 [5F37C700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e421
Apr 12 20:44:18 074791 [64384700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::1
Apr 12 20:44:18 074855 [67589700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc21
Apr 12 20:44:18 074905 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc31
Apr 12 20:44:18 074940 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:498f
Apr 12 20:44:18 075018 [5DF7A700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:4983
Apr 12 20:44:18 075062 [62F82700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::fb
Apr 12 20:44:18 075126 [5DF7A700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff41:f801
Apr 12 20:44:18 075561 [5FD7D700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:4993
Apr 12 20:44:18 075653 [62581700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1
Apr 12 20:44:18 076163 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:3a11
Apr 12 20:44:18 076192 [5E97B700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff52:499b
Apr 12 20:44:18 076299 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e301
Apr 12 20:44:18 076331 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::202
Apr 12 20:44:18 076354 [6117F700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff41:f841
Apr 12 20:44:18 076419 [65786700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e321
Apr 12 20:44:18 076851 [61B80700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e331
Apr 12 20:44:18 076932 [62F82700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff85:cc61
Apr 12 20:44:18 077120 [62581700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e441
Apr 12 20:44:18 077399 [66B88700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff42:e511
Apr 12 20:44:18 077635 [64D85700] 0x02 -> log_notice: Reporting Generic Notice type:3 num:6
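The fabric summary that OpenSM prints through fabric_dump_general_info can be pulled out of the log with a short script, which makes it easy to check after a sweep that the fat tree still has the expected shape. This is only a minimal Python sketch, not an OpenSM tool: the function name summarize_fabric and the dictionary keys are invented for illustration, and the regular expressions assume the exact message wording seen in the log above.

```python
import re

# Message formats assumed from the OpenSM 3.3.5 log lines quoted above.
TOTALS_RE = re.compile(
    r"Fabric has (\d+) CAs, (\d+) CA ports \((\d+) of them CNs\), (\d+) switches"
)
RANK_RE = re.compile(r"Fabric has (\d+) switches at rank (\d+)")


def summarize_fabric(log_text):
    """Collect the fat-tree totals reported by fabric_dump_general_info."""
    summary = {"switches_by_rank": {}}
    for line in log_text.splitlines():
        if "fabric_dump_general_info:" not in line:
            continue
        totals = TOTALS_RE.search(line)
        if totals:
            cas, ca_ports, cns, switches = map(int, totals.groups())
            summary.update(
                cas=cas, ca_ports=ca_ports, compute_nodes=cns, switches=switches
            )
            continue
        rank = RANK_RE.search(line)
        if rank:
            count, level = map(int, rank.groups())
            summary["switches_by_rank"][level] = count
    return summary


# Three lines copied from the log above.
SAMPLE_LOG = """\
Apr 12 20:44:16 075374 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 39 CAs, 39 CA ports (39 of them CNs), 5 switches
Apr 12 20:44:16 075376 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 2 switches at rank 0 (roots)
Apr 12 20:44:16 075378 [5CB78700] 0x02 -> fabric_dump_general_info: - Fabric has 3 switches at rank 1 (3 of them leafs)
"""

if __name__ == "__main__":
    print(summarize_fabric(SAMPLE_LOG))
```

In practice the text of /var/log/opensm.log would be passed in instead of SAMPLE_LOG.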
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Cluster IB.jpg
Type: image/jpeg
Size: 89451 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/ewg/attachments/20140414/f7b9febe/attachment.jpg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: opensm.conf
Type: application/octet-stream
Size: 8436 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/ewg/attachments/20140414/f7b9febe/attachment.obj>
-------------- next part --------------
[root@nalanda mvapich2-1.9]# ibnetdiscover
#
# Topology file: generated on Sat Apr 12 16:22:11 2014
#
# Initiated from node 0002c9030042e420 port 0002c9030042e421
vendid=0x2c9
devid=0xc738
sysimgguid=0x2c903008fd500
switchguid=0x2c903008fd500(2c903008fd500)
Switch 36 "S-0002c903008fd500" # "SwitchX - Mellanox Technologies" base port 0 lid 4 lmc 0
[1] "S-0002c90300901480"[10] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[2] "S-0002c90300901480"[11] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[3] "S-0002c90300901480"[12] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[4] "S-0002c90300901480"[13] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[5] "S-0002c90300901480"[14] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[6] "S-0002c90300901480"[15] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[7] "S-0002c90300901480"[16] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[8] "S-0002c90300901480"[17] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[9] "S-0002c90300901480"[18] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[10] "S-0002c90300902c00"[10] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[11] "S-0002c90300902c00"[11] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[12] "S-0002c90300902c00"[12] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[13] "S-0002c90300902c00"[13] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[14] "S-0002c90300902c00"[14] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[15] "S-0002c90300902c00"[15] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[16] "S-0002c90300902c00"[16] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[17] "S-0002c90300902c00"[17] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[18] "S-0002c90300902c00"[18] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[19] "H-0002c9030042e440"[1](2c9030042e441) # "compute-0-11 HCA-1" lid 19 4xQDR
[20] "H-0002c9030042e300"[1](2c9030042e301) # "compute-0-10 HCA-1" lid 6 4xQDR
[21] "H-0002c9030042e390"[1](2c9030042e391) # "compute-0-7 HCA-1" lid 32 4xQDR
[22] "H-0002c9030042e450"[1](2c9030042e451) # "compute-0-6 HCA-1" lid 22 4xQDR
[23] "H-0002c9030042e380"[1](2c9030042e381) # "compute-0-13 HCA-1" lid 28 4xQDR
[24] "H-0002c9030042e360"[1](2c9030042e361) # "compute-0-14 HCA-1" lid 24 4xQDR
[25] "H-0002c9030042e320"[1](2c9030042e321) # "compute-0-9 HCA-1" lid 14 4xQDR
[26] "H-0002c9030042e3e0"[1](2c9030042e3e1) # "compute-0-8 HCA-1" lid 38 4xQDR
[27] "H-0002c9030042e330"[1](2c9030042e331) # "compute-0-2 HCA-1" lid 17 4xQDR
[28] "H-24be05ffff856df0"[1](24be05ffff856df1) # "compute-0-29 HCA-1" lid 39 4xQDR
[29] "H-0002c90300423a10"[1](2c90300423a11) # "compute-0-32 HCA-1" lid 9 4xQDR
[30] "H-24be05ffff85bc90"[1](24be05ffff85bc91) # "compute-0-23 HCA-1" lid 31 4xQDR
[31] "H-24be05ffff85bce0"[1](24be05ffff85bce1) # "compute-0-24 HCA-1" lid 36 4xQDR
vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300903600
switchguid=0x2c90300903600(2c90300903600)
Switch 36 "S-0002c90300903600" # "SwitchX - Mellanox Technologies" base port 0 lid 3 lmc 0
[1] "S-0002c90300902c00"[1] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[2] "S-0002c90300902c00"[2] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[3] "S-0002c90300902c00"[3] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[4] "S-0002c90300902c00"[4] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[5] "S-0002c90300902c00"[5] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[6] "S-0002c90300902c00"[6] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[7] "S-0002c90300902c00"[7] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[8] "S-0002c90300902c00"[8] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[9] "S-0002c90300902c00"[9] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[10] "S-0002c90300901480"[19] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[11] "S-0002c90300901480"[20] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[12] "S-0002c90300901480"[21] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[13] "S-0002c90300901480"[22] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[14] "S-0002c90300901480"[23] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[15] "S-0002c90300901480"[24] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[16] "S-0002c90300901480"[25] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[17] "S-0002c90300901480"[26] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[18] "S-0002c90300901480"[27] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[19] "H-0002c90300524982"[1](2c90300524983) # "oss2 mlx4_0" lid 29 4xQDR
[20] "H-0002c90300524992"[1](2c90300524993) # "mds2 mlx4_0" lid 33 4xQDR
[21] "H-0002c9030052499a"[1](2c9030052499b) # "mds1 mlx4_0" lid 34 4xQDR
[22] "H-0002c9030052498e"[1](2c9030052498f) # "oss1 mlx4_0" lid 30 4xQDR
[23] "H-24be05ffff85cc20"[1](24be05ffff85cc21) # "compute-0-28 HCA-1" lid 13 4xQDR
[24] "H-24be05ffff85bcf0"[1](24be05ffff85bcf1) # "compute-0-26 HCA-1" lid 40 4xQDR
[25] "H-24be05ffff85cc30"[1](24be05ffff85cc31) # "compute-0-30 HCA-1" lid 16 4xQDR
[26] "H-0002c9030041f800"[1](2c9030041f801) # "compute-0-31 HCA-1" lid 8 4xQDR
[27] "H-24be05ffff856d00"[1](24be05ffff856d01) # "compute-0-27 HCA-1" lid 5 4xQDR
[28] "H-24be05ffff85cc60"[1](24be05ffff85cc61) # "compute-0-25 HCA-1" lid 23 4xQDR
[29] "H-0002c9030041f7f0"[1](2c9030041f7f1) # "compute-0-34 HCA-1" lid 44 4xQDR
[30] "H-0002c9030041f840"[1](2c9030041f841) # "compute-0-33 HCA-1" lid 20 4xQDR
vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300902c00
switchguid=0x2c90300902c00(2c90300902c00)
Switch 36 "S-0002c90300902c00" # "SwitchX - Mellanox Technologies" base port 0 lid 2 lmc 0
[1] "S-0002c90300903600"[1] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[2] "S-0002c90300903600"[2] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[3] "S-0002c90300903600"[3] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[4] "S-0002c90300903600"[4] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[5] "S-0002c90300903600"[5] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[6] "S-0002c90300903600"[6] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[7] "S-0002c90300903600"[7] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[8] "S-0002c90300903600"[8] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[9] "S-0002c90300903600"[9] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[10] "S-0002c903008fd500"[10] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[11] "S-0002c903008fd500"[11] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[12] "S-0002c903008fd500"[12] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[13] "S-0002c903008fd500"[13] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[14] "S-0002c903008fd500"[14] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[15] "S-0002c903008fd500"[15] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[16] "S-0002c903008fd500"[16] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[17] "S-0002c903008fd500"[17] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[18] "S-0002c903008fd500"[18] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[19] "S-0002c9030075c870"[10] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[20] "S-0002c9030075c870"[11] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[21] "S-0002c9030075c870"[12] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[23] "S-0002c9030075c870"[14] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[24] "S-0002c9030075c870"[15] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[25] "S-0002c9030075c870"[16] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[26] "S-0002c9030075c870"[17] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[27] "S-0002c9030075c870"[18] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0xc738
sysimgguid=0x2c90300901480
switchguid=0x2c90300901480(2c90300901480)
Switch 36 "S-0002c90300901480" # "SwitchX - Mellanox Technologies" base port 0 lid 27 lmc 0
[1] "S-0002c9030075c870"[1] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[2] "S-0002c9030075c870"[2] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[3] "S-0002c9030075c870"[3] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[4] "S-0002c9030075c870"[4] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[5] "S-0002c9030075c870"[5] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[6] "S-0002c9030075c870"[6] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[7] "S-0002c9030075c870"[7] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[8] "S-0002c9030075c870"[8] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[9] "S-0002c9030075c870"[9] # "SwitchX - Mellanox Technologies" lid 25 4xQDR
[10] "S-0002c903008fd500"[1] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[11] "S-0002c903008fd500"[2] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[12] "S-0002c903008fd500"[3] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[13] "S-0002c903008fd500"[4] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[14] "S-0002c903008fd500"[5] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[15] "S-0002c903008fd500"[6] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[16] "S-0002c903008fd500"[7] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[17] "S-0002c903008fd500"[8] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[18] "S-0002c903008fd500"[9] # "SwitchX - Mellanox Technologies" lid 4 4xQDR
[19] "S-0002c90300903600"[10] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[20] "S-0002c90300903600"[11] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[21] "S-0002c90300903600"[12] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[22] "S-0002c90300903600"[13] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[23] "S-0002c90300903600"[14] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[24] "S-0002c90300903600"[15] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[25] "S-0002c90300903600"[16] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[26] "S-0002c90300903600"[17] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
[27] "S-0002c90300903600"[18] # "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0xc738
sysimgguid=0x2c9030075c870
switchguid=0x2c9030075c870(2c9030075c870)
Switch 36 "S-0002c9030075c870" # "SwitchX - Mellanox Technologies" base port 0 lid 25 lmc 0
[1] "S-0002c90300901480"[1] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[2] "S-0002c90300901480"[2] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[3] "S-0002c90300901480"[3] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[4] "S-0002c90300901480"[4] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[5] "S-0002c90300901480"[5] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[6] "S-0002c90300901480"[6] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[7] "S-0002c90300901480"[7] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[8] "S-0002c90300901480"[8] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[9] "S-0002c90300901480"[9] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[10] "S-0002c90300902c00"[19] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[11] "S-0002c90300902c00"[20] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[12] "S-0002c90300902c00"[21] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[14] "S-0002c90300902c00"[23] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[15] "S-0002c90300902c00"[24] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[16] "S-0002c90300902c00"[25] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[17] "S-0002c90300902c00"[26] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[18] "S-0002c90300902c00"[27] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[19] "H-0002c9030042e250"[1](2c9030042e251) # "compute-0-20 HCA-1" lid 21 4xQDR
[20] "H-0002c9030042e310"[1](2c9030042e311) # "compute-0-17 HCA-1" lid 10 4xQDR
[21] "H-0002c9030042e4f0"[1](2c9030042e4f1) # "compute-0-3 HCA-1" lid 43 4xQDR
[22] "H-0002c9030042e2e0"[1](2c9030042e2e1) # "compute-0-19 HCA-1" lid 37 4xQDR
[23] "H-0002c9030042e520"[1](2c9030042e521) # "compute-0-18 HCA-1" lid 15 4xQDR
[24] "H-0002c9030042e3f0"[1](2c9030042e3f1) # "compute-0-22 HCA-1" lid 42 4xQDR
[25] "H-0002c9030042e400"[1](2c9030042e401) # "compute-0-1 HCA-1" lid 7 4xQDR
[26] "H-0002c9030042e430"[1](2c9030042e431) # "compute-0-15 HCA-1" lid 18 4xQDR
[27] "H-0002c9030042e420"[1](2c9030042e421) # "nalanda HCA-1" lid 1 4xQDR
[28] "H-0002c9030042e510"[1](2c9030042e511) # "compute-0-21 HCA-1" lid 12 4xQDR
[29] "H-0002c9030042e470"[1](2c9030042e471) # "compute-0-16 HCA-1" lid 26 4xQDR
[30] "H-0002c9030042e2f0"[1](2c9030042e2f1) # "compute-0-12 HCA-1" lid 41 4xQDR
[31] "H-0002c9030042e410"[1](2c9030042e411) # "compute-0-5 HCA-1" lid 11 4xQDR
[32] "H-0002c9030042e2d0"[1](2c9030042e2d1) # "compute-0-4 HCA-1" lid 35 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f843
caguid=0x2c9030041f840
Ca 2 "H-0002c9030041f840" # "compute-0-33 HCA-1"
[1](2c9030041f841) "S-0002c90300903600"[30] # lid 20 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f7f3
caguid=0x2c9030041f7f0
Ca 2 "H-0002c9030041f7f0" # "compute-0-34 HCA-1"
[1](2c9030041f7f1) "S-0002c90300903600"[29] # lid 44 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc63
caguid=0x24be05ffff85cc60
Ca 2 "H-24be05ffff85cc60" # "compute-0-25 HCA-1"
[1](24be05ffff85cc61) "S-0002c90300903600"[28] # lid 23 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff856d03
caguid=0x24be05ffff856d00
Ca 2 "H-24be05ffff856d00" # "compute-0-27 HCA-1"
[1](24be05ffff856d01) "S-0002c90300903600"[27] # lid 5 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030041f803
caguid=0x2c9030041f800
Ca 2 "H-0002c9030041f800" # "compute-0-31 HCA-1"
[1](2c9030041f801) "S-0002c90300903600"[26] # lid 8 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc33
caguid=0x24be05ffff85cc30
Ca 2 "H-24be05ffff85cc30" # "compute-0-30 HCA-1"
[1](24be05ffff85cc31) "S-0002c90300903600"[25] # lid 16 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bcf3
caguid=0x24be05ffff85bcf0
Ca 2 "H-24be05ffff85bcf0" # "compute-0-26 HCA-1"
[1](24be05ffff85bcf1) "S-0002c90300903600"[24] # lid 40 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85cc23
caguid=0x24be05ffff85cc20
Ca 2 "H-24be05ffff85cc20" # "compute-0-28 HCA-1"
[1](24be05ffff85cc21) "S-0002c90300903600"[23] # lid 13 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524991
caguid=0x2c9030052498e
Ca 2 "H-0002c9030052498e" # "oss1 mlx4_0"
[1](2c9030052498f) "S-0002c90300903600"[22] # lid 30 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x2c9030052499d
caguid=0x2c9030052499a
Ca 2 "H-0002c9030052499a" # "mds1 mlx4_0"
[1](2c9030052499b) "S-0002c90300903600"[21] # lid 34 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524995
caguid=0x2c90300524992
Ca 2 "H-0002c90300524992" # "mds2 mlx4_0"
[1](2c90300524993) "S-0002c90300903600"[20] # lid 33 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x2c90300524985
caguid=0x2c90300524982
Ca 2 "H-0002c90300524982" # "oss2 mlx4_0"
[1](2c90300524983) "S-0002c90300903600"[19] # lid 29 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bce3
caguid=0x24be05ffff85bce0
Ca 2 "H-24be05ffff85bce0" # "compute-0-24 HCA-1"
[1](24be05ffff85bce1) "S-0002c903008fd500"[31] # lid 36 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff85bc93
caguid=0x24be05ffff85bc90
Ca 2 "H-24be05ffff85bc90" # "compute-0-23 HCA-1"
[1](24be05ffff85bc91) "S-0002c903008fd500"[30] # lid 31 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c90300423a13
caguid=0x2c90300423a10
Ca 2 "H-0002c90300423a10" # "compute-0-32 HCA-1"
[1](2c90300423a11) "S-0002c903008fd500"[29] # lid 9 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x24be05ffff856df3
caguid=0x24be05ffff856df0
Ca 2 "H-24be05ffff856df0" # "compute-0-29 HCA-1"
[1](24be05ffff856df1) "S-0002c903008fd500"[28] # lid 39 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e333
caguid=0x2c9030042e330
Ca 2 "H-0002c9030042e330" # "compute-0-2 HCA-1"
[1](2c9030042e331) "S-0002c903008fd500"[27] # lid 17 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e3e3
caguid=0x2c9030042e3e0
Ca 2 "H-0002c9030042e3e0" # "compute-0-8 HCA-1"
[1](2c9030042e3e1) "S-0002c903008fd500"[26] # lid 38 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e323
caguid=0x2c9030042e320
Ca 2 "H-0002c9030042e320" # "compute-0-9 HCA-1"
[1](2c9030042e321) "S-0002c903008fd500"[25] # lid 14 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e363
caguid=0x2c9030042e360
Ca 2 "H-0002c9030042e360" # "compute-0-14 HCA-1"
[1](2c9030042e361) "S-0002c903008fd500"[24] # lid 24 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e383
caguid=0x2c9030042e380
Ca 2 "H-0002c9030042e380" # "compute-0-13 HCA-1"
[1](2c9030042e381) "S-0002c903008fd500"[23] # lid 28 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e453
caguid=0x2c9030042e450
Ca 2 "H-0002c9030042e450" # "compute-0-6 HCA-1"
[1](2c9030042e451) "S-0002c903008fd500"[22] # lid 22 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e393
caguid=0x2c9030042e390
Ca 2 "H-0002c9030042e390" # "compute-0-7 HCA-1"
[1](2c9030042e391) "S-0002c903008fd500"[21] # lid 32 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e303
caguid=0x2c9030042e300
Ca 2 "H-0002c9030042e300" # "compute-0-10 HCA-1"
[1](2c9030042e301) "S-0002c903008fd500"[20] # lid 6 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e443
caguid=0x2c9030042e440
Ca 2 "H-0002c9030042e440" # "compute-0-11 HCA-1"
[1](2c9030042e441) "S-0002c903008fd500"[19] # lid 19 lmc 0 "SwitchX - Mellanox Technologies" lid 4 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2d3
caguid=0x2c9030042e2d0
Ca 2 "H-0002c9030042e2d0" # "compute-0-4 HCA-1"
[1](2c9030042e2d1) "S-0002c9030075c870"[32] # lid 35 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e413
caguid=0x2c9030042e410
Ca 2 "H-0002c9030042e410" # "compute-0-5 HCA-1"
[1](2c9030042e411) "S-0002c9030075c870"[31] # lid 11 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2f3
caguid=0x2c9030042e2f0
Ca 2 "H-0002c9030042e2f0" # "compute-0-12 HCA-1"
[1](2c9030042e2f1) "S-0002c9030075c870"[30] # lid 41 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e473
caguid=0x2c9030042e470
Ca 2 "H-0002c9030042e470" # "compute-0-16 HCA-1"
[1](2c9030042e471) "S-0002c9030075c870"[29] # lid 26 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e513
caguid=0x2c9030042e510
Ca 2 "H-0002c9030042e510" # "compute-0-21 HCA-1"
[1](2c9030042e511) "S-0002c9030075c870"[28] # lid 12 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e433
caguid=0x2c9030042e430
Ca 2 "H-0002c9030042e430" # "compute-0-15 HCA-1"
[1](2c9030042e431) "S-0002c9030075c870"[26] # lid 18 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e403
caguid=0x2c9030042e400
Ca 2 "H-0002c9030042e400" # "compute-0-1 HCA-1"
[1](2c9030042e401) "S-0002c9030075c870"[25] # lid 7 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e3f3
caguid=0x2c9030042e3f0
Ca 2 "H-0002c9030042e3f0" # "compute-0-22 HCA-1"
[1](2c9030042e3f1) "S-0002c9030075c870"[24] # lid 42 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e523
caguid=0x2c9030042e520
Ca 2 "H-0002c9030042e520" # "compute-0-18 HCA-1"
[1](2c9030042e521) "S-0002c9030075c870"[23] # lid 15 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e2e3
caguid=0x2c9030042e2e0
Ca 2 "H-0002c9030042e2e0" # "compute-0-19 HCA-1"
[1](2c9030042e2e1) "S-0002c9030075c870"[22] # lid 37 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e4f3
caguid=0x2c9030042e4f0
Ca 2 "H-0002c9030042e4f0" # "compute-0-3 HCA-1"
[1](2c9030042e4f1) "S-0002c9030075c870"[21] # lid 43 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e313
caguid=0x2c9030042e310
Ca 2 "H-0002c9030042e310" # "compute-0-17 HCA-1"
[1](2c9030042e311) "S-0002c9030075c870"[20] # lid 10 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e253
caguid=0x2c9030042e250
Ca 2 "H-0002c9030042e250" # "compute-0-20 HCA-1"
[1](2c9030042e251) "S-0002c9030075c870"[19] # lid 21 lmc 0 "SwitchX - Mellanox Technologies" lid 25 4xQDR
vendid=0x2c9
devid=0x1003
sysimgguid=0x2c9030042e423
caguid=0x2c9030042e420
Ca 2 "H-0002c9030042e420" # "nalanda HCA-1"
[1](2c9030042e421) "S-0002c9030075c870"[27]
[root@nalanda mvapich2-1.9]# ibhosts
Ca : 0x0002c9030041f840 ports 2 "compute-0-33 HCA-1"
Ca : 0x0002c9030041f7f0 ports 2 "compute-0-34 HCA-1"
Ca : 0x24be05ffff85cc60 ports 2 "compute-0-25 HCA-1"
Ca : 0x24be05ffff856d00 ports 2 "compute-0-27 HCA-1"
Ca : 0x0002c9030041f800 ports 2 "compute-0-31 HCA-1"
Ca : 0x24be05ffff85cc30 ports 2 "compute-0-30 HCA-1"
Ca : 0x24be05ffff85bcf0 ports 2 "compute-0-26 HCA-1"
Ca : 0x24be05ffff85cc20 ports 2 "compute-0-28 HCA-1"
Ca : 0x0002c9030052498e ports 2 "oss1 mlx4_0"
Ca : 0x0002c9030052499a ports 2 "mds1 mlx4_0"
Ca : 0x0002c90300524992 ports 2 "mds2 mlx4_0"
Ca : 0x0002c90300524982 ports 2 "oss2 mlx4_0"
Ca : 0x24be05ffff85bce0 ports 2 "compute-0-24 HCA-1"
Ca : 0x24be05ffff85bc90 ports 2 "compute-0-23 HCA-1"
Ca : 0x0002c90300423a10 ports 2 "compute-0-32 HCA-1"
Ca : 0x24be05ffff856df0 ports 2 "compute-0-29 HCA-1"
Ca : 0x0002c9030042e330 ports 2 "compute-0-2 HCA-1"
Ca : 0x0002c9030042e3e0 ports 2 "compute-0-8 HCA-1"
Ca : 0x0002c9030042e320 ports 2 "compute-0-9 HCA-1"
Ca : 0x0002c9030042e360 ports 2 "compute-0-14 HCA-1"
Ca : 0x0002c9030042e380 ports 2 "compute-0-13 HCA-1"
Ca : 0x0002c9030042e450 ports 2 "compute-0-6 HCA-1"
Ca : 0x0002c9030042e390 ports 2 "compute-0-7 HCA-1"
Ca : 0x0002c9030042e300 ports 2 "compute-0-10 HCA-1"
Ca : 0x0002c9030042e440 ports 2 "compute-0-11 HCA-1"
Ca : 0x0002c9030042e2d0 ports 2 "compute-0-4 HCA-1"
Ca : 0x0002c9030042e410 ports 2 "compute-0-5 HCA-1"
Ca : 0x0002c9030042e2f0 ports 2 "compute-0-12 HCA-1"
Ca : 0x0002c9030042e470 ports 2 "compute-0-16 HCA-1"
Ca : 0x0002c9030042e510 ports 2 "compute-0-21 HCA-1"
Ca : 0x0002c9030042e430 ports 2 "compute-0-15 HCA-1"
Ca : 0x0002c9030042e400 ports 2 "compute-0-1 HCA-1"
Ca : 0x0002c9030042e3f0 ports 2 "compute-0-22 HCA-1"
Ca : 0x0002c9030042e520 ports 2 "compute-0-18 HCA-1"
Ca : 0x0002c9030042e2e0 ports 2 "compute-0-19 HCA-1"
Ca : 0x0002c9030042e4f0 ports 2 "compute-0-3 HCA-1"
Ca : 0x0002c9030042e310 ports 2 "compute-0-17 HCA-1"
Ca : 0x0002c9030042e250 ports 2 "compute-0-20 HCA-1"
Ca : 0x0002c9030042e420 ports 2 "nalanda HCA-1"
[root@nalanda mvapich2-1.9]#
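Since the question is about keeping Lustre traffic apart from MPI traffic, it can help to split the ibhosts listing into the two groups of nodes. The following Python sketch does that by name only. The rule that names starting with "mds" or "oss" are Lustre servers is an assumption taken from the node list in this mail, and group_hosts is a hypothetical helper, not part of infiniband-diags.

```python
import re
from collections import defaultdict

# Line format assumed from the ibhosts output above.
HOST_RE = re.compile(r'^Ca\s*:\s*(0x[0-9a-f]+)\s+ports\s+(\d+)\s+"([^"]+)"')

# Assumption: Lustre servers are the nodes named mds* or oss*.
LUSTRE_PREFIXES = ("mds", "oss")


def group_hosts(ibhosts_text):
    """Return {"lustre": [(name, guid), ...], "compute": [...]}."""
    groups = defaultdict(list)
    for raw in ibhosts_text.splitlines():
        match = HOST_RE.match(raw.strip())
        if not match:
            continue  # skip shell prompts and blank lines
        guid, description = match.group(1), match.group(3)
        name = description.split()[0]  # "oss1 mlx4_0" -> "oss1"
        role = "lustre" if name.startswith(LUSTRE_PREFIXES) else "compute"
        groups[role].append((name, guid))
    return dict(groups)


# Four lines copied from the ibhosts output above.
SAMPLE_HOSTS = '''
Ca : 0x0002c9030052498e ports 2 "oss1 mlx4_0"
Ca : 0x0002c9030052499a ports 2 "mds1 mlx4_0"
Ca : 0x0002c9030042e400 ports 2 "compute-0-1 HCA-1"
Ca : 0x0002c9030042e420 ports 2 "nalanda HCA-1"
'''

if __name__ == "__main__":
    for role, members in group_hosts(SAMPLE_HOSTS).items():
        print(role, members)
```

With this rule the head node nalanda falls into the "compute" group, which may or may not be what is wanted.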
[root@nalanda mvapich2-1.9]# ibswitches
Switch : 0x0002c903008fd500 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 4 lmc 0
Switch : 0x0002c90300903600 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 3 lmc 0
Switch : 0x0002c90300902c00 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 2 lmc 0
Switch : 0x0002c90300901480 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 27 lmc 0
Switch : 0x0002c9030075c870 ports 36 "SwitchX - Mellanox Technologies" base port 0 lid 25 lmc 0
[root@nalanda mvapich2-1.9]#
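To judge whether the shared fabric can congest, a useful first step is to count, for every switch in the ibnetdiscover output, how many links go to each neighbouring switch and how many hosts are attached. The Python sketch below does this. It assumes the line layout shown in the listing above, and summarize_links is a hypothetical helper written for this mail, not an existing tool.

Counted by hand from the listing, the three leaf switches (lids 4, 3 and 25) carry 13, 12 and 14 hosts, and each has 9 links to each of the two host-less switches (lids 2 and 27). The one exception is that only 8 links appear between lid 2 and lid 25, because port 22 on lid 2 and port 13 on lid 25 are missing from the output. All four Lustre servers sit on the lid 3 switch, ports 19 to 22.

```python
import re
from collections import Counter, defaultdict

# Line formats assumed from the ibnetdiscover listing above.
SWITCH_RE = re.compile(r'^Switch\s+\d+\s+"(S-[0-9a-f]+)"')
PORT_RE = re.compile(r'^\[(\d+)\]\s+"([SH])-([0-9a-f]+)"\[(\d+)\]')


def summarize_links(topology_text):
    """Return (switch_links, host_counts) read from the switch sections only."""
    switch_links = defaultdict(Counter)  # switch -> peer switch -> link count
    host_counts = Counter()              # switch -> number of attached CAs
    current = None
    for raw in topology_text.splitlines():
        line = raw.strip()
        header = SWITCH_RE.match(line)
        if header:
            current = header.group(1)
            continue
        if line.startswith("Ca "):
            # CA sections repeat links already seen from the switch side.
            current = None
            continue
        port = PORT_RE.match(line)
        if port and current:
            kind, guid = port.group(2), port.group(3)
            if kind == "S":
                switch_links[current]["S-" + guid] += 1
            else:
                host_counts[current] += 1
    return switch_links, host_counts


# Short excerpt copied from the listing above.
SAMPLE_TOPOLOGY = '''
Switch 36 "S-0002c90300903600" # "SwitchX - Mellanox Technologies" base port 0 lid 3 lmc 0
[1] "S-0002c90300902c00"[1] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[2] "S-0002c90300902c00"[2] # "SwitchX - Mellanox Technologies" lid 2 4xQDR
[10] "S-0002c90300901480"[19] # "SwitchX - Mellanox Technologies" lid 27 4xQDR
[19] "H-0002c90300524982"[1](2c90300524983) # "oss2 mlx4_0" lid 29 4xQDR
[22] "H-0002c9030052498e"[1](2c9030052498f) # "oss1 mlx4_0" lid 30 4xQDR
Ca 2 "H-0002c90300524982" # "oss2 mlx4_0"
[1](2c90300524983) "S-0002c90300903600"[19] # lid 29 lmc 0 "SwitchX - Mellanox Technologies" lid 3 4xQDR
'''

if __name__ == "__main__":
    links, hosts = summarize_links(SAMPLE_TOPOLOGY)
    for switch, peers in links.items():
        print(switch, dict(peers), "hosts:", hosts[switch])
```

Comparing the number of uplinks with the number of hosts on each leaf gives a rough picture of oversubscription. It says nothing about how ftree actually distributes the routes, so it is only a starting point.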