[ewg] Opensm for dual GUID

Hal Rosenstock hal at dev.mellanox.co.il
Tue Feb 24 05:37:28 PST 2015


On 2/24/2015 8:34 AM, Atul Yadav wrote:
> Hi Rosenstock,
> 
> Thanks for the quick response.
> 
> We are changing the configuration as per your input.
> 
> But we are using a single IP address for InfiniBand, bond0: 192.168.1.1
> (IB0 + IB2), active-passive bonding.
> 
> And both switches are independent, as per the attached diagram.

IP bonding configuration is independent of OpenSM configuration.

> Is this configuration valid for the HCA and switch level failover
> requirement?

Yes, it's valid, but this configuration is not the most fault tolerant:
both OpenSM instances run on the same host, so losing that host loses
both. It is better to run the OpenSM instances on separate servers if
possible.
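
If both instances do have to share this host, the launch steps quoted
below could be wrapped in a small helper along these lines. This is only
a sketch: start_sm, PREFIX and DRY_RUN are names made up for illustration
(they are not OpenSM options), and the directory layout is the one from
the quoted steps. With DRY_RUN left at its default of 1 it only creates
the directories and prints the command it would run.

```shell
#!/bin/sh
# Sketch: start one OpenSM instance per HCA port, following the quoted steps.
# PREFIX and DRY_RUN are hypothetical knobs for this example only.

start_sm() {
    name="$1"    # instance name, e.g. qib0
    conf="$2"    # config file for that instance, e.g. opensm-qib0.conf

    # Each instance gets its own cache and log/tmp directory.
    cache_dir="${PREFIX:-}/var/cache/opensm-$name"
    log_dir="${PREFIX:-}/var/log/opensm-$name"
    mkdir -p "$cache_dir" "$log_dir" || return 1

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Only show what would be started.
        echo "OSM_CACHE_DIR=$cache_dir OSM_TMP_DIR=$log_dir opensm -F $conf"
    else
        # Environment applies to this opensm process only.
        OSM_CACHE_DIR="$cache_dir" OSM_TMP_DIR="$log_dir" opensm -F "$conf" &
    fi
}
```

To actually start the two instances, set DRY_RUN=0, leave PREFIX empty
(so the paths match the log_file entries in the config files), and call
start_sm qib0 opensm-qib0.conf followed by start_sm qib1 opensm-qib1.conf.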

-- Hal

> Thank You
> Atul Yadav
> 
> 
> 
> 
> 
> 
> On Tue, Feb 24, 2015 at 6:55 PM, Hal Rosenstock
> <hal at dev.mellanox.co.il> wrote:
> 
>     On 2/24/2015 7:33 AM, Atul Yadav wrote:
>     > Hi Rosenstock,
>     >
>     > Thanks for responding.
>     >
>     > As per our requirement, we want to achieve IB bonding for HCA and
>     > switch level fail-over.
>     >
>     > bond0 (Active-Passive)
>     > Please provide the OpenSM configuration parameters.
> 
>     You need to do something along the following lines:
> 
>     First, create 2 config files (say opensm-qib0.conf and opensm-qib1.conf)
>     with these variables changed:
> 
>     opensm-qib0.conf:
>     # The port GUID on which the OpenSM is running
>     guid 0x00117500006f5f4c
> 
>     # Log file to be used
>     log_file /var/log/opensm-qib0/opensm.log
> 
>     opensm-qib1.conf:
>     # The port GUID on which the OpenSM is running
>     guid 0x00117500006f5f4a
> 
>     # Log file to be used
>     log_file /var/log/opensm-qib1/opensm.log
> 
> 
>     then make sure that the following directories exist:
>     /var/cache/opensm-qib0
>     /var/log/opensm-qib0
>     /var/cache/opensm-qib1
>     /var/log/opensm-qib1
> 
> 
>     and then:
>     export OSM_CACHE_DIR=/var/cache/opensm-qib0
>     export OSM_TMP_DIR=/var/log/opensm-qib0
>     opensm -F opensm-qib0.conf &
> 
>     export OSM_CACHE_DIR=/var/cache/opensm-qib1
>     export OSM_TMP_DIR=/var/log/opensm-qib1
>     opensm -F opensm-qib1.conf &
> 
> 
>     A similar alternative configuration approach is described in:
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg03557.html
> 
>     -- Hal
> 
>     >
>     > [root at SRDCB0970RTGMS opensm]# ibstat
>     > CA 'qib0'
>     >         CA type: InfiniPath_QLE7342
>     >         Number of ports: 2
>     >         Firmware version:
>     >         Hardware version: 2
>     >         Node GUID: 0x00117500006f5f4c
>     >         System image GUID: 0x00117500006f5f4c
>     >         Port 1:
>     >                 State: Active
>     >                 Physical state: LinkUp
>     >                 Rate: 40
>     >                 Base lid: 1
>     >                 LMC: 0
>     >                 SM lid: 1
>     >                 Capability mask: 0x0761086a
>     >                 Port GUID: 0x00117500006f5f4c
>     >                 Link layer: InfiniBand
>     >         Port 2:
>     >                 State: Down
>     >                 Physical state: Disabled
>     >                 Rate: 10
>     >                 Base lid: 65535
>     >                 LMC: 0
>     >                 SM lid: 65535
>     >                 Capability mask: 0x07610868
>     >                 Port GUID: 0x00117500006f5f4d
>     >                 Link layer: InfiniBand
>     > CA 'qib1'
>     >         CA type: InfiniPath_QLE7342
>     >         Number of ports: 2
>     >         Firmware version:
>     >         Hardware version: 2
>     >         Node GUID: 0x00117500006f5f4a
>     >         System image GUID: 0x00117500006f5f4c
>     >         Port 1:
>     >                 State: Initializing
>     >                 Physical state: LinkUp
>     >                 Rate: 40
>     >                 Base lid: 65535
>     >                 LMC: 0
>     >                 SM lid: 65535
>     >                 Capability mask: 0x07610868
>     >                 Port GUID: 0x00117500006f5f4a
>     >                 Link layer: InfiniBand
>     >         Port 2:
>     >                 State: Down
>     >                 Physical state: Disabled
>     >                 Rate: 10
>     >                 Base lid: 65535
>     >                 LMC: 0
>     >                 SM lid: 65535
>     >                 Capability mask: 0x07610868
>     >                 Port GUID: 0x00117500006f5f4b
>     >                 Link layer: InfiniBand
>     >
>     > [root at SRDCB0970RTGMS opensm]#
>     >
>     > [root at SRDCB0970RTGMS ~]# ibstat -p
>     > 0x00117500006f5f4c
>     > 0x00117500006f5f4d
>     > 0x00117500006f5f4a
>     > 0x00117500006f5f4b
>     >
>     >
>     > [root at SRDCB0970RTGMS ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
>     > DEVICE=bond0
>     > IPADDR=192.168.1.1
>     > NETMASK=255.255.255.0
>     > BROADCAST=192.168.1.255
>     > ONBOOT=yes
>     > BOOTPROTO=none
>     > USERCTL=no
>     > MTU=65520
>     > BONDING_OPTS=" mode=1 primary=ib0 updelay=0 downdelay=0"
>     > [root at SRDCB0970RTGMS ~]# cat /etc/sysconfig/network-scripts/ifcfg-ib*
>     > DEVICE=ib0
>     > USERCTL=no
>     > ONBOOT=yes
>     > MASTER=bond0
>     > SLAVE=yes
>     > BOOTPROTO=none
>     > TYPE=Infiniband
>     > PRIMARY=yes
>     > DEVICE=ib1
>     > USERCTL=no
>     > ONBOOT=yes
>     > MASTER=bond0
>     > SLAVE=yes
>     > BOOTPROTO=none
>     > TYPE=Infiniband
>     > DEVICE=ib2
>     > USERCTL=no
>     > ONBOOT=yes
>     > MASTER=bond0
>     > SLAVE=yes
>     > BOOTPROTO=none
>     > TYPE=Infiniband
>     > DEVICE=ib3
>     > USERCTL=no
>     > ONBOOT=yes
>     > MASTER=bond0
>     > SLAVE=yes
>     > BOOTPROTO=none
>     > TYPE=Infiniband
>     > [root at SRDCB0970RTGMS ~]#
>     >
>     >
>     >
>     > Thank You
>     >
>     > Atul Yadav
>     >
>     >
>     >
>     >
>     >
>     > On Tue, Feb 24, 2015 at 5:55 PM, Hal Rosenstock
>     > <hal at dev.mellanox.co.il> wrote:
>     >
>     >     On 2/24/2015 5:19 AM, Atul Yadav wrote:
>     >     > Hi Team,
>     >     >
>     >     > We are trying to set up HCA and switch level failover.
>     >     >
>     >     > Operating system: CentOS 6.5
>     >     >
>     >     > Please guide us.
>     >
>     >     To run OpenSM on multiple ports/HCAs on the same machine for the
>     >     same subnet, multiple instances of OpenSM need to be invoked, one
>     >     per port/HCA, and each needs a separate but similar configuration.
>     >
>     >     -- Hal
>     >
>     >     >
>     >     > Thank You
>     >     >
>     >     > Atul Yadav
>     >     >
>     >     >
>     >     >
>     >     > _______________________________________________
>     >     > ewg mailing list
>     >     > ewg at lists.openfabrics.org
>     >     > http://lists.openfabrics.org/mailman/listinfo/ewg
>     >
>     >
> 
> 



