Re: [ofa-general] OpenSM Problems/Questions
Hal Rosenstock
hal.rosenstock at gmail.com
Tue Sep 9 11:35:48 PDT 2008
Hi,
On Tue, Sep 9, 2008 at 9:01 AM, Matthew Trzyna <trzyna at us.ibm.com> wrote:
> Hello
>
>
> A "Basic Fabric Diagram" at the end.
>
>
> I am working with a customer implementing a large IB fabric who is
> encountering problems with OpenSM (OFED 1.3) that appeared when they added a
> new 264-node cluster (with its own 288-port IB switch) to their existing
> cluster. Two more 264-node clusters are planned to be added in the near
> future. They recently moved to SLES 10 SP1 and OFED 1.3 (before adding the
> new cluster) and had not been experiencing these problems before.
>
> Could you help provide answers to the questions listed below? Additional
> information about the configuration, including a basic fabric diagram, is
> provided after the questions.
>
> What parameters should be set on the non-SM nodes that affect how the Subnet
> Administrator functions?
> What parameters should be set on the SM node(s) that affect how the Subnet
> Administrator functions? And, what parameters should be removed from the SM
> node(s)? (e.g., ib_sa paths_per_dest=0x7f)
> How should SM failover be set up? How many failover SMs should be
> configured? (Failover must happen quickly and transparently, or GPFS will
> die everywhere due to timeouts if it takes too long.)
What is quickly enough?
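For reference, a minimal sketch of a typical two-SM failover setup, assuming
OpenSM from OFED 1.3 and that the two Admin nodes shown in the diagram below
are the intended SM hosts (hostnames and the exact priority values here are
illustrative only):

    # On the primary SM node: highest priority wins mastership (range 0-15)
    opensm --daemon --priority 15

    # On the standby SM node: lower priority; the standby polls the master's
    # SMInfo and takes over mastership if the master stops responding
    opensm --daemon --priority 14

    # From any node, check which SM is currently master and its priority
    sminfo

With the OFED init scripts the same effect comes from setting the priority in
the OpenSM configuration instead of on the command line, which is what the
"priority" line added at the bottom of opensm.conf (mentioned below) is doing.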
> Are there SA (Subnet Administrator) commands that should not be executed on
> a large "live" fabric? (ie. "saquery -p")
> Should GPFS be configured "off" on the SM node(s)?
> Do you know of any other OpenSM implementations that have 5 (or more)
> 288-port IB switches that might have already encountered/resolved some of these
> issues?
There are some deployments with multiple large switches.
Not sure what you mean by issues; I see questions above.
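Regarding the "saquery -p" question above: a query that dumps records for the
whole fabric asks the SA for a very large result set at this node count, so
narrower queries are usually kinder to a live fabric. A hedged sketch (the
LIDs are placeholders, and whether --src-to-dst is available depends on the
infiniband-diags build shipped with your OFED 1.3 install):

    # PathRecord for one src/dst pair only, rather than the whole fabric
    saquery --src-to-dst 23:118

    # NodeRecord for a single LID
    saquery -N 118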
> The following problem that is being encountered may also be SA/SM related. A
> node (NodeX) may be seen (through IPoIB) by all but a few nodes (NodesA-G).
> A ping from those nodes (NodesA-G) to NodeX returns "Destination Host
> Unreachable". A ping from NodeX to NodesA-G works.
Sounds like those nodes were perhaps unable to join the broadcast
group, possibly due to a rate issue.
-- Hal
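One way to check that theory, assuming infiniband-diags from OFED 1.3 and that
OpenSM logs to the default /var/log/opensm.log (the port GUID below is a
placeholder for one of the affected nodes):

    # Dump multicast group and member records from the SA; the IPv4 IPoIB
    # broadcast group MGID starts with ff12:401b. Compare the group's
    # rate/MTU against what the ports on NodesA-G actually support.
    saquery -g
    saquery -m

    # On the SM node, look for failed join attempts from the affected ports
    grep -i <port_guid_of_NodeA> /var/log/opensm.log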
> --------------------------------------------------------------------------------------------------
>
> System Information
>
> Here is the current opensm.conf file: (See attached file: opensm.conf)
>
> It is the default configuration from the OFED 1.3 build with "priority"
> added at the bottom. Note that /etc/init.d/opensmd sources
> /etc/sysconfig/opensm, not /etc/sysconfig/opensm.conf (opensm.conf was just
> copied to opensm). There are a couple of "proposed" settings, found on the
> web, that are commented out.
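As a sanity check, one way to confirm which file the init script actually
reads (paths as given above, nothing assumed beyond them):

    # Show every reference to sysconfig in the OpenSM init script
    grep -n sysconfig /etc/init.d/opensmd

    # Confirm the copied file exists and whether it still matches the original
    diff /etc/sysconfig/opensm /etc/sysconfig/opensm.conf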
>
> Following are the present settings that may affect the Fabric:
>
> /etc/infiniband/openib.conf
> SET_IPOIB_CM=no
>
> /etc/modprobe.conf.local
> options ib_ipoib send_queue_size=512 recv_queue_size=512
> options ib_sa paths_per_dest=0x7f
>
> /etc/sysctl.conf
> net.ipv4.neigh.ib0.base_reachable_time = 1200
> net.ipv4.neigh.default.gc_thresh3 = 3072
> net.ipv4.neigh.default.gc_thresh2 = 2500
> net.ipv4.neigh.default.gc_thresh1 = 2048
>
> /etc/sysconfig/opensm
> All defaults as supplied with OFED 1.3 OpenSM
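For completeness, the values the running kernel actually picked up can be read
back directly on each node; a minimal sketch (the /sys/module paths exist only
while the modules are loaded and only if the parameters are exposed read-only):

    # Module parameters as currently loaded
    cat /sys/module/ib_ipoib/parameters/send_queue_size
    cat /sys/module/ib_ipoib/parameters/recv_queue_size
    cat /sys/module/ib_sa/parameters/paths_per_dest

    # Effective sysctl values
    sysctl net.ipv4.neigh.ib0.base_reachable_time \
           net.ipv4.neigh.default.gc_thresh1 \
           net.ipv4.neigh.default.gc_thresh2 \
           net.ipv4.neigh.default.gc_thresh3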
>
>
> -------------------------------------------------------
>
>
> Basic Fabric Diagram
>
> +----------+
> |Top Level |-------------------+ 20 IO nodes
> +-----------------| 288 port |----------------+ 16 Virtual nodes
> | | IB Sw |------------+ | 2 Admin nodes
> | +------| |---+ | | (SM nodes)
> | | +----------+ | | | 4 Support nodes
> | | | | | |
> | | | | | |
> 24 24 24 24 24 24 <--uplinks
> | | | | | |
> | | | | | +------+
> | | | | | |
> |(BASE) |(SCU1) |(SCU2) |(SCU3) |(SCU4) |(SCU5)
> +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
> |288-port| |288-port| |288-port| |288-port| |288-port| |288-port|
> | IB Sw | | IB Sw | | IB Sw | | IB Sw | | IB Sw | | IB Sw |
> +--------+ +--------+ +--------+ +--------+ +--------+ +--------+
> 140-nodes 264-nodes 264-nodes 264-nodes 264-nodes 264-nodes
> WhiteBox Dell Dell IBM IBM IBM (future)
>
> NOTE: SCU4 is not currently connected to the Top Level Switch.
> We'd like to address these issues before making that connection.
>
> Subnet Managers are configured on nodes connected to the
> Top Level Switch.
>
> Let me know if you need any more information.
>
> Any help you could provide would be most appreciated.
>
> Thanks.
>
> Matt Trzyna
> IBM Linux Cluster Enablement
> 3039 Cornwallis Rd.
> RTP, NC 27709
> e-mail: trzyna at us.ibm.com
> Office: (919) 254-9917 Tie Line: 444
>