[Users] PortXmitWait?
Mehdi Denou
mehdi.denou at bull.net
Fri Mar 14 15:12:32 PDT 2014
Hi Florent,
If you have a fat-tree (FT) topology, 10 downed links can have a VERY big impact
on the overall fabric performance (the fat tree is not very fault tolerant).
On 14/03/2014 16:17, Florent Parent wrote:
>
> Hi Jeff,
>
> I'm collecting data to do analysis over time, and indeed there are no
> XmitDiscards. I will add the XmitWait counters to watch for "hot
> spots" over time.
>
> We do have links down in some CXP cables (10 ports total, all spread
> out in different cables). I will check if there is any correlation
> with the observed XmitWait counters.
>
> The PCIe Gen2 bus width is x8 for the QDR chip on the blades. Gen2
> provides 4 Gbps per lane, so x8 would provide 32 Gbps, which matches the
> QDR data rate. Or is my math wrong?
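For what it's worth, the arithmetic here checks out: PCIe Gen2 signals at 5 GT/s per
lane with 8b/10b encoding, leaving 4 Gbps of payload per lane, and 4X QDR InfiniBand
signals at 40 Gbps with the same 8b/10b encoding, leaving 32 Gbps of data. A small
Python sketch of that back-of-the-envelope comparison (standard encoding figures,
not measurements from this fabric):

    # Back-of-the-envelope check: PCIe Gen2 x8 vs. 4X QDR InfiniBand.
    # Both use 8b/10b encoding, so 20% of the raw signalling rate is overhead.
    ENCODING_EFFICIENCY = 8.0 / 10.0

    pcie_gen2_gtps_per_lane = 5.0    # PCIe Gen2: 5 GT/s per lane
    pcie_lanes = 8                   # x8 connection on the blade
    pcie_data_gbps = pcie_gen2_gtps_per_lane * ENCODING_EFFICIENCY * pcie_lanes

    qdr_signal_gbps_per_lane = 10.0  # QDR: 10 Gb/s signalling per lane
    ib_lanes = 4                     # 4X link
    qdr_data_gbps = qdr_signal_gbps_per_lane * ENCODING_EFFICIENCY * ib_lanes

    print(f"PCIe Gen2 x8 payload rate: {pcie_data_gbps:.0f} Gbps")  # 32 Gbps
    print(f"4X QDR IB data rate:       {qdr_data_gbps:.0f} Gbps")   # 32 Gbps

Since the x8 slot only just matches the QDR data rate, any PCIe protocol overhead
eats into that margin, which fits JF's remark below that an x8 connection can become
the limiting factor at peak use.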
>
> Thanks for the Oracle document pointer, I guess I should have known
> about its existence :)
>
> Thanks for the response, and good to hear from you.
> Florent
>
>
> On Fri, Mar 14, 2014 at 6:46 AM, Le Fillatre Jean-Francois
> <jf.lefillatre at univ-lille1.fr> wrote:
>
>
> Hello Florent,
>
> I'll go for the congestion explanation too. It's not a major issue,
> as there seem to be no discarded packets in the few lines of
> data included in the original email. The number of PortXmitWaits
> is orders of magnitude below the number of PortXmitDatas, and the
> numbers don't seem to be directly correlated. So it looks like
> they're not regular events with a clear pattern.
>
> Oracle's Infiniband network troubleshooting guide says:
> PortXmitWait : The number of ticks during which the port selected
> by PortSelect had data to transmit but no data was sent during the
> entire tick either because of insufficient credits or because of
> lack of arbitration.
>
> The important part of that sentence is "insufficient credit". The
> three reasons given by Hal will indeed all cause insufficient
> credit at some point along the link between two nodes. Any of those
> may cause the sending HBA to pause and wait until there is
> available credit all the way, therefore increasing the
> PortXmitWait count.
>
> From what I remember of your site:
> - do you still have some links down on some of your cables?
> - some users do indeed do many-to-one MPI communications
> - I can't remember if the QDR chip has an x8 or x16 PCIe
> connection with the board. If it's x8 then the IB chip will be
> able to saturate the PCIe bus, thus limiting the IB rate at peak
> use times.
>
> Full Oracle document there:
> http://www.oracle.com/technetwork/database/availability/infiniband-network-troubleshooting-1863251.pdf
>
> Thanks,
> JF
>
>
>
> On Thursday, March 13, 2014 12:36 CET, Hal Rosenstock
> <hal.rosenstock at gmail.com> wrote:
>
> > Some causes of congestion are: slow receiver, many to one communication,
> > and "poor" fat tree topology.
> >
> > On the last item, are all links in the subnet same speed and width ? How
> > many links are used going up the fat tree to the next rank ?
> >
> > Are all end nodes connected to rank 2 or are any connected to higher rank ?
> >
> > Are there any "combined" nodes ? By this I mean, some device which is more
> > than just single switch or CA. If so, what are they and where do they live
> > in the topology ?
> >
> >
> > On Wed, Mar 12, 2014 at 11:50 PM, Hal Rosenstock
> > <hal.rosenstock at gmail.com> wrote:
> >
> > > By the fact that you didn't mention PortXmitDiscards, does it mean that
> > > these are 0 ? Assuming so, PortXmitWait is indicating there is some
> > > congestion but it has not risen to the level of dropping packets. It's the
> > > rate of increase of the XmitWait counter that's important rather than the
> > > absolute number, so if you want to chase this, the focus should be on the
> > > ports most congested.
> > >
> > > Since the old tool didn't report XmitWait counters, it's hard to know
> > > whether this is the same as before or not unless you did this manually.
> > >
> > > Was the routing previously fat tree ? Are there any other fat tree related
> > > log messages in the OpenSM log ? Is there any fat tree configuration of
> > > compute and/or I/O nodes ?
> > >
> > > Any idea on what is the traffic pattern ? Are you running MPI ?
> > >
> > > -- Hal
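Hal's suggestion to watch the rate of increase rather than the absolute value can be
scripted: take two snapshots of the counters a known interval apart (without clearing
them in between) and rank ports by how quickly PortXmitWait grows. A minimal sketch
with a hypothetical input format (two dicts keyed by (GUID, port), for example parsed
out of two saved ibqueryerrors captures):

    # Rank ports by PortXmitWait growth rate between two counter snapshots.
    # 'before' and 'after' map (guid, port) -> PortXmitWait, sampled
    # interval_s seconds apart and not cleared in between.
    def xmit_wait_rates(before, after, interval_s):
        rates = {
            key: (after[key] - before[key]) / interval_s
            for key in after
            if key in before and after[key] >= before[key]
        }
        # Highest rate first: these are the most congested ports to chase.
        return sorted(rates.items(), key=lambda item: item[1], reverse=True)

    # Example with made-up numbers over a 10 second interval:
    t0 = {("0x21283a83b30050", 4): 1000, ("0x21283a83b30050", 8): 50}
    t1 = {("0x21283a83b30050", 4): 3000000, ("0x21283a83b30050", 8): 29000}
    for (guid, port), rate in xmit_wait_rates(t0, t1, 10):
        print(f"{guid} port {port}: {rate:.0f} XmitWait ticks/s")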
> > >
> > >
> > > On Wed, Mar 12, 2014 at 8:17 PM, Florent Parent
> > > <florent.parent at calculquebec.ca> wrote:
> > >
> > >>
> > >> Hello IB users,
> > >>
> > >> We recently migrated our opensm from 3.2.6 to 3.3.17. In this upgrade,
> > >> we moved to CentOS 6.5 with the stock RDMA stack and
> > >> infiniband-diags-1.5.12-5, running opensm 3.3.17. Routing is FatTree:
> > >> General fabric topology info
> > >> ============================
> > >> - FatTree rank (roots to leaf switches): 3
> > >> - FatTree max switch rank: 2
> > >> - Fabric has 966 CAs, 966 CA ports (603 of them CNs), 186
> switches
> > >> - Fabric has 36 switches at rank 0 (roots)
> > >> - Fabric has 64 switches at rank 1
> > >> - Fabric has 86 switches at rank 2 (86 of them leafs)
> > >>
> > >> Now to the question: ibqueryerrors 1.5.12 is reporting high PortXmitWait
> > >> values throughout the fabric. We did not see this counter before (it was
> > >> not reported by the older ibqueryerrors.pl).
> > >>
> > >> To give an idea of the scale of the counters, here's a capture of
> > >> ibqueryerrors --data on one specific I4 switch, 10 seconds after clearing
> > >> the counters (-k -K):
> > >>
> > >> GUID 0x21283a83b30050 port 4: PortXmitWait == 2932676 PortXmitData == 90419517 (344.923MB) PortRcvData == 1526963011 (5.688GB)
> > >> GUID 0x21283a83b30050 port 5: PortXmitWait == 3110105 PortXmitData == 509580912 (1.898GB) PortRcvData == 13622 (53.211KB)
> > >> GUID 0x21283a83b30050 port 6: PortXmitWait == 8696397 PortXmitData == 480870802 (1.791GB) PortRcvData == 17067 (66.668KB)
> > >> GUID 0x21283a83b30050 port 7: PortXmitWait == 1129568 PortXmitData == 126483825 (482.497MB) PortRcvData == 24973385 (95.266MB)
> > >> GUID 0x21283a83b30050 port 8: PortXmitWait == 29021 PortXmitData == 19444902 (74.176MB) PortRcvData == 84447725 (322.143MB)
> > >> GUID 0x21283a83b30050 port 9: PortXmitWait == 4945130 PortXmitData == 161911244 (617.642MB) PortRcvData == 27161 (106.098KB)
> > >> GUID 0x21283a83b30050 port 10: PortXmitWait == 16795 PortXmitData == 35572510 (135.698MB) PortRcvData == 681174731 (2.538GB)
> > >> ... (this goes on for every active port)
> > >>
> > >> We are not observing any failures, so I suspect that I need help to
> > >> interpret these numbers. Do I need to be worried?
> > >>
> > >> Cheers,
> > >> Florent
> > >>
> > >>
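One way to put the capture above on a time axis: PortXmitData counts 32-bit words
(which matches the MB/GB conversions shown), so over the 10-second window a port's
average transmit throughput is roughly words x 4 bytes / 10 s. A small illustrative
sketch using the port 5 value from the capture; treating the window as exactly 10
seconds is an approximation:

    # Convert a PortXmitData delta over a known sampling window into an
    # average throughput. PortXmitData counts 32-bit words, so x4 for bytes.
    def avg_xmit_gbps(xmit_data_words, window_s):
        bits = xmit_data_words * 4 * 8
        return bits / window_s / 1e9

    # Port 5 from the capture above: 509580912 words in roughly 10 seconds.
    print(f"{avg_xmit_gbps(509580912, 10):.2f} Gbps")  # ~1.63 Gbps on a 32 Gbps QDR link

So even the ports with millions of XmitWait ticks were averaging only a few percent
of the QDR data rate over that window, which suggests short credit stalls rather than
a link running flat out.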
> _______________________________________________
> Users mailing list
> Users at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/users
--
Mehdi Denou