[Users] PortXmitWait?

Florent Parent florent.parent at calculquebec.ca
Wed Mar 12 17:17:34 PDT 2014


Hello IB users,

We recently upgraded our opensm from 3.2.6 to 3.3.17. As part of this
upgrade, we moved to CentOS 6.5 with the stock RDMA stack and
infiniband-diags_1.5.12-5. Routing is FatTree:
General fabric topology info
============================
- FatTree rank (roots to leaf switches): 3
- FatTree max switch rank: 2
- Fabric has 966 CAs, 966 CA ports (603 of them CNs), 186 switches
- Fabric has 36 switches at rank 0 (roots)
- Fabric has 64 switches at rank 1
- Fabric has 86 switches at rank 2 (86 of them leafs)

Now to the question: ibqueryerrors 1.5.12 is reporting high PortXmitWait
values throughout the fabric. We did not see this counter before (it was
not reported by the older ibqueryerrors.pl).

To give an idea of the scale of the counters, here's a capture of
ibqueryerrors --data on one specific I4 switch, 10 seconds after clearing
the counters (-k -K):

GUID 0x21283a83b30050 port 4:  PortXmitWait == 2932676  PortXmitData == 90419517 (344.923MB)  PortRcvData == 1526963011 (5.688GB)
GUID 0x21283a83b30050 port 5:  PortXmitWait == 3110105  PortXmitData == 509580912 (1.898GB)  PortRcvData == 13622 (53.211KB)
GUID 0x21283a83b30050 port 6:  PortXmitWait == 8696397  PortXmitData == 480870802 (1.791GB)  PortRcvData == 17067 (66.668KB)
GUID 0x21283a83b30050 port 7:  PortXmitWait == 1129568  PortXmitData == 126483825 (482.497MB)  PortRcvData == 24973385 (95.266MB)
GUID 0x21283a83b30050 port 8:  PortXmitWait == 29021  PortXmitData == 19444902 (74.176MB)  PortRcvData == 84447725 (322.143MB)
GUID 0x21283a83b30050 port 9:  PortXmitWait == 4945130  PortXmitData == 161911244 (617.642MB)  PortRcvData == 27161 (106.098KB)
GUID 0x21283a83b30050 port 10:  PortXmitWait == 16795  PortXmitData == 35572510 (135.698MB)  PortRcvData == 681174731 (2.538GB)
... (this goes on for every active port)
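
To put numbers like these in perspective, here is a minimal Python sketch that parses a capture in the format above and normalizes each port's counters by the sampling interval. It is not part of infiniband-diags; the function name summarize and the constant BYTES_PER_UNIT are hypothetical. It assumes the data counters are in units of 4 octets, which agrees with the sizes ibqueryerrors prints (90419517 * 4 bytes is about 344.923MB). PortXmitWait is left in ticks, since the tick length is, as far as I understand, device-specific.

```python
import re

# Matches one ibqueryerrors --data entry in the format shown above.
LINE_RE = re.compile(
    r"GUID (?P<guid>0x[0-9a-fA-F]+) port (?P<port>\d+):\s+"
    r"PortXmitWait == (?P<wait>\d+)\s+"
    r"PortXmitData == (?P<xmit>\d+)\s+\([^)]*\)\s+"
    r"PortRcvData == (?P<rcv>\d+)"
)

# Assumption: PortXmitData/PortRcvData count 4-octet words.
BYTES_PER_UNIT = 4
MIB = 2 ** 20


def summarize(text, interval_s):
    """Return per-port rates for a capture taken interval_s seconds
    after the counters were cleared."""
    # Collapse all whitespace so entries wrapped by a mail client still match.
    flat = " ".join(text.split())
    rows = []
    for m in LINE_RE.finditer(flat):
        wait = int(m.group("wait"))
        xmit_mib = int(m.group("xmit")) * BYTES_PER_UNIT / MIB
        rcv_mib = int(m.group("rcv")) * BYTES_PER_UNIT / MIB
        rows.append({
            "guid": m.group("guid"),
            "port": int(m.group("port")),
            "xmit_mib_per_s": xmit_mib / interval_s,
            "rcv_mib_per_s": rcv_mib / interval_s,
            # Ticks per second, not a time fraction: the tick length is unknown here.
            "wait_ticks_per_s": wait / interval_s,
            # Wait ticks per MiB sent, to compare ports with different load.
            "wait_ticks_per_mib": wait / xmit_mib if xmit_mib else None,
        })
    return rows


if __name__ == "__main__":
    import sys

    # Usage: ibqueryerrors --data ... | python this_script.py 10
    for row in summarize(sys.stdin.read(), float(sys.argv[1])):
        print(row)
```

Comparing wait_ticks_per_mib across ports shows which links wait the most relative to the traffic they carry. It does not by itself say whether the absolute values are a problem.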

We are not observing any failures, so I suspect I simply need help
interpreting these numbers. Should I be worried?

Cheers,
Florent