[Users] how to use sample control for port counters

Coulter, Susan K skc at lanl.gov
Mon Jan 11 12:59:21 PST 2016


The perfquery command reports the traffic and error counters for a port.
The smpquery command reports most of the other things you might want to know about a port (link state, width, speed, MTU and so on).
Both come from the infiniband-diags package (RPM).
Examples are below.

Another option is to use the PerfMgr, which is part of the OpenSM code base, if you are running OpenSM on a host.
There is an example of that below too, connecting to the OpenSM console on localhost at the default port.

[root@mu-master ~]# perfquery 199 1
# Port counters: Lid 199 port 1 (CapMask: 0x1400)
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrorCounter:..............0
LinkErrorRecoveryCounter:........0
LinkDownedCounter:...............0
PortRcvErrors:...................0
PortRcvRemotePhysicalErrors:.....0
PortRcvSwitchRelayErrors:........0
PortXmitDiscards:................0
PortXmitConstraintErrors:........0
PortRcvConstraintErrors:.........0
CounterSelect2:..................0x00
LocalLinkIntegrityErrors:........0
ExcessiveBufferOverrunErrors:....0
VL15Dropped:.....................0
PortXmitData:....................4294967295
PortRcvData:.....................4294967295
PortXmitPkts:....................4294967295
PortRcvPkts:.....................4294967295
PortXmitWait:....................2503387680
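
A note on the output above: PortXmitData, PortRcvData, PortXmitPkts and PortRcvPkts all read 4294967295, which is 2^32 - 1. These are 32-bit counters that stop at their maximum instead of wrapping, so they carry no further information until they are reset. Below is a minimal Python sketch for spotting this in saved perfquery output. The function names are made up for illustration, and the sample text is an excerpt of the output above.

```python
import re

# 32-bit counters stop at this value instead of wrapping.
MAX_32BIT = 2**32 - 1

# Matches lines such as "PortXmitData:....................4294967295".
LINE_RE = re.compile(r"^(\w+):\.*(\S.*)$")

SAMPLE = """\
# Port counters: Lid 199 port 1 (CapMask: 0x1400)
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrorCounter:..............0
PortXmitData:....................4294967295
PortRcvData:.....................4294967295
PortXmitPkts:....................4294967295
PortRcvPkts:.....................4294967295
PortXmitWait:....................2503387680
"""


def parse_counters(text):
    """Turn 'Name:....value' lines into a dict; header lines are skipped."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        name, value = match.groups()
        try:
            counters[name] = int(value, 0)  # base 0 accepts decimal and 0x.. hex
        except ValueError:
            counters[name] = value  # keep non-numeric values as text
    return counters


def saturated(counters):
    """Return the names of counters stuck at the 32-bit maximum."""
    return [name for name, value in counters.items() if value == MAX_32BIT]


if __name__ == "__main__":
    print(saturated(parse_counters(SAMPLE)))
```

If the port supports the extended 64-bit counters, perfquery -x reads those instead, which makes saturation far less likely.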


[root@mu-master ~]# smpquery pi -L 199
# Port info: Lid 199 port 0
Mkey:............................<not displayed>
GidPrefix:.......................0xfe80000000000000
Lid:.............................199
SMLid:...........................250
CapMask:.........................0x2510868
IsTrapSupported
IsAutomaticMigrationSupported
IsSLMappingSupported
IsSystemImageGUIDsupported
IsCommunicatonManagementSupported
IsVendorClassSupported
IsCapabilityMaskNoticeSupported
IsClientRegistrationSupported
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................2
LMC:.............................0
LinkSpeedActive:.................10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
NeighborMTU:.....................4096
SMSL:............................0
VLCap:...........................VL0-3
InitType:........................0x00
VLHighLimit:.....................0
VLArbHighCap:....................8
VLArbLowCap:.....................8
InitReply:.......................0x00
MtuCap:..........................4096
VLStallCount:....................0
HoqLife:.........................31
OperVLs:.........................VL0-3
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................128
ClientReregister:................0
McastPkeyTrapSuppressionEnabled:.0
SubnetTimeout:...................18
RespTimeVal:.....................16
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................0
RoundTrip:.......................0
CapabilityMask2:.................0x0000
LinkSpeedExtActive:..............No Extended Speed
LinkSpeedExtSupported:...........0
LinkSpeedExtEnabled:.............0
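
The LinkWidthActive and LinkSpeedActive fields above can be combined to get the signaling rate of the link: 4X lanes at 10.0 Gbps each gives 40 Gbps. Here is a small Python sketch of that arithmetic, assuming the port info text has been saved from smpquery. The helper names are hypothetical, and the sample is an excerpt of the output above. Note that this is the raw signaling rate, not the usable data rate, which is lower because of link encoding overhead.

```python
import re

SAMPLE = """\
# Port info: Lid 199 port 0
LinkWidthActive:.................4X
LinkSpeedActive:.................10.0 Gbps
LinkSpeedExtActive:..............No Extended Speed
"""


def field(text, name):
    """Return the value of a 'Name:....value' line, or None if absent."""
    pattern = r"^%s:\.*(\S.*)$" % re.escape(name)
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1).strip() if match else None


def signaling_rate_gbps(port_info):
    """Multiply the active lane count by the active per-lane speed."""
    width = field(port_info, "LinkWidthActive")  # e.g. "4X"
    speed = field(port_info, "LinkSpeedActive")  # e.g. "10.0 Gbps"
    if width is None or speed is None:
        raise ValueError("LinkWidthActive or LinkSpeedActive not found")
    lanes = int(width.rstrip("Xx"))
    per_lane = float(speed.split()[0])
    return lanes * per_lane


if __name__ == "__main__":
    print(signaling_rate_gbps(SAMPLE))
```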


[root@mu-master ~]# telnet localhost 10000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
OpenSM $ ?
? : Command not found

Supported commands and syntax:
help [<command>]
quit (not valid in local mode; use ctl-c)
loglevel [<log-level>]
permodlog
priority [<sm-priority>]
resweep [heavy|light]
reroute
sweep [on|off]
status [loop]
logflush [on|off] -- toggle opensm.log file flushing
querylid lid -- print internal information about the lid specified
portstatus [ca|switch|router]
switchbalance [verbose] [guid]
lidbalance [switchguid]
dump_conf
update_desc
version -- print the OSM version
perfmgr(pm) [enable|disable
             |clear_counters|dump_counters|print_counters(pc)|print_errors(pe)
             |set_rm_nodes|clear_rm_nodes|clear_inactive
             |set_query_cpi|clear_query_cpi
             |dump_redir|clear_redir
             |sweep|sweep_time[seconds]]
dump_portguid [file filename] regexp1 [regexp2 [regexp3 ...]] -- Dump port GUID matching a regexp
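
The same console session can be scripted instead of typed into telnet. What follows is only a rough Python sketch, under these assumptions: the console is reachable on localhost port 10000 as in the session above, it prints the "OpenSM $ " prompt after the banner and after each command, and it accepts a plain newline as the line terminator. The function names are hypothetical, and the list of perfmgr subcommands is copied from the help text above.

```python
import socket

PROMPT = b"OpenSM $ "  # prompt as it appears in the session above

# perfmgr subcommands as listed in the console help text above.
PERFMGR_SUBCOMMANDS = {
    "enable", "disable", "clear_counters", "dump_counters",
    "print_counters", "print_errors", "set_rm_nodes", "clear_rm_nodes",
    "clear_inactive", "set_query_cpi", "clear_query_cpi",
    "dump_redir", "clear_redir", "sweep", "sweep_time",
}


def perfmgr_command(subcommand, *args):
    """Build a 'perfmgr ...' command line, rejecting unknown subcommands."""
    if subcommand not in PERFMGR_SUBCOMMANDS:
        raise ValueError("unknown perfmgr subcommand: %r" % subcommand)
    return " ".join(("perfmgr", subcommand) + tuple(str(a) for a in args))


def read_until_prompt(sock):
    """Read from the socket until the prompt appears or the peer closes."""
    data = b""
    while not data.endswith(PROMPT):
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    return data.decode(errors="replace")


def run_console_command(command, host="localhost", port=10000, timeout=10.0):
    """Send one command to the console and return the text it prints back.

    Assumes the prompt behaviour described above; if no prompt arrives
    within the timeout, the recv call raises socket.timeout.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        read_until_prompt(sock)  # discard everything up to the first prompt
        sock.sendall(command.encode() + b"\n")
        return read_until_prompt(sock)
```

For example, print(run_console_command(perfmgr_command("print_errors"))) would send "perfmgr print_errors" and print whatever the console returns, provided the assumptions above hold on your system.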



On Dec 26, 2015, at 12:40 PM, Black.S <Black.S52 at yandex.com> wrote:



Hello all

I want to monitor ports in an IB fabric.

I happened to notice in the output of perfquery some counters such as ticks
and port sampling. They would be great for accurate monitoring, but I can't
find any information about setting up and using sampling control for IB ports.

How do I configure and use sampling control for IB ports? Can anybody give
me documentation, links or examples? I can't find anything on Google.
Are there any restrictions on the amount of data collected in this
sampling mode?

If this is off-topic, please redirect me.

Thanks for your time


==================================================
Susan Coulter
HPC Network Technical Lead
(505) 667-8425
“Once in a while you get shown the light
    In the strangest of places if you look at it right”  Robert Hunter
==================================================




