[Users] how to use sample control for port counters

Hal Rosenstock hal.rosenstock at gmail.com
Wed Jan 13 09:56:21 PST 2016


AFAIR, perfquery and libibmad only support reading PortSamplesControl, not
writing it. Also, I think support for reading PortSamplesResult is missing.
The support that is currently there was added mainly to get the Tick value
so that PortXmitWait could be made more meaningful.
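
To illustrate why the Tick value matters: PortXmitWait counts ticks during
which the port had data to send but could not send it, so it only becomes a
time once the tick duration is known. Below is a minimal Python sketch,
assuming the tick duration has already been decoded into seconds. The
function name, the `tick_s` parameter and the sample numbers are hypothetical
and are not taken from perfquery or from any device.

```python
def xmit_wait_fraction(wait_before, wait_after, interval_s, tick_s):
    """Fraction of the polling interval the port spent waiting to transmit.

    wait_before / wait_after: two PortXmitWait readings (in ticks).
    interval_s: wall-clock time between the two readings, in seconds.
    tick_s: duration of one tick in seconds (ASSUMPTION: decoded elsewhere
            from the Tick field of PortSamplesControl).
    """
    if interval_s <= 0 or tick_s <= 0:
        raise ValueError("interval_s and tick_s must be positive")
    if wait_after < wait_before:
        raise ValueError("counter went backwards (was it cleared?)")
    waited_s = (wait_after - wait_before) * tick_s
    # Clamp to 1.0 because the host-side interval is only approximate.
    return min(waited_s / interval_s, 1.0)


# Hypothetical readings taken 10 s apart with an assumed 1 microsecond tick.
frac = xmit_wait_fraction(1_000_000, 3_000_000, interval_s=10.0, tick_s=1e-6)
print(f"port waited {frac:.1%} of the interval")
```

The interval here still comes from the polling host's clock; the sketch only
converts the wait count into a comparable unit.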

There were issues with the sampling counter(s), so AFAIK no one uses them.
Performance monitoring is done with the (mostly) free-running counters: the
mandatory ones and, where available, the optional ones.
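
As a rough illustration of monitoring with the free-running counters, the
Python sketch below parses perfquery-style "Name:....value" lines and turns
two readings into a byte rate. It assumes the 32-bit PortCounters attribute,
whose fields stick at 0xFFFFFFFF instead of wrapping, and that the data
counters are in units of four octets. The helper names and the numbers in
the usage lines are hypothetical.

```python
import re

SATURATED_32 = 0xFFFFFFFF  # a 32-bit PortCounters field pegs at this value


def parse_perfquery(text):
    """Parse 'Name:.....123' lines into a dict of ints.

    Header lines and hex fields such as CounterSelect are skipped.
    """
    counters = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(\w+):\.+(\d+)\s*$", line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters


def data_rate_bytes_per_s(before, after, interval_s, name="PortXmitData"):
    """Bytes per second between two readings, or None if the counter is pegged."""
    if before[name] == SATURATED_32 or after[name] == SATURATED_32:
        return None  # saturated: the delta is meaningless until the counter is cleared
    delta_words = after[name] - before[name]
    return delta_words * 4 / interval_s  # data counters count 4-octet units


sample = """\
# Port counters: Lid 199 port 1 (CapMask: 0x1400)
CounterSelect:...................0x0000
PortXmitData:....................4294967295
PortXmitWait:....................2503387680
"""
reading = parse_perfquery(sample)
print(data_rate_bytes_per_s(reading, reading, interval_s=1.0))  # pegged counter
print(data_rate_bytes_per_s({"PortXmitData": 1000}, {"PortXmitData": 2000}, 2.0))
```

A pegged reading like the one in the sample means the counter has to be
cleared (or a wider counter read instead) before a rate can be computed.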

-- Hal

On Wed, Jan 13, 2016 at 12:29 PM, Black.S <Black.S52 at yandex.com> wrote:

> 12.01.2016 20:40, Black.S wrote:
>
> 11.01.2016 23:59, Coulter, Susan K wrote:
>
>
> The perfquery command will tell you about the throughput and errors on a
> port.
> The smpquery command will tell you more of what you might want to know
> about a port.
> They come from the infiniband-diags module / RPM.
> Examples are below.
>
> Another option is to use the PerfMgr which is part of the OpenSM code
> base, if you are running osm on a host.
> There is an example of that below too, using the default port/localhost.
>
> [root@mu-master ~]# perfquery 199 1
> # Port counters: Lid 199 port 1 (CapMask: 0x1400)
> PortSelect:......................1
> CounterSelect:...................0x0000
> SymbolErrorCounter:..............0
> LinkErrorRecoveryCounter:........0
> LinkDownedCounter:...............0
> PortRcvErrors:...................0
> PortRcvRemotePhysicalErrors:.....0
> PortRcvSwitchRelayErrors:........0
> PortXmitDiscards:................0
> PortXmitConstraintErrors:........0
> PortRcvConstraintErrors:.........0
> CounterSelect2:..................0x00
> LocalLinkIntegrityErrors:........0
> ExcessiveBufferOverrunErrors:....0
> VL15Dropped:.....................0
> PortXmitData:....................4294967295
> PortRcvData:.....................4294967295
> PortXmitPkts:....................4294967295
> PortRcvPkts:.....................4294967295
> PortXmitWait:....................2503387680
>
>
> [root@mu-master ~]# smpquery pi -L 199
> # Port info: Lid 199 port 0
> Mkey:............................<not displayed>
> GidPrefix:.......................0xfe80000000000000
> Lid:.............................199
> SMLid:...........................250
> CapMask:.........................0x2510868
> IsTrapSupported
> IsAutomaticMigrationSupported
> IsSLMappingSupported
> IsSystemImageGUIDsupported
> IsCommunicatonManagementSupported
> IsVendorClassSupported
> IsCapabilityMaskNoticeSupported
> IsClientRegistrationSupported
> DiagCode:........................0x0000
> MkeyLeasePeriod:.................0
> LocalPort:.......................1
> LinkWidthEnabled:................1X or 4X
> LinkWidthSupported:..............1X or 4X
> LinkWidthActive:.................4X
> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> LinkState:.......................Active
> PhysLinkState:...................LinkUp
> LinkDownDefState:................Polling
> ProtectBits:.....................2
> LMC:.............................0
> LinkSpeedActive:.................10.0 Gbps
> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> NeighborMTU:.....................4096
> SMSL:............................0
> VLCap:...........................VL0-3
> InitType:........................0x00
> VLHighLimit:.....................0
> VLArbHighCap:....................8
> VLArbLowCap:.....................8
> InitReply:.......................0x00
> MtuCap:..........................4096
> VLStallCount:....................0
> HoqLife:.........................31
> OperVLs:.........................VL0-3
> PartEnforceInb:..................0
> PartEnforceOutb:.................0
> FilterRawInb:....................0
> FilterRawOutb:...................0
> MkeyViolations:..................0
> PkeyViolations:..................0
> QkeyViolations:..................0
> GuidCap:.........................128
> ClientReregister:................0
> McastPkeyTrapSuppressionEnabled:.0
> SubnetTimeout:...................18
> RespTimeVal:.....................16
> LocalPhysErr:....................8
> OverrunErr:......................8
> MaxCreditHint:...................0
> RoundTrip:.......................0
> CapabilityMask2:.................0x0000
> LinkSpeedExtActive:..............No Extended Speed
> LinkSpeedExtSupported:...........0
> LinkSpeedExtEnabled:.............0
>
>
> [root@mu-master ~]# telnet localhost 10000
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> OpenSM $ ?
> ? : Command not found
>
> Supported commands and syntax:
> help [<command>]
> quit (not valid in local mode; use ctl-c)
> loglevel [<log-level>]
> permodlog
> priority [<sm-priority>]
> resweep [heavy|light]
> reroute
> sweep [on|off]
> status [loop]
> logflush [on|off] -- toggle opensm.log file flushing
> querylid lid -- print internal information about the lid specified
> portstatus [ca|switch|router]
> switchbalance [verbose] [guid]
> lidbalance [switchguid]
> dump_conf
> update_desc
> version -- print the OSM version
> perfmgr(pm) [enable|disable
>              |clear_counters|dump_counters|print_counters(pc)|print_errors(pe)
>              |set_rm_nodes|clear_rm_nodes|clear_inactive
>              |set_query_cpi|clear_query_cpi
>              |dump_redir|clear_redir
>              |sweep|sweep_time[seconds]]
> dump_portguid [file filename] regexp1 [regexp2 [regexp3 ...]] -- Dump port GUID matching a regexp
>
>
>
> On Dec 26, 2015, at 12:40 PM, Black.S <Black.S52 at yandex.com> wrote:
>
>
>
> Hello all
>
> I want to monitor ports in an IB fabric.
>
> I happened to notice in the perfquery output some counters such as ticks
> and port sampling. They would be great for accurate monitoring, but I can't
> find any information about setting up and using sampling control for IB ports.
>
> How do I configure and use sampling control for IB ports? Can anybody give
> me documentation, links, or examples? I can't find anything on Google.
> Are there any restrictions on the amount of data collected in this
> sampling mode?
>
> If this is off-topic, please redirect me.
>
> Thanks for your time
>
> _______________________________________________
> Users mailing list
> Users at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/users
>
>
> ==================================================
> Susan Coulter
> HPC Network Technical Lead
> (505) 667-8425
> “Once in a while you get shown the light
>     In the strangest of places if you look at it right”  *Robert Hunter*
> ==================================================
>
>
>
>
>
> Hi Susan Coulter,
>
> Thank you very much for your reply.
>
> It is very interesting, especially the part about PerfMgr. As I understand
> it, PerfMgr can dump all counters very quickly (I hope faster than dumping
> them one by one in a loop). Thanks for this info.
>
> Nevertheless, can you or somebody else explain how to use the sampling
> control mechanism in perfquery?
> "perfquery -c, --smplctl    show port samples control"
>
> In this mode, I would like to collect all available counters of the IB
> port with a time stamp provided by the "tick" counter from the device.
>
> A time stamp from the tick counter would remove the dependence on the
> clock of the host that collects the counters from the IB port.
>
> Sorry for my English
>