[Users] Re: Re: how to use sample control for port counters (Black.S)

Black.S Black.S52 at yandex.com
Wed Jan 13 11:55:54 PST 2016


Hi Hal Rosenstock

Thank you very much for the detailed answer.



> You can read about these counters in the IBA spec volume 1 chapter
> 16.1 Performance Management which can be obtained from the IBTA web
> site (even if your company is not a member, all it requires is a login).
>
> There is support missing in perfquery/libibmad, but it wouldn't be hard
> to write a verbs program to play with this. There were issues with
> the use of the sampling counter(s), though, so AFAIK no one uses them.
>
> -- Hal
>
> On Wed, Jan 13, 2016 at 12:42 PM, Black.S <Black.S52 at yandex.com> wrote:
>
>
>     12.01.2016 20:40, Black.S wrote:
>>     11.01.2016 23:59, Coulter, Susan K wrote:
>>>
>>>     The perfquery command will tell you about the throughput and
>>>     errors on a port.
>>>     The smpquery command will tell you more of what you might want
>>>     to know about a port.
>>>     They come from the infiniband-diags module / RPM.
>>>     Examples are below.
>>>
>>>     Another option is to use the PerfMgr which is part of the OpenSM
>>>     code base, if you are running osm on a host.
>>>     There is an example of that below too, using the default
>>>     port/localhost.
>>>
>>>     [root at mu-master ~]# perfquery 199 1
>>>     # Port counters: Lid 199 port 1 (CapMask: 0x1400)
>>>     PortSelect:......................1
>>>     CounterSelect:...................0x0000
>>>     SymbolErrorCounter:..............0
>>>     LinkErrorRecoveryCounter:........0
>>>     LinkDownedCounter:...............0
>>>     PortRcvErrors:...................0
>>>     PortRcvRemotePhysicalErrors:.....0
>>>     PortRcvSwitchRelayErrors:........0
>>>     PortXmitDiscards:................0
>>>     PortXmitConstraintErrors:........0
>>>     PortRcvConstraintErrors:.........0
>>>     CounterSelect2:..................0x00
>>>     LocalLinkIntegrityErrors:........0
>>>     ExcessiveBufferOverrunErrors:....0
>>>     VL15Dropped:.....................0
>>>     PortXmitData:....................4294967295
>>>     PortRcvData:.....................4294967295
>>>     PortXmitPkts:....................4294967295
>>>     PortRcvPkts:.....................4294967295
>>>     PortXmitWait:....................2503387680
>>>
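A note on the perfquery output above: the four data and packet counters all read 4294967295, which is 2^32 - 1. The classic PortCounters attribute uses 32-bit fields for these, and they stick at the maximum rather than wrapping, so such values only say that the counters filled up at some point. perfquery -x reads the 64-bit extended counters where the port supports them, and -r or -R resets the counters. Below is a minimal Python sketch that parses text in this format and flags saturated counters. The function names and the embedded sample are illustrative assumptions, not part of infiniband-diags.

```python
import re

# Largest value a 32-bit field can hold. The classic (non-extended)
# data and packet counters stop here instead of wrapping around.
SATURATED_32 = 0xFFFFFFFF

# The 32-bit data/packet counters of the classic PortCounters output.
WIDE_COUNTERS = ("PortXmitData", "PortRcvData", "PortXmitPkts", "PortRcvPkts")


def parse_perfquery(text):
    """Return {name: int} for every 'Name:.....value' line in the text."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+):\.+(0x[0-9a-fA-F]+|\d+)\s*$", line)
        if m:
            value = m.group(2)
            counters[m.group(1)] = (
                int(value, 16) if value.startswith("0x") else int(value)
            )
    return counters


def saturated(counters):
    """Names of the 32-bit data/packet counters that hit their maximum."""
    return [n for n in WIDE_COUNTERS if counters.get(n) == SATURATED_32]


# Shortened sample copied from the output quoted above.
SAMPLE = """\
# Port counters: Lid 199 port 1 (CapMask: 0x1400)
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrorCounter:..............0
PortXmitData:....................4294967295
PortRcvData:.....................4294967295
PortXmitPkts:....................4294967295
PortRcvPkts:.....................4294967295
PortXmitWait:....................2503387680
"""

if __name__ == "__main__":
    for name in saturated(parse_perfquery(SAMPLE)):
        print(name, "is saturated; reset it or read the extended counters")
```

In practice the text would come from running perfquery itself (for example through subprocess) instead of an embedded string.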
>>>
>>>     [root at mu-master ~]# smpquery pi -L 199
>>>     # Port info: Lid 199 port 0
>>>     Mkey:............................<not displayed>
>>>     GidPrefix:.......................0xfe80000000000000
>>>     Lid:.............................199
>>>     SMLid:...........................250
>>>     CapMask:.........................0x2510868
>>>     IsTrapSupported
>>>     IsAutomaticMigrationSupported
>>>     IsSLMappingSupported
>>>     IsSystemImageGUIDsupported
>>>     IsCommunicatonManagementSupported
>>>     IsVendorClassSupported
>>>     IsCapabilityMaskNoticeSupported
>>>     IsClientRegistrationSupported
>>>     DiagCode:........................0x0000
>>>     MkeyLeasePeriod:.................0
>>>     LocalPort:.......................1
>>>     LinkWidthEnabled:................1X or 4X
>>>     LinkWidthSupported:..............1X or 4X
>>>     LinkWidthActive:.................4X
>>>     LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>     LinkState:.......................Active
>>>     PhysLinkState:...................LinkUp
>>>     LinkDownDefState:................Polling
>>>     ProtectBits:.....................2
>>>     LMC:.............................0
>>>     LinkSpeedActive:.................10.0 Gbps
>>>     LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>>>     NeighborMTU:.....................4096
>>>     SMSL:............................0
>>>     VLCap:...........................VL0-3
>>>     InitType:........................0x00
>>>     VLHighLimit:.....................0
>>>     VLArbHighCap:....................8
>>>     VLArbLowCap:.....................8
>>>     InitReply:.......................0x00
>>>     MtuCap:..........................4096
>>>     VLStallCount:....................0
>>>     HoqLife:.........................31
>>>     OperVLs:.........................VL0-3
>>>     PartEnforceInb:..................0
>>>     PartEnforceOutb:.................0
>>>     FilterRawInb:....................0
>>>     FilterRawOutb:...................0
>>>     MkeyViolations:..................0
>>>     PkeyViolations:..................0
>>>     QkeyViolations:..................0
>>>     GuidCap:.........................128
>>>     ClientReregister:................0
>>>     McastPkeyTrapSuppressionEnabled:.0
>>>     SubnetTimeout:...................18
>>>     RespTimeVal:.....................16
>>>     LocalPhysErr:....................8
>>>     OverrunErr:......................8
>>>     MaxCreditHint:...................0
>>>     RoundTrip:.......................0
>>>     CapabilityMask2:.................0x0000
>>>     LinkSpeedExtActive:..............No Extended Speed
>>>     LinkSpeedExtSupported:...........0
>>>     LinkSpeedExtEnabled:.............0
>>>
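For monitoring, the two smpquery fields that matter most for interpreting throughput are LinkWidthActive and LinkSpeedActive: their product is the signalling rate of the link (here 4 lanes at 10.0 Gbps). The sketch below, in Python, pulls both fields out of text in the format shown above. The helper names are hypothetical, and the 8b/10b factor is an assumption that holds only for the 2.5/5.0/10.0 Gbps lane speeds, not for extended speeds.

```python
import re


def field(text, name):
    """Return the value after 'name:.....' in smpquery-style text, or None."""
    m = re.search(rf"^{re.escape(name)}:\.+(.*)$", text, re.MULTILINE)
    return m.group(1).strip() if m else None


def signalling_rate_gbps(text):
    """Active lane count multiplied by the active per-lane speed in Gbps."""
    width = int(field(text, "LinkWidthActive").rstrip("X"))   # "4X" -> 4
    speed = float(field(text, "LinkSpeedActive").split()[0])  # "10.0 Gbps" -> 10.0
    return width * speed


def data_rate_gbps(text):
    """Usable rate assuming 8b/10b encoding (non-extended speeds only)."""
    return signalling_rate_gbps(text) * 8 / 10


# Two lines copied from the port info quoted above.
SAMPLE = """\
LinkWidthActive:.................4X
LinkSpeedActive:.................10.0 Gbps
"""

if __name__ == "__main__":
    print("signalling rate (Gbps):", signalling_rate_gbps(SAMPLE))
    print("data rate (Gbps):", data_rate_gbps(SAMPLE))
```

Comparing measured throughput against the data rate rather than the signalling rate gives a more honest utilisation figure.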
>>>
>>>     [root at mu-master ~]# telnet localhost 10000
>>>     Trying 127.0.0.1...
>>>     Connected to localhost.
>>>     Escape character is '^]'.
>>>     OpenSM $ ?
>>>     ? : Command not found
>>>
>>>     Supported commands and syntax:
>>>     help [<command>]
>>>     quit (not valid in local mode; use ctl-c)
>>>     loglevel [<log-level>]
>>>     permodlog
>>>     priority [<sm-priority>]
>>>     resweep [heavy|light]
>>>     reroute
>>>     sweep [on|off]
>>>     status [loop]
>>>     logflush [on|off] -- toggle opensm.log file flushing
>>>     querylid lid -- print internal information about the lid specified
>>>     portstatus [ca|switch|router]
>>>     switchbalance [verbose] [guid]
>>>     lidbalance [switchguid]
>>>     dump_conf
>>>     update_desc
>>>     version -- print the OSM version
>>>     perfmgr(pm) [enable|disable
>>>     |clear_counters|dump_counters|print_counters(pc)|print_errors(pe)
>>>     |set_rm_nodes|clear_rm_nodes|clear_inactive
>>>     |set_query_cpi|clear_query_cpi
>>>     |dump_redir|clear_redir
>>>     |sweep|sweep_time[seconds]]
>>>     dump_portguid [file filename] regexp1 [regexp2 [regexp3 ...]] --
>>>     Dump port GUID matching a regexp
>>>
>>>
>>>
>>>>     On Dec 26, 2015, at 12:40 PM, Black.S <Black.S52 at yandex.com> wrote:
>>>>
>>>>
>>>>
>>>>     Hello all
>>>>
>>>>     I want to monitor ports in an IB fabric.
>>>>
>>>>     I happened to notice in the output of perfquery some counters
>>>>     such as ticks and port sampling. They would be great for
>>>>     accurate monitoring. But I can't find any information about
>>>>     setting up and using sampling control for IB ports.
>>>>
>>>>     How do I configure and use sampling control for IB ports? Can
>>>>     anybody give me documentation, links, or examples? I can't find
>>>>     anything on Google. Are there any restrictions on the amount of
>>>>     data collected in this sampling mode?
>>>>
>>>>     If this is off-topic, please redirect me.
>>>>
>>>>     Thanks for your time
>>>>
>>>>     _______________________________________________
>>>>     Users mailing list
>>>>     Users at lists.openfabrics.org
>>>>     http://lists.openfabrics.org/mailman/listinfo/users
>>>
>>>     ==================================================
>>>     Susan Coulter
>>>     HPC Network Technical Lead
>>>     (505) 667-8425
>>>     “Once in a while you get shown the light
>>>         In the strangest of places if you look at it right” /Robert
>>>     Hunter/
>>>     ==================================================
>>>
>>>
>>>
>>>
>>>
>     Hi Susan Coulter
>
>     Thank you very much for your reply.
>
>     This is very interesting, especially the PerfMgr. As I understand
>     it, the PerfMgr can dump all counters very quickly (I hope it is
>     faster than dumping them in a loop one by one). Thanks for this info.
>
>     Nevertheless, can you or somebody explain how to use the sampling
>     control mechanism of perfquery?
>
>     "perfquery -c, --smplctl    show port samples control"
>
>     In this mode, I would like to collect all available counters of
>     the IB port with a time stamp provided by the "tick" counter from
>     the device.
>
>     A time stamp from the tick counter would make the samples
>     independent of the clock of the host that collects the counters
>     from the IB port.
>
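To make the idea concrete, here is a minimal Python sketch of how a rate could be derived from two samples that carry a device-side tick value instead of a host time stamp. Everything about the sample layout is an assumption for illustration: the dict keys, a free-running tick value that wraps at 32 bits, and a known tick period in seconds are all hypothetical, and the real tick semantics would have to be taken from the PortSamplesControl description in the IBA spec. The one fixed fact used is that PortXmitData counts in units of 4 octets.

```python
def counter_delta(old, new, bits=32):
    """Difference new - old for a value that wraps around at 2**bits."""
    return (new - old) % (1 << bits)


def xmit_rate_bytes_per_sec(s0, s1, tick_seconds, tick_bits=32):
    """Transmit rate between two samples, timed by device ticks.

    s0 and s1 are hypothetical samples of the form
    {"tick": int, "xmit_data": int}; tick_seconds is the assumed
    duration of one tick.
    """
    ticks = counter_delta(s0["tick"], s1["tick"], tick_bits)
    if ticks == 0:
        raise ValueError("samples carry the same tick value")
    # The data counter is assumed not to have saturated between samples,
    # so a plain difference is used here.
    dwords = s1["xmit_data"] - s0["xmit_data"]
    return dwords * 4 / (ticks * tick_seconds)  # PortXmitData is in 4-octet units


if __name__ == "__main__":
    first = {"tick": 100, "xmit_data": 0}
    second = {"tick": 108, "xmit_data": 500_000}
    # 8 ticks of 0.25 s each: 2 s elapsed on the device clock.
    print(xmit_rate_bytes_per_sec(first, second, tick_seconds=0.25))
```

Because both the counter and the time base come from the same device, delay or jitter on the collecting host does not distort the computed rate.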
>     Sorry, I am new to mailing lists.
>
>
>
>
>
