[Users] Re: how to use sample control for port counters (Black.S)
Black.S
Black.S52 at yandex.com
Wed Jan 13 09:42:43 PST 2016
12.01.2016 20:40, Black.S wrote:
> 11.01.2016 23:59, Coulter, Susan K wrote:
>>
>> The perfquery command will tell you about the throughput and errors
>> on a port.
>> The smpquery command will tell you more of what you might want to
>> know about a port.
>> They come from the infiniband-diags module / RPM.
>> Examples are below.
>>
>> Another option is to use the PerfMgr which is part of the OpenSM code
>> base, if you are running osm on a host.
>> There is an example of that below too, using the default port/localhost.
>>
>> [root at mu-master ~]# perfquery 199 1
>> # Port counters: Lid 199 port 1 (CapMask: 0x1400)
>> PortSelect:......................1
>> CounterSelect:...................0x0000
>> SymbolErrorCounter:..............0
>> LinkErrorRecoveryCounter:........0
>> LinkDownedCounter:...............0
>> PortRcvErrors:...................0
>> PortRcvRemotePhysicalErrors:.....0
>> PortRcvSwitchRelayErrors:........0
>> PortXmitDiscards:................0
>> PortXmitConstraintErrors:........0
>> PortRcvConstraintErrors:.........0
>> CounterSelect2:..................0x00
>> LocalLinkIntegrityErrors:........0
>> ExcessiveBufferOverrunErrors:....0
>> VL15Dropped:.....................0
>> PortXmitData:....................4294967295
>> PortRcvData:.....................4294967295
>> PortXmitPkts:....................4294967295
>> PortRcvPkts:.....................4294967295
>> PortXmitWait:....................2503387680
>>
>>
>> [root at mu-master ~]# smpquery pi -L 199
>> # Port info: Lid 199 port 0
>> Mkey:............................<not displayed>
>> GidPrefix:.......................0xfe80000000000000
>> Lid:.............................199
>> SMLid:...........................250
>> CapMask:.........................0x2510868
>> IsTrapSupported
>> IsAutomaticMigrationSupported
>> IsSLMappingSupported
>> IsSystemImageGUIDsupported
>> IsCommunicatonManagementSupported
>> IsVendorClassSupported
>> IsCapabilityMaskNoticeSupported
>> IsClientRegistrationSupported
>> DiagCode:........................0x0000
>> MkeyLeasePeriod:.................0
>> LocalPort:.......................1
>> LinkWidthEnabled:................1X or 4X
>> LinkWidthSupported:..............1X or 4X
>> LinkWidthActive:.................4X
>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> LinkState:.......................Active
>> PhysLinkState:...................LinkUp
>> LinkDownDefState:................Polling
>> ProtectBits:.....................2
>> LMC:.............................0
>> LinkSpeedActive:.................10.0 Gbps
>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> NeighborMTU:.....................4096
>> SMSL:............................0
>> VLCap:...........................VL0-3
>> InitType:........................0x00
>> VLHighLimit:.....................0
>> VLArbHighCap:....................8
>> VLArbLowCap:.....................8
>> InitReply:.......................0x00
>> MtuCap:..........................4096
>> VLStallCount:....................0
>> HoqLife:.........................31
>> OperVLs:.........................VL0-3
>> PartEnforceInb:..................0
>> PartEnforceOutb:.................0
>> FilterRawInb:....................0
>> FilterRawOutb:...................0
>> MkeyViolations:..................0
>> PkeyViolations:..................0
>> QkeyViolations:..................0
>> GuidCap:.........................128
>> ClientReregister:................0
>> McastPkeyTrapSuppressionEnabled:.0
>> SubnetTimeout:...................18
>> RespTimeVal:.....................16
>> LocalPhysErr:....................8
>> OverrunErr:......................8
>> MaxCreditHint:...................0
>> RoundTrip:.......................0
>> CapabilityMask2:.................0x0000
>> LinkSpeedExtActive:..............No Extended Speed
>> LinkSpeedExtSupported:...........0
>> LinkSpeedExtEnabled:.............0
>>
>>
>> [root at mu-master ~]# telnet localhost 10000
>> Trying 127.0.0.1...
>> Connected to localhost.
>> Escape character is '^]'.
>> OpenSM $ ?
>> ? : Command not found
>>
>> Supported commands and syntax:
>> help [<command>]
>> quit (not valid in local mode; use ctl-c)
>> loglevel [<log-level>]
>> permodlog
>> priority [<sm-priority>]
>> resweep [heavy|light]
>> reroute
>> sweep [on|off]
>> status [loop]
>> logflush [on|off] -- toggle opensm.log file flushing
>> querylid lid -- print internal information about the lid specified
>> portstatus [ca|switch|router]
>> switchbalance [verbose] [guid]
>> lidbalance [switchguid]
>> dump_conf
>> update_desc
>> version -- print the OSM version
>> perfmgr(pm) [enable|disable
>> |clear_counters|dump_counters|print_counters(pc)|print_errors(pe)
>> |set_rm_nodes|clear_rm_nodes|clear_inactive
>> |set_query_cpi|clear_query_cpi
>> |dump_redir|clear_redir
>> |sweep|sweep_time[seconds]]
>> dump_portguid [file filename] regexp1 [regexp2 [regexp3 ...]] -- Dump
>> port GUID matching a regexp
>>
>>
>>
>>> On Dec 26, 2015, at 12:40 PM, Black.S <Black.S52 at yandex.com> wrote:
>>>
>>>
>>>
>>> Hello all,
>>>
>>> I want to monitor ports in an IB fabric.
>>>
>>> I happened to notice in the output of perfquery some counters such as
>>> ticks and port sampling. They would be great for accurate monitoring,
>>> but I cannot find any information about setting up and using sampling
>>> control for IB ports.
>>>
>>> How do I configure and use sampling control for IB ports? Can anybody
>>> give me documentation, links, or examples? I cannot find any on Google.
>>> Are there any restrictions on the amount of data collected in this
>>> sampling mode?
>>>
>>> If this is off-topic, please redirect me.
>>>
>>> Thanks for your time.
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.openfabrics.org <mailto:Users at lists.openfabrics.org>
>>> http://lists.openfabrics.org/mailman/listinfo/users
>>
>> ==================================================
>> Susan Coulter
>> HPC Network Technical Lead
>> (505) 667-8425
>> “Once in a while you get shown the light
>> In the strangest of places if you look at it right” /Robert Hunter/
>> ==================================================
>>
>>
>>
>>
>>
Hi Susan Coulter,
Thank you very much for your reply.
It is very interesting, especially the part about PerfMgr. As I understand
it, PerfMgr can dump all counters very quickly (I hope it is faster than
dumping them one by one in a loop). Thanks for this info.
Nevertheless, can you or somebody else explain how to use the sampling
control mechanism of perfquery?
"perfquery -c, --smplctl    show port samples control"
In this mode, I would like to collect all available counters of the IB
port together with a time stamp provided by the "tick" counter of the device.
A time stamp from the tick counter would make the measurements independent
of the clock of the host that collects the counters from the IB port.
Sorry, I am new to mailing lists.
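
One detail in the quoted perfquery output matters for any monitoring
script: PortXmitData, PortRcvData, PortXmitPkts and PortRcvPkts all read
4294967295, which is 2^32 - 1, the largest value a 32-bit counter can hold.
A counter pinned there no longer gives usable deltas until it is reset
(the 64-bit extended counters, which perfquery reads with -x as far as I
can tell from its help text, avoid this in practice). Below is a minimal
sketch in Python that parses "Name:....value" lines like the ones quoted
above and reports pinned counters. The function names parse_counters and
saturated are hypothetical, and the SAMPLE text is copied from the quoted
output, not from a new measurement.

```python
import re

# Largest value a 32-bit counter can hold; several counters in the
# quoted perfquery output sit exactly at this value.
MAX_U32 = 2**32 - 1

# Matches lines such as "PortXmitWait:....................2503387680"
LINE_RE = re.compile(r"^(\w+):\.+(\S+)$")


def parse_counters(text):
    """Return {name: int} for every 'Name:....value' line in text.

    Hypothetical helper: lines that do not fit the pattern (headers,
    blank lines) are skipped silently.
    """
    counters = {}
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()  # drop mail quote markers
        match = LINE_RE.match(line)
        if not match:
            continue
        name, value = match.groups()
        # Selector fields are printed in hex (0x0000), counters in decimal.
        if value.lower().startswith("0x"):
            counters[name] = int(value, 16)
        else:
            counters[name] = int(value)
    return counters


def saturated(counters):
    """Return the sorted names of counters pinned at the 32-bit maximum."""
    return sorted(name for name, value in counters.items() if value == MAX_U32)


# A few lines copied from the quoted perfquery output.
SAMPLE = """\
>> # Port counters: Lid 199 port 1 (CapMask: 0x1400)
>> CounterSelect:...................0x0000
>> PortXmitDiscards:................0
>> PortXmitData:....................4294967295
>> PortRcvData:.....................4294967295
>> PortXmitWait:....................2503387680
"""

if __name__ == "__main__":
    for name in saturated(parse_counters(SAMPLE)):
        print(f"{name} is pinned at {MAX_U32}; reset it before computing deltas")
```

In real use the text would come from running perfquery (for example via
subprocess) instead of the SAMPLE string. This does not answer the
sampling-control question; it only covers the plain port counters.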