[Users] Re: how to use sample control for port counters (Black.S)

Black.S Black.S52 at yandex.com
Wed Jan 13 09:42:43 PST 2016


12.01.2016 20:40, Black.S wrote:
> 11.01.2016 23:59, Coulter, Susan K wrote:
>>
>> The perfquery command will tell you about the throughput and errors 
>> on a port.
>> The smpquery command will tell you more of what you might want to 
>> know about a port.
>> They come from the infiniband-diags module / RPM.
>> Examples are below.
>>
>> Another option is to use the PerfMgr which is part of the OpenSM code 
>> base, if you are running osm on a host.
>> There is an example of that below too, using the default port/localhost.
>>
>> [root at mu-master ~]# perfquery 199 1
>> # Port counters: Lid 199 port 1 (CapMask: 0x1400)
>> PortSelect:......................1
>> CounterSelect:...................0x0000
>> SymbolErrorCounter:..............0
>> LinkErrorRecoveryCounter:........0
>> LinkDownedCounter:...............0
>> PortRcvErrors:...................0
>> PortRcvRemotePhysicalErrors:.....0
>> PortRcvSwitchRelayErrors:........0
>> PortXmitDiscards:................0
>> PortXmitConstraintErrors:........0
>> PortRcvConstraintErrors:.........0
>> CounterSelect2:..................0x00
>> LocalLinkIntegrityErrors:........0
>> ExcessiveBufferOverrunErrors:....0
>> VL15Dropped:.....................0
>> PortXmitData:....................4294967295
>> PortRcvData:.....................4294967295
>> PortXmitPkts:....................4294967295
>> PortRcvPkts:.....................4294967295
>> PortXmitWait:....................2503387680
>>
>>
>> [root at mu-master ~]# smpquery pi -L 199
>> # Port info: Lid 199 port 0
>> Mkey:............................<not displayed>
>> GidPrefix:.......................0xfe80000000000000
>> Lid:.............................199
>> SMLid:...........................250
>> CapMask:.........................0x2510868
>> IsTrapSupported
>> IsAutomaticMigrationSupported
>> IsSLMappingSupported
>> IsSystemImageGUIDsupported
>> IsCommunicatonManagementSupported
>> IsVendorClassSupported
>> IsCapabilityMaskNoticeSupported
>> IsClientRegistrationSupported
>> DiagCode:........................0x0000
>> MkeyLeasePeriod:.................0
>> LocalPort:.......................1
>> LinkWidthEnabled:................1X or 4X
>> LinkWidthSupported:..............1X or 4X
>> LinkWidthActive:.................4X
>> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> LinkState:.......................Active
>> PhysLinkState:...................LinkUp
>> LinkDownDefState:................Polling
>> ProtectBits:.....................2
>> LMC:.............................0
>> LinkSpeedActive:.................10.0 Gbps
>> LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
>> NeighborMTU:.....................4096
>> SMSL:............................0
>> VLCap:...........................VL0-3
>> InitType:........................0x00
>> VLHighLimit:.....................0
>> VLArbHighCap:....................8
>> VLArbLowCap:.....................8
>> InitReply:.......................0x00
>> MtuCap:..........................4096
>> VLStallCount:....................0
>> HoqLife:.........................31
>> OperVLs:.........................VL0-3
>> PartEnforceInb:..................0
>> PartEnforceOutb:.................0
>> FilterRawInb:....................0
>> FilterRawOutb:...................0
>> MkeyViolations:..................0
>> PkeyViolations:..................0
>> QkeyViolations:..................0
>> GuidCap:.........................128
>> ClientReregister:................0
>> McastPkeyTrapSuppressionEnabled:.0
>> SubnetTimeout:...................18
>> RespTimeVal:.....................16
>> LocalPhysErr:....................8
>> OverrunErr:......................8
>> MaxCreditHint:...................0
>> RoundTrip:.......................0
>> CapabilityMask2:.................0x0000
>> LinkSpeedExtActive:..............No Extended Speed
>> LinkSpeedExtSupported:...........0
>> LinkSpeedExtEnabled:.............0
>>
>>
>> [root at mu-master ~]# telnet localhost 10000
>> Trying 127.0.0.1...
>> Connected to localhost.
>> Escape character is '^]'.
>> OpenSM $ ?
>> ? : Command not found
>>
>> Supported commands and syntax:
>> help [<command>]
>> quit (not valid in local mode; use ctl-c)
>> loglevel [<log-level>]
>> permodlog
>> priority [<sm-priority>]
>> resweep [heavy|light]
>> reroute
>> sweep [on|off]
>> status [loop]
>> logflush [on|off] -- toggle opensm.log file flushing
>> querylid lid -- print internal information about the lid specified
>> portstatus [ca|switch|router]
>> switchbalance [verbose] [guid]
>> lidbalance [switchguid]
>> dump_conf
>> update_desc
>> version -- print the OSM version
>> perfmgr(pm) [enable|disable
>> |clear_counters|dump_counters|print_counters(pc)|print_errors(pe)
>> |set_rm_nodes|clear_rm_nodes|clear_inactive
>> |set_query_cpi|clear_query_cpi
>> |dump_redir|clear_redir
>> |sweep|sweep_time[seconds]]
>> dump_portguid [file filename] regexp1 [regexp2 [regexp3 ...]] -- Dump 
>> port GUID matching a regexp
>>
>>
>>
>>> On Dec 26, 2015, at 12:40 PM, Black.S <Black.S52 at yandex.com> wrote:
>>>
>>>
>>>
>>> Hello all
>>>
>>> I want to monitor ports in an IB fabric.
>>>
>>> I happened to notice in the output of perfquery some counters such as ticks
>>> and port sampling. These would be great for accurate monitoring, but I cannot
>>> find any information about setting up and using sampling control for IB ports.
>>>
>>> How do I configure and use sampling control for IB ports? Can anybody give
>>> me documentation, links, or examples? I cannot find any on Google.
>>> Are there any restrictions on the amount of data collected in this
>>> sampling mode?
>>>
>>> If this is off-topic, please redirect me.
>>>
>>> Thanks for your time.
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.openfabrics.org <mailto:Users at lists.openfabrics.org>
>>> http://lists.openfabrics.org/mailman/listinfo/users
>>
>> ==================================================
>> Susan Coulter
>> HPC Network Technical Lead
>> (505) 667-8425
>> “Once in a while you get shown the light
>>     In the strangest of places if you look at it right” /Robert Hunter/
>> ==================================================
>>
>>
>>
>>
>>
Hi Susan Coulter,

Thank you very much for your reply.

This is very interesting, especially PerfMgr. As I understand it,
PerfMgr can dump all counters very quickly (I hope faster than querying
them one by one in a loop). Thanks for this info.
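
For the one-by-one case, the text that perfquery prints (as in the output quoted above) can be turned into numbers with a few lines of Python. This is only a sketch: the helper names parse_counters and saturated are made up for illustration, the sample text is copied from the quoted output, and it assumes the "Name:....value" layout shown there. A 32-bit counter that reads 4294967295 (0xFFFFFFFF) is at its maximum, so the sketch flags it.

```python
import re

# Sample copied from the perfquery output quoted earlier in this thread.
SAMPLE = """\
# Port counters: Lid 199 port 1 (CapMask: 0x1400)
PortSelect:......................1
CounterSelect:...................0x0000
SymbolErrorCounter:..............0
PortXmitData:....................4294967295
PortRcvData:.....................4294967295
PortXmitWait:....................2503387680
"""

# Assumed layout: "Name:" followed by filler dots and then the value.
LINE_RE = re.compile(r"^(\w+):\.*(\S+)$")
U32_MAX = 0xFFFFFFFF  # largest value a 32-bit counter can hold


def parse_counters(text):
    """Return {counter name: int} for every 'Name:....value' line."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # header lines and non-counter lines are skipped
        name, raw = match.groups()
        try:
            counters[name] = int(raw, 0)  # base 0 accepts decimal and 0x hex
        except ValueError:
            continue  # value is not numeric
    return counters


def saturated(counters):
    """Names of counters sitting at the 32-bit maximum."""
    return sorted(name for name, value in counters.items() if value == U32_MAX)


if __name__ == "__main__":
    parsed = parse_counters(SAMPLE)
    print(parsed["PortXmitWait"])
    print(saturated(parsed))
```

If the data counters are pinned like this, the extended 64-bit counters (perfquery -x) may be a better source, provided the device supports them.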

Nevertheless, can you or somebody else explain how to use the sampling
control mechanism with perfquery?

"perfquery -c, --smplctl    show port samples control"

In this mode, I would like to collect all available counters of the IB
port together with a timestamp provided by the "tick" counter on the device.

A timestamp from the tick counter would make the measurements independent
of the clock on the host that collects the counters from the IB port.
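
For comparison, the host-timestamp approach that the tick counter would replace can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested monitoring tool: the function name xmit_bytes_per_second is hypothetical, the timestamps are taken on the collecting host (not from the device), and the readings in the example are made up. PortXmitData counts in units of 4 octets, so the difference is multiplied by 4; a reading pinned at the 32-bit maximum is treated as unusable.

```python
U32_MAX = 0xFFFFFFFF  # a 32-bit port counter stops at this value
OCTETS_PER_UNIT = 4   # PortXmitData is counted in units of 4 octets


def xmit_bytes_per_second(prev, curr):
    """Average transmit rate between two (host_time_seconds, PortXmitData) readings.

    Returns None when the rate cannot be trusted: a saturated counter,
    a counter that went backwards (e.g. it was reset), or no elapsed time.
    """
    t0, c0 = prev
    t1, c1 = curr
    if c0 >= U32_MAX or c1 >= U32_MAX:
        return None
    if c1 < c0 or t1 <= t0:
        return None
    return (c1 - c0) * OCTETS_PER_UNIT / (t1 - t0)


if __name__ == "__main__":
    # Made-up readings: 250000 units sent over 2 seconds of host time.
    first = (100.0, 1_000_000)
    second = (102.0, 1_250_000)
    print(xmit_bytes_per_second(first, second))
```

In practice the host time would be taken (for example with time.monotonic()) as close as possible to each perfquery call; the remaining jitter in that host-side timing is exactly what a device-provided tick would avoid.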

Sorry, I am new to mailing lists.

