VTune Analysis on Android: A Glossary of Terms in Intel VTune Analysis Results


KyrieXu11


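This article explains the key performance metrics reported for a program run, including Elapsed Time, CPU Time, Instructions Retired, and Synchronization Context Switches, and shows how these metrics can guide code optimization and thread scheduling.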


Elapsed Time:

The total time your target ran, calculated as follows:

Wall clock time at end of application - Wall clock time at start of application
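The subtraction above can be reproduced by hand for any piece of code. The following is a minimal sketch, assuming Python's `time.perf_counter` as the wall clock; `measure_elapsed` is a hypothetical helper and is not part of VTune.

```python
import time


def measure_elapsed(workload):
    """Return (result, elapsed_seconds) for a callable, using a wall clock."""
    start = time.perf_counter()   # wall clock at start
    result = workload()
    end = time.perf_counter()     # wall clock at end
    return result, end - start


result, elapsed = measure_elapsed(lambda: sum(range(100_000)))
print(f"result={result}, elapsed={elapsed:.6f} s")
```

The value measured this way includes time the program spent waiting or sleeping, which is why it can differ from CPU time.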

CPU Time:

Active processor Self time spent in the function. For multiple threads, CPU time is summed up. By default, the Self time is provided in seconds. The blue bar is a visual indicator of the CPU time usage: the longer the bar, the higher the value.

In the Summary window, CPU time is the overall time that all processors spent working for the application. If there are multiple cores, the times are added. For example, if core 1 spends 4 seconds working for the application and core 2 spends 7 seconds, the CPU time is 11 seconds. The CPU time can therefore be greater than the Elapsed time. The upper bound for CPU time is Elapsed time * number of logical cores.
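The two-core example and the upper bound can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, and the 7-second elapsed time is an assumption (the run is taken to last exactly as long as the busiest core).

```python
def total_cpu_time(per_core_seconds):
    """CPU time as shown in a summary: per-core busy times added together."""
    return sum(per_core_seconds)


def cpu_time_upper_bound(elapsed_seconds, logical_cores):
    """CPU time cannot exceed elapsed time multiplied by the logical core count."""
    return elapsed_seconds * logical_cores


per_core = [4.0, 7.0]   # core 1 and core 2, from the example above
elapsed = 7.0           # assumed: the run lasted as long as the busiest core
cpu_time = total_cpu_time(per_core)
bound = cpu_time_upper_bound(elapsed, len(per_core))
print(f"CPU time {cpu_time} s, elapsed {elapsed} s, upper bound {bound} s")
```

Under these assumptions the CPU time exceeds the elapsed time while staying below the bound.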

Instructions Retired:

Modern processors execute many more instructions than the program flow needs. This is called "speculative execution" and includes, for example, branch prediction and computing results ahead of time. The instructions that are then "proven" to be needed by the flow are "retired". You can think of "retired" instructions as only the instructions needed by the program flow.

I guess "retired instructions" means those instructions that are actually executed and completed by the CPU. The CPU makes some kind of prediction about the instructions to be executed and puts them into something like a "pool", but not all of these instructions end up being completed.

CPI Rate:

The Clockticks per Instructions Retired (CPI) event ratio, also known as Cycles per Instruction, is one of the basic performance metrics for the collection. This ratio is calculated as Clockticks / Instructions Retired, that is, the average number of clock cycles needed to complete one instruction.

When you want to determine where to focus your performance tuning effort, the CPI is the first metric to check. A good CPI rate indicates that the code is executing optimally.

As a general guide, these numbers have been derived from experienced performance engineers:

Good: 0.75

Poor: [value missing]

A high value for this ratio indicates that, over the current code region, instructions are taking a high number of processor clocks to execute. This could indicate a problem if most of the instructions are not predominantly high-latency instructions and/or coming from microcode ROM. In this case there may be opportunities to modify your code to improve the efficiency with which instructions are executed within the processor.

Modern processors can usually execute several instructions per clock cycle. For example, if four instructions are executed in one clock cycle, the theoretical best CPI is 0.25.
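The ratio and the theoretical best value can be worked through numerically. This is a sketch with made-up event counts; the function names are hypothetical, and it assumes that the 0.75 figure quoted in the guide above is the "good" value.

```python
def cpi(clockticks, instructions_retired):
    """CPI = Clockticks / Instructions Retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return clockticks / instructions_retired


def best_theoretical_cpi(instructions_per_cycle):
    """If the core can retire N instructions per cycle, the best CPI is 1/N."""
    return 1.0 / instructions_per_cycle


GOOD_CPI = 0.75  # assumption: the 'good' guide value quoted in the text

# Hypothetical event counts for one code region.
sample_cpi = cpi(clockticks=3_000_000, instructions_retired=2_000_000)
print(f"CPI = {sample_cpi}, best for a 4-wide core = {best_theoretical_cpi(4)}")
if sample_cpi > GOOD_CPI:
    print("Above the guide value: this region is worth a closer look.")
```

A region whose CPI is far above the guide value is a candidate for tuning, unless it is dominated by instructions that are slow by nature.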

Synchronization Context Switches:

Number of times a thread was switched off a processor because of making an explicit call to a thread synchronization API. For example, when threads wait on a synchronization object that is already occupied by another thread, the number of synchronization context switches characterizes the level of contention between threads.

Wait Count:

Number of times the corresponding system wait API was called. For a lock, it is the number of times the lock was contended and caused a wait.
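The idea of counting contended lock acquisitions can be imitated in user code. The sketch below is not how VTune collects the metric: `CountingLock` is a hypothetical wrapper around `threading.Lock` that first tries to take the lock without waiting and counts every attempt that fails.

```python
import threading
import time


class CountingLock:
    """Lock wrapper that counts acquisitions that had to wait (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stats = threading.Lock()   # protects the two counters
        self.acquisitions = 0
        self.wait_count = 0

    def acquire(self):
        if not self._lock.acquire(blocking=False):   # first try without waiting
            with self._stats:
                self.wait_count += 1                 # contended: a wait is needed
            self._lock.acquire()                     # block until the lock is free
        with self._stats:
            self.acquisitions += 1

    def release(self):
        self._lock.release()


lock = CountingLock()
lock.acquire()                 # main thread takes the lock, uncontended


def worker():
    lock.acquire()             # has to wait: main still holds the lock
    lock.release()


t = threading.Thread(target=worker)
t.start()
while lock.wait_count == 0:    # keep holding until the worker is seen waiting
    time.sleep(0.001)
lock.release()
t.join()
print(f"acquisitions={lock.acquisitions}, wait_count={lock.wait_count}")
```

Here the main thread deliberately holds the lock until the worker has been observed waiting, so exactly one of the two acquisitions is counted as contended.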

Wait Rate:

Average Wait time (in milliseconds) per synchronization context switch. A low metric value may signal increased contention between threads and inefficient use of system APIs.
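As an average, this metric is the total wait time divided by the number of synchronization context switches. The following sketch uses invented numbers; `wait_rate_ms` is a hypothetical helper, and returning 0.0 when there were no switches is a choice made for this example only.

```python
def wait_rate_ms(total_wait_seconds, sync_context_switches):
    """Average wait time in milliseconds per synchronization context switch."""
    if sync_context_switches == 0:
        return 0.0   # nothing to average; a convention chosen for this sketch
    return total_wait_seconds * 1000.0 / sync_context_switches


# Hypothetical: 0.5 s of waiting spread over 1000 switches (many short waits).
rate = wait_rate_ms(total_wait_seconds=0.5, sync_context_switches=1000)
print(f"wait rate = {rate} ms per switch")
```

Many very short waits give a low rate, which is the pattern the definition flags as a sign of heavy contention.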

Estimated Call Count:

Statistical estimation of call counts based on hardware events.

Wait Time:

Duration of a thread's inactivity due to contended synchronization.

Inactive Time:

Time during which a thread remained preempted from execution. Note that many threads can be inactive at any given point in time, so the sum of the Wait and Inactive times of those threads can be much greater than the Total time of program execution.
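A small worked example shows why the summed times can exceed the run time. All numbers are hypothetical: four threads in a run that lasted 10 seconds, each with its own Wait and Inactive time.

```python
elapsed = 10.0  # hypothetical total run time in seconds

# Hypothetical per-thread times; no single thread exceeds the elapsed time.
threads = {
    "worker-1": {"wait": 6.0, "inactive": 3.0},
    "worker-2": {"wait": 6.0, "inactive": 3.0},
    "worker-3": {"wait": 6.0, "inactive": 3.0},
    "worker-4": {"wait": 6.0, "inactive": 3.0},
}

per_thread = {name: t["wait"] + t["inactive"] for name, t in threads.items()}
total = sum(per_thread.values())
print(f"sum of Wait + Inactive = {total} s, elapsed = {elapsed} s")
```

Each thread is idle for at most the length of the run, but the per-thread values overlap in time, so their sum is several times the elapsed time.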

Overhead Time:

Duration that starts with the release of a shared resource and ends with the receipt of that resource. Ideally, overhead time is very short, because a short overhead reduces the time a thread has to wait to acquire a resource.

Spin Time:

Wait Time during which the CPU is busy. This often occurs when a synchronization API causes the CPU to poll while the software thread is waiting. Some Spin Time may be preferable to the alternative of increased thread context switches. Too much Spin Time, however, can reflect lost opportunity for productive work.
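The difference between a wait that keeps the CPU busy and one that does not can be observed with a rough experiment. This sketch compares a polling loop with `time.sleep` and reads the CPU time consumed by the process from `time.process_time`; the helper names are hypothetical and the exact numbers depend on the machine.

```python
import time


def spin_wait(duration):
    """Busy-wait: poll the clock in a loop until the duration has passed."""
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass


def blocking_wait(duration):
    """Blocking wait: the thread is descheduled and uses almost no CPU."""
    time.sleep(duration)


def cpu_seconds_used(wait_fn, duration):
    """CPU time consumed by this process while wait_fn runs."""
    start = time.process_time()
    wait_fn(duration)
    return time.process_time() - start


spin_cpu = cpu_seconds_used(spin_wait, 0.1)
block_cpu = cpu_seconds_used(blocking_wait, 0.1)
print(f"spin: {spin_cpu:.3f} s CPU, blocking: {block_cpu:.3f} s CPU")
```

Both waits last about the same wall-clock time, but only the polling loop accumulates CPU time, which is the cost the definition describes as lost opportunity for productive work.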

Idle Time:

Duration while a thread remained inactive (for any reason) and the system did not have any other task to execute (was idle). The Idle time is always less than both the Wait time and the Inactive time.
