Linux real-time performance testing tool: using and analyzing Cyclictest

The Cyclictest tool is documented on the wiki: https://rt.wiki.kernel.org/index.php/Cyclictest. The sections below reproduce parts of that wiki page and analyze them alongside practical usage.


  Cyclictest is a high resolution test program, written by Thomas Gleixner (tglx), maintained by Clark Williams and John Kacur.

Documentation

Installation

  Get the latest sources from the git repository with git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git, or fetch a released tarball from the archive, untar it into a directory of your choice, and run make in the source directory. To cross compile, set CROSS_COMPILE to the toolchain prefix (for example make CROSS_COMPILE=arm-v4t-linux-gnueabi-).
  You can run the resulting binary from there or install it.

lgs@f11#> git clone git://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git 
lgs@f11#> cd rt-tests
lgs@f11#> make all
lgs@f11#> cp ./cyclictest /usr/bin/
lgs@f11#> cyclictest --help

NOTE!
libnuma is required to build cyclictest. Usually, it’s safe to have libnuma installed also in non-numa systems, but if you don’t want to install the numa libs (e.g. in embedded environment) then compile with make NUMA=0.

Run it

Make sure to be root or use sudo to run cyclictest.
Without parameters cyclictest creates one thread with a 1ms interval timer.
cyclictest --help prints help text for the various options. (In the version shown below, -h is the histogram option.)

[lgs@f11 rt-tests]# ./cyclictest  --help
cyclictest V 0.42
Usage:
cyclictest <options>

-a [NUM] --affinity        run thread #N on processor #N, if possible
                           with NUM pin all threads to the processor NUM
-b USEC  --breaktrace=USEC send break trace command when latency > USEC
-B       --preemptirqs     both preempt and irqsoff tracing (used with -b)
-c CLOCK --clock=CLOCK     select clock
                           0 = CLOCK_MONOTONIC (default)
                           1 = CLOCK_REALTIME
-C       --context         context switch tracing (used with -b)
-d DIST  --distance=DIST   distance of thread intervals in us default=500
-E       --event           event tracing (used with -b)
-f       --ftrace          function trace (when -b is active)
-i INTV  --interval=INTV   base interval of thread in us default=1000
-I       --irqsoff         Irqsoff tracing (used with -b)
-l LOOPS --loops=LOOPS     number of loops: default=0(endless)
-m       --mlockall        lock current and future memory allocations
-n       --nanosleep       use clock_nanosleep
-N       --nsecs           print results in ns instead of ms (default ms)
-o RED   --oscope=RED      oscilloscope mode, reduce verbose output by RED
-O TOPT  --traceopt=TOPT    trace option
-p PRIO  --prio=PRIO       priority of highest prio thread
-P       --preemptoff      Preempt off tracing (used with -b)
-q       --quiet           print only a summary on exit
-r       --relative        use relative timer instead of absolute
-s       --system          use sys_nanosleep and sys_setitimer
-T TRACE --tracer=TRACER   set tracing function
    configured tracers: unavailable (debugfs not mounted)
-t       --threads         one thread per available processor
-t [NUM] --threads=NUM     number of threads:
                           without NUM, threads = max_cpus
                           without -t default = 1
-v       --verbose         output values on stdout for statistics
                           format: n:c:v n=tasknum c=count v=value in us
-D       --duration=t      specify a length for the test run
                           default is in seconds, but 'm', 'h', or 'd' maybe added
                           to modify value to minutes, hours or days
-h       --histogram=US    dump a latency histogram to stdout after the run
                           US is the max time to be be tracked in microseconds
-w       --wakeup          task wakeup tracing (used with -b)
-W       --wakeuprt        rt task wakeup tracing (used with -b)

-b is a debugging option to control the latency tracer in the realtime preemption patch.
It is useful for tracking down unexpectedly large latencies on a system. This option works only when the following kernel configuration options are enabled:

  • CONFIG_PREEMPT_RT=y
  • CONFIG_WAKEUP_TIMING=y
  • CONFIG_LATENCY_TRACE=y
  • CONFIG_CRITICAL_PREEMPT_TIMING=y
  • CONFIG_CRITICAL_IRQSOFF_TIMING=y

The USEC parameter to the -b option defines a maximum latency value, which is compared against the actual latencies of the test. Once the measured latency is higher than the given maximum, the kernel tracer and cyclictest are stopped. The trace can be read from /proc/latency_trace:
mybox# cat /proc/latency_trace >trace.log
Please be aware that the tracer adds significant overhead to the kernel, so the latencies will be much higher than on a kernel with latency tracing disabled.
-c CLOCK selects the clock to use

  • 0 selects CLOCK_MONOTONIC, the monotonically increasing system
    time. This is the default selection.
  • 1 selects CLOCK_REALTIME, the time-of-day clock.

CLOCK_REALTIME can be set by settimeofday, while CLOCK_MONOTONIC cannot be modified by the user.
This option has no influence when the -s option is given.
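
The two clocks can also be read from user space. The following is a minimal sketch in Python, not taken from cyclictest; the helper name sample_clocks is made up for illustration. It reads both clocks through time.clock_gettime before and after a short sleep:

```python
import time


def sample_clocks():
    """Return (monotonic, realtime) readings in seconds."""
    mono = time.clock_gettime(time.CLOCK_MONOTONIC)
    real = time.clock_gettime(time.CLOCK_REALTIME)
    return mono, real


m1, r1 = sample_clocks()
time.sleep(0.01)
m2, r2 = sample_clocks()

# CLOCK_MONOTONIC never goes backwards. CLOCK_REALTIME normally advances by
# about the same amount, but it can jump if the wall clock is set meanwhile.
print(f"monotonic advanced by {(m2 - m1) * 1e6:.0f} us")
print(f"realtime  advanced by {(r2 - r1) * 1e6:.0f} us")
```

This is why a latency measurement normally prefers CLOCK_MONOTONIC: a change to the time of day cannot distort the measured intervals.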
-d DIST sets the distance of thread intervals in microseconds (default is 500us)
When cyclictest is called with the -t option and more than one thread is created, this distance value is added to the interval of each further thread.
Interval(thread N) = Interval(thread N-1) + DIST
-i INTV set the base interval of the thread(s) in microseconds (default is 1000us)
This sets the interval of the first thread. See also -d.
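
As an illustration of how -i and -d combine, here is a small Python sketch that applies the formula Interval(thread N) = Interval(thread N-1) + DIST. The function name thread_intervals is hypothetical and the code is not taken from cyclictest itself:

```python
def thread_intervals(base_us=1000, distance_us=500, threads=1):
    """Interval in microseconds of each test thread, first thread first.

    base_us corresponds to -i, distance_us to -d and threads to -t.
    """
    return [base_us + n * distance_us for n in range(threads)]


# Four threads with the default -i and -d values.
print(thread_intervals(base_us=1000, distance_us=500, threads=4))
# [1000, 1500, 2000, 2500]
```

Using different intervals per thread keeps the threads from always waking at the same instant.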
-l LOOPS set the number of loops (default = 0(endless))
This option is useful for automated tests with a given number of test cycles. cyclictest is stopped once the number of timer intervals has been reached.
-n use clock_nanosleep instead of posix interval timers
Setting this option runs the tests with clock_nanosleep instead of posix interval timers.
-p PRIO set the priority of the first thread
The given priority is set for the first test thread. Each further thread gets a lower priority:
Priority(Thread N) = max(Priority(Thread N-1) - 1, 0)
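
The priority assignment can be sketched the same way. This Python snippet assumes each further thread gets a priority one lower than the previous one, with a floor at 0, which matches the ps output in the FAQ below (120, 119, 118, ... for -p 80). The function name thread_priorities is made up for illustration:

```python
def thread_priorities(first_prio, threads):
    """Priority of each test thread, first thread first.

    Assumption: every further thread is one step lower than the previous
    one and the value never drops below 0.
    """
    prios = []
    prio = first_prio
    for _ in range(threads):
        prios.append(prio)
        prio = max(prio - 1, 0)
    return prios


print(thread_priorities(80, 5))
# [80, 79, 78, 77, 76]
```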
-q run the tests quiet and print only a summary on exit
Useful for automated tests, where only the summary output needs to be captured
-r use relative timers instead of absolute
The default behaviour of the tests is to use absolute timers. This option is there for completeness and should not be used for reproducible tests.
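
To show what an absolute timer means for the measurement, here is a rough Python sketch of a cyclic loop. It is an approximation only: Python's standard library has no clock_nanosleep, so the sketch sleeps for the time remaining until an absolute deadline on the monotonic clock. The function name cyclic_latencies is hypothetical, and the numbers it produces are not comparable to cyclictest results:

```python
import time


def cyclic_latencies(interval_us=1000, loops=100):
    """Wake up every interval_us on an absolute schedule and return the
    wakeup latency of each cycle in microseconds."""
    interval_ns = interval_us * 1000
    next_ns = time.monotonic_ns() + interval_ns
    latencies = []
    for _ in range(loops):
        remaining_ns = next_ns - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        now_ns = time.monotonic_ns()
        # Latency = how long after the planned deadline we actually woke up.
        latencies.append((now_ns - next_ns) / 1000)
        # Absolute schedule: advance from the planned deadline, not from
        # "now", so one late wakeup does not shift all later deadlines.
        next_ns += interval_ns
    return latencies


lat = cyclic_latencies(interval_us=1000, loops=100)
print(f"min {min(lat):.0f} us, avg {sum(lat) / len(lat):.0f} us, max {max(lat):.0f} us")
```

With a relative timer the next deadline would be computed from the actual wakeup time, so every late wakeup would accumulate as drift. That is one reason relative timers are less suitable for reproducible tests.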
-s use sys_nanosleep and sys_setitimer instead of posix timers
Note that -s can only be used with one thread because itimers are per process and not per thread. -s in combination with -n uses the nanosleep syscall and is not restricted to one thread.
-t NUM set the number of test threads (default is 1), -t without an argument makes the number of threads equal to the number of cpus
Create NUM test threads. See -d, -i and -p for further information.
-v output values on stdout for statistics
This option is used to gather statistical information about the latency distribution. The output is sent to stdout. The output format is
n:c:v
where n=task number c=count v=latency value in us
Use this option in combination with -l
The OSADL Realtime LiveCD project provides a script to plot the latency distribution.
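
If no plotting script is at hand, the n:c:v records can be summarized with a few lines of Python. This is a minimal sketch, assuming the records arrive one per line exactly as described above. The function name summarize and the sample records are invented for illustration, not real measurements:

```python
from collections import defaultdict


def summarize(lines):
    """Per-task min/avg/max latency (us) from 'n:c:v' records."""
    values = defaultdict(list)
    for line in lines:
        parts = line.strip().split(":")
        if len(parts) != 3:
            continue  # skip anything that is not an n:c:v record
        try:
            task, _count, value = (int(p) for p in parts)
        except ValueError:
            continue  # skip records with non-numeric fields
        values[task].append(value)
    return {
        task: {"min": min(v), "avg": sum(v) / len(v), "max": max(v)}
        for task, v in values.items()
    }


# Made-up sample records for two tasks.
sample = ["0:0:12", "0:1:9", "1:0:30", "0:2:15", "1:1:20"]
print(summarize(sample))
# {0: {'min': 9, 'avg': 12.0, 'max': 15}, 1: {'min': 20, 'avg': 25.0, 'max': 30}}
```

In practice the lines would come from a file holding the captured stdout of a cyclictest -v -l run.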

Expected Results

tglx’s reference machine

  All tests have been run on a Pentium III 400MHz based PC.
  The tables show comparisons of vanilla Linux 2.6.16, Linux-2.6.16-hrt5 and Linux-2.6.16-rt12. The tests for intervals less than the jiffy resolution have not been run on vanilla Linux 2.6.16. The test thread runs in all cases with SCHED_FIFO and priority 80. All numbers are in microseconds.

  • Test case: clock_nanosleep(TIME_ABSTIME), Interval 10000
    microseconds, 10000 loops, no load.

Commandline: cyclictest -t1 -p 80 -n -i 10000 -l 10000
Kernel         min    max    avg
2.6.16          24   4043   1989
2.6.16-hrt5     12     94     20
2.6.16-rt12      6     40     10

  • Test case: clock_nanosleep(TIME_ABSTIME), Interval 10000
    microseconds, 10000 loops, 100% load.

Commandline: cyclictest -t1 -p 80 -n -i 10000 -l 10000
Kernel         min    max    avg
2.6.16          55   4280   2198
2.6.16-hrt5     11    458     55
2.6.16-rt12      6     67     29

  • Test case: POSIX interval timer, Interval 10000 microseconds, 10000
    loops, no load.

Commandline: cyclictest -t1 -p 80 -i 10000 -l 10000
Kernel         min    max    avg
2.6.16          21   4073   2098
2.6.16-hrt5     22    120     35
2.6.16-rt12     20     60     31

  • Test case: POSIX interval timer, Interval 10000 microseconds, 10000
    loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 10000 -l 10000
Kernel         min    max    avg
2.6.16          82   4271   2089
2.6.16-hrt5     31    458     53
2.6.16-rt12     21     70     35

  • Test case: clock_nanosleep(TIME_ABSTIME), Interval 500
    microseconds, 100000 loops, no load.

Commandline: cyclictest -t1 -p 80 -i 500 -n -l 100000
Kernel         min    max    avg
2.6.16-hrt5      5    108     24
2.6.16-rt12      5     48      7

  • Test case: clock_nanosleep(TIME_ABSTIME), Interval 500
    microseconds, 100000 loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 500 -n -l 100000
Kernel         min    max    avg
2.6.16-hrt5      9    684     56
2.6.16-rt12     10     60     22

  • Test case: POSIX interval timer, Interval 500 microseconds, 100000
    loops, no load.

Commandline: cyclictest -t1 -p 80 -i 500 -l 100000
Kernel         min    max    avg
2.6.16-hrt5      8    119     22
2.6.16-rt12     12     78     16

  • Test case: POSIX interval timer, Interval 500 microseconds, 100000
    loops, 100% load.

Commandline: cyclictest -t1 -p 80 -i 500 -l 100000
Kernel         min    max    avg
2.6.16-hrt5     16    489     58
2.6.16-rt12     12     95     29

FAQ

ps shows the wrong scheduling class SCHED_OTHER

  Each cyclictest task consists of one or more threads. ps -ce shows only the main process, not its threads. ps -eLc | grep cyclic shows the main process and its threads with the correct scheduler class SCHED_FIFO.

#>./cyclictest -t5 -p 80 -n -i 10000

#> ps -cLe | grep cyclic
 4764  4764 TS   19 pts/1    00:00:01 cyclictest
 4764  4765 FF  120 pts/1    00:00:00 cyclictest
 4764  4766 FF  119 pts/1    00:00:00 cyclictest
 4764  4767 FF  118 pts/1    00:00:00 cyclictest
 4764  4768 FF  117 pts/1    00:00:00 cyclictest
 4764  4769 FF  116 pts/1    00:00:00 cyclictest

chrt shows the wrong scheduling class SCHED_OTHER

  Don’t use the PID of the main-process, but the pid of one of the threads from the main-process. The threads are shown with ps -cLe | grep cyclic.

#> chrt -p 4766
pid 4766's current scheduling policy: SCHED_FIFO
pid 4766's current scheduling priority: 79
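
The same per-thread query can be made programmatically. The following Python sketch uses os.sched_getscheduler and os.sched_getparam, which are available on Linux. The helper name describe_scheduling is made up, and passing a thread ID such as 4766 from the ps listing above is an assumption about how you would use it:

```python
import os

# Map the policy constants exposed by the os module to readable names.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
}


def describe_scheduling(tid=0):
    """Return (policy name, priority) for a thread ID; 0 means the caller."""
    policy = os.sched_getscheduler(tid)
    prio = os.sched_getparam(tid).sched_priority
    return POLICY_NAMES.get(policy, f"unknown({policy})"), prio


# Query the calling thread; pass a thread ID from ps -cLe to inspect
# one of the cyclictest threads instead.
print(describe_scheduling(0))
```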

taskset for CPU affinity

  The taskset command was written by Robert M. Love. SMP operating systems have choices when it comes to scheduling processes: a new or newly rescheduled process can run on any available CPU. However, while it shouldn’t matter where a new process runs, an existing process should go back to the CPU it was running on, simply because that CPU may still be caching data that belongs to the process. This is particularly likely to be true if the process is a thread: the other threads in the same program are very likely to have cached data of interest to their brethren (though obviously this also diminishes the performance gain that might be seen from multithreading). For these reasons, scheduling algorithms pay attention to CPU affinity and try to keep it constant.
  It is possible to force a process to run only on a certain CPU. There are Linux system calls (sched_setaffinity and sched_getaffinity) and a command-line tool, taskset.

lgs@f11#> taskset -c 3 top
lgs@f11#> taskset -p [pid]
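
The sched_setaffinity and sched_getaffinity system calls mentioned above are also reachable from Python through the os module on Linux. This is a small sketch under that assumption; the helper name pin_to_cpu is hypothetical:

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a process (0 = the caller) to one CPU and return the
    resulting affinity set."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


original = os.sched_getaffinity(0)  # CPUs we are currently allowed to use
first = min(original)

print("allowed before:", sorted(original))
print("allowed after pinning:", sorted(pin_to_cpu(first)))

# Restore the original affinity so the rest of the program is unaffected.
os.sched_setaffinity(0, original)
```

Pinning only to a CPU taken from the current affinity set avoids errors on systems where some CPUs are not available to the process.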

Compile failure because numa.h can’t be found

make
cc -D VERSION_STRING=0.85 -c src/cyclictest/cyclictest.c -Wall -Wno-nonnull -O2 -DNUMA -D_GNU_SOURCE -Isrc/include
In file included from src/cyclictest/cyclictest.c:37:0:
src/cyclictest/rt_numa.h:23:18: fatal error: numa.h: No such file or directory
compilation terminated.
make: *** [cyclictest.o] Error 1

  Simply install your distribution’s numa development package. On Fedora this is numactl-devel, so

su -c 'yum install numactl-devel'

  This is only required for building. It does not affect the way the test runs on non-NUMA machines.
