Reading Notes on System Performance, 2nd Edition (Part 2): Performance Observability Tools

Brief reading notes on System Performance: Enterprise and the Cloud, 2nd Edition (2020).

Chapter 4: Observability Tools

4. Observability Tools

4.1 tool coverage

[Figure: overview of tool coverage]

4.1.1 static performance tools

[Figure: static performance tools]

4.1.2 crisis tools

4.2 tool types

[Figure: tool types]

4.2.1 Counters

Kernels maintain various counters for providing system statistics. They are usually implemented
as unsigned integers that are incremented when events occur.
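
Because these counters only ever increase, a tool that reports rates has to sample a counter twice and divide the difference by the interval. A minimal Python sketch of that calculation follows; the function name `counter_rate` and the default 32-bit width are illustrative assumptions, not something taken from the book:

```python
def counter_rate(prev, curr, interval_s, width_bits=32):
    """Per-second rate between two samples of a monotonically increasing counter.

    width_bits is an assumed counter width; it is only used to correct for a
    single wraparound between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr - prev
    if delta < 0:  # the unsigned counter wrapped past its maximum
        delta += 1 << width_bits
    return delta / interval_s


# Example: a hypothetical event counter sampled 3 seconds apart.
print(counter_rate(1_000, 4_000, 3))      # 1000.0 events per second
print(counter_rate(2**32 - 10, 20, 1))    # 30.0, wraparound handled
```

The wraparound branch only covers a single wrap; if the sampling interval is long enough for the counter to wrap twice, the result would be silently wrong.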

System-wide tools

vmstat: Virtual and physical memory statistics, system-wide
mpstat: Per-CPU usage
iostat: Per-disk I/O usage, reported from the block device interface
nstat: TCP/IP stack statistics
sar: Various statistics; can also archive them for historical reporting

Per-process tools

ps: Shows process status and various process statistics, including memory and CPU usage
top: Shows top processes, sorted by CPU usage or another statistic
pmap: Lists process memory segments with usage statistics

4.2.2 Profiling

System-wide

perf: The standard Linux profiler, which includes profiling subcommands
profile: A BPF-based CPU profiler from the BCC repository (covered in Chapter 15, BPF) that frequency counts stack traces in kernel context
Intel VTune Amplifier XE: Linux and Windows profiling, with a graphical interface including source browsing

Per-process

gprof: The GNU profiling tool, which analyzes profiling information added by compilers (e.g., gcc -pg)
cachegrind: A tool from the valgrind toolkit; can profile hardware cache usage (and more) and visualize profiles using kcachegrind
Java Flight Recorder (JFR): Programming languages often have their own special-purpose profilers that can inspect language context, for example JFR for Java

4.2.3 Tracing

System-wide

tcpdump: Packet capture tool
biosnoop: Block I/O tracing (uses BCC or bpftrace)
execsnoop: New processes tracing (uses BCC or bpftrace)
perf: The standard Linux profiler, can also trace events
perf trace: A special perf subcommand that traces system calls system-wide
ftrace: The Linux built-in tracer
BCC: A BPF-based tracing library and toolkit
bpftrace: A BPF-based tracer (bpftrace(8)) and toolkit

Per-process

strace: System call tracing
gdb: A source-level debugger

4.2.4 Monitoring

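Monitoring tools typically record statistics continuously and save them so they can be analyzed later.

sar: Collect, report, or save system activity information
SNMP: Devices and operating systems can support SNMP and in some cases provide it by default, avoiding the need to install third-party agents or exporters
Agents: Software run on each system to record metrics (also called exporters)

4.3 Observation Sources

On Linux, the main sources of observability data are the /proc and /sys directories.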


[Figure: summary of Linux tracing sources]

4.3.1 The /proc file system

/proc is dynamically created by the kernel and is not backed by storage devices (it runs in memory). It is mostly read-only, providing statistics for observability tools. Some files are writeable, for controlling process and kernel behavior.

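Per-process statistics (files under /proc/PID)

[Figure: per-process files under /proc]

limits: In-effect resource limits
maps: Mapped memory regions
sched: CPU scheduler statistics
schedstat: CPU runtime, latency, and time slices
smaps: Mapped memory regions with usage statistics
stat: Process status and statistics, including total CPU and memory usage
statm: Memory usage summary in units of pages
status: stat and statm information in human-readable form
fd: Directory of symlinks for open file descriptors
cgroup: Cgroup membership information
task: Directory of per-thread statistics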
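
As a small illustration of reading one of these per-process files, the sketch below parses /proc/PID/schedstat in Python. The field order (on-CPU time in nanoseconds, run-queue wait time in nanoseconds, number of timeslices) is assumed from the kernel's sched-stats documentation, and the function names and sample values are made up for this example:

```python
from pathlib import Path


def parse_schedstat(text):
    """Parse the three fields of /proc/PID/schedstat.

    Assumed field order: on-CPU time (ns), run-queue wait time (ns),
    number of timeslices.
    """
    run_ns, wait_ns, slices = (int(field) for field in text.split())
    return {"run_ns": run_ns, "wait_ns": wait_ns, "timeslices": slices}


def read_schedstat(pid="self"):
    """Read schedstat for a PID; needs Linux with scheduler stats enabled."""
    return parse_schedstat(Path(f"/proc/{pid}/schedstat").read_text())


# Made-up sample line, so the example also runs where /proc is unavailable.
sample = "2500000000 500000000 1200\n"
stats = parse_schedstat(sample)
wait_share = stats["wait_ns"] / (stats["run_ns"] + stats["wait_ns"])
print(f"{wait_share:.1%} of runnable time was spent waiting for a CPU")
```

On a Linux system, `read_schedstat()` with no argument reads the values for the Python process itself.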

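System-wide statistics (files under /proc)

[Figure: system-wide files under /proc]

cpuinfo: Physical processor information, including every virtual CPU, vendor name, clock speed, and cache sizes
diskstats: I/O statistics for all disks
interrupts: Interrupt counters per CPU
loadavg: Load averages
meminfo: System memory usage breakdown
net/dev: Network interface statistics
net/netstat: System-wide networking statistics
net/tcp: Active TCP socket information
pressure: Pressure stall information (PSI) files; records of stalls on cpu, io, and memory, useful when analyzing problems such as OOM
schedstat: System-wide CPU scheduler statistics
self: Symlink to the current process's directory
slabinfo: Kernel slab allocator cache statistics
stat: Summary of kernel and system resource statistics: CPUs, disks, paging, swap, processes
zoneinfo: Memory zone information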
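
The aggregate "cpu" line of /proc/stat is the kind of counter set that CPU utilization reports are built from. The following Python sketch computes a busy percentage from two snapshots; the snapshot values are made up, the helper names are hypothetical, and treating iowait as idle time is one common convention rather than the only one:

```python
def parse_cpu_line(stat_text):
    """Return the aggregate "cpu" counters from /proc/stat text as a dict."""
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return dict(zip(names, map(int, fields[1:9])))
    raise ValueError("no aggregate cpu line found")


def busy_percent(before, after):
    """Percentage of non-idle CPU time between two snapshots."""
    delta = {k: after[k] - before[k] for k in before}
    total = sum(delta.values())
    idle = delta["idle"] + delta["iowait"]
    return 100.0 * (total - idle) / total if total else 0.0


# Two made-up snapshots; on Linux, read /proc/stat twice a few seconds apart.
snap1 = "cpu  1000 0 500 8000 100 0 0 0 0 0\ncpu0 500 0 250 4000 50 0 0 0 0 0\n"
snap2 = "cpu  1300 0 700 8500 100 0 0 0 0 0\ncpu0 650 0 350 4250 50 0 0 0 0 0\n"
print(busy_percent(parse_cpu_line(snap1), parse_cpu_line(snap2)))  # 50.0
```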

4.3.2 The /sys file system

Unlike /proc, /sys was originally designed to provide device driver statistics, but it has since been extended to include any type of statistic.
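
Files under /sys usually hold a single value or a short list. As an example, /sys/devices/system/cpu/online contains a CPU list such as "0-3,8". A minimal Python sketch for expanding that format (the function name `parse_cpu_list` is an assumption made for this example):

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


print(parse_cpu_list("0-3,8\n"))  # [0, 1, 2, 3, 8]

online = Path("/sys/devices/system/cpu/online")
if online.exists():  # only present on Linux
    print(len(parse_cpu_list(online.read_text())), "CPUs online")
```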

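4.3.3 Delay accounting

With CONFIG_TASK_DELAY_ACCT enabled, the kernel tracks the time each task spends in the following states:

  • Scheduler latency: waiting for a turn on-CPU
  • Block I/O: waiting for block I/O to complete
  • Swapping: waiting for paging (memory pressure)
  • Memory reclaim: waiting for the memory reclaim routine

This is documented in the kernel source under Documentation/accounting/delay-accounting.txt, and tools/accounting/getdelays.c is an example consumer.

[Figure: example delay accounting output]

This data was collected on a heavily loaded system, and the CPU scheduler latency is severe.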

4.3.4 netlink

netlink is one of the mechanisms for communication between user space and the kernel; generic netlink (genetlink) is easier to extend.

4.3.5 tracepoints

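Tracepoints are hard-coded instrumentation points placed at logical locations in kernel code.

Examples:

There are tracepoints at the start and end of system calls, and for scheduler events, file system operations, and disk I/O. Some tracepoints require kernel support to be enabled; for example, CONFIG_RCU_TRACE enables the rcu tracepoints.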
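
The tracepoints a given kernel offers are listed, one "subsystem:event" name per line, in the available_events file of tracefs (typically /sys/kernel/tracing/available_events, readable by root). The Python sketch below counts tracepoints per subsystem; it works on a short made-up excerpt so it runs anywhere, and the function name is hypothetical:

```python
from collections import Counter


def count_by_subsystem(available_events_text):
    """Count tracepoints per subsystem from "subsystem:event" lines."""
    counts = Counter()
    for line in available_events_text.splitlines():
        subsystem, sep, _event = line.strip().partition(":")
        if sep:
            counts[subsystem] += 1
    return counts


# Short made-up excerpt; the real file lists every tracepoint the kernel has.
sample = """\
syscalls:sys_enter_openat
syscalls:sys_exit_openat
sched:sched_switch
block:block_rq_issue
block:block_rq_complete
"""
for subsystem, n in sorted(count_by_subsystem(sample).items()):
    print(subsystem, n)
```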

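Tracepoint overhead

Once activated, tracepoints add CPU overhead to each event, plus the cost of post-processing such as writing records to a file. Whether this extra overhead perturbs the performance data of interest has to be judged case by case.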
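
A rough way to reason about this is to multiply the event rate by the cost per event. The Python sketch below does that arithmetic; the 1 microsecond per-event cost is a placeholder assumption, not a measurement, and the function name is made up for this example:

```python
def tracing_cpu_overhead(events_per_sec, cost_per_event_us):
    """Fraction of one CPU consumed by instrumentation.

    Both inputs are assumptions to be measured on the target system.
    """
    return events_per_sec * cost_per_event_us / 1_000_000


# Hypothetical numbers: 1 microsecond per event is a placeholder.
for rate in (1_000, 100_000, 1_000_000):
    print(f"{rate:>9} events/s -> {tracing_cpu_overhead(rate, 1.0):.1%} of one CPU")
```

Under these assumed costs, low-frequency events are negligible while events that fire hundreds of thousands of times per second are not, which is why the event rate should be checked before tracing.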

4.3.6 kprobes

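kprobes (short for kernel probes) is a Linux kernel event source for tracers based on dynamic instrumentation.

kprobes can trace any kernel function or instruction.

How kprobes work: the standard method is to modify the instruction text of the running kernel to insert instrumentation where it is needed. When instrumenting the entry of a function, the existing Ftrace function tracing can be used instead, which has lower overhead.

Comparison of kprobes and tracepoints:

[Figure: kprobes compared with tracepoints]

kprobes can inspect function arguments; kretprobes can inspect function return values.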
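
One interface to kprobes is the kprobe_events file in tracefs, where each line defines a probe: "p" for a kprobe at function entry and "r" for a kretprobe at function return. The Python sketch below only builds such definition lines; it does not write them to the kernel. The event names are hypothetical, do_sys_open is just an example symbol that may not exist on every kernel, and the `$arg1` fetch argument is assumed to be supported by the running kernel:

```python
def kprobe_definition(event, symbol, fetch_args=(), on_return=False):
    """Build one line in the tracefs kprobe_events syntax.

    "p" places a kprobe at function entry, "r" a kretprobe at function return.
    """
    kind = "r" if on_return else "p"
    return " ".join([f"{kind}:{event}", symbol, *fetch_args])


# Hypothetical event names; do_sys_open is only an example kernel symbol.
print(kprobe_definition("open_entry", "do_sys_open", ["dfd=$arg1"]))
print(kprobe_definition("open_return", "do_sys_open", ["ret=$retval"], on_return=True))
```

In practice, front ends such as perf, BCC, and bpftrace create and remove these probes for you.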

4.3.7 uprobes

uprobes (user-space probes) are similar to kprobes, but for user-space.

4.3.8 USDT

User-level statically-defined tracing (USDT) is the user-space version of tracepoints.

4.3.9 Hardware Counters (PMCs)

The processor and other devices commonly support hardware counters for observing activity. The main source are the processors, where they are commonly called performance monitoring counters (PMCs). They are known by other names as well: CPU performance counters (CPCs), performance instrumentation counters (PICs), and performance monitoring unit events (PMU events). These all refer to the same thing: programmable hardware registers on the processor that provide low-level performance information at the CPU cycle level.

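In short, PMCs are programmable hardware registers on the processor that provide performance information at the CPU cycle level.

Challenges with PMCs:

  • Accuracy of overflow sampling
  • Availability in cloud environments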
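
Overflow sampling means programming a counter so that it overflows, and raises an interrupt, once every N events. A minimal Python sketch of the arithmetic behind it; the 48-bit counter width is an assumption (real widths vary by processor) and both function names are made up for this example:

```python
def overflow_preset(period, width_bits=48):
    """Initial counter value that overflows after `period` more events."""
    if not 0 < period < (1 << width_bits):
        raise ValueError("period must fit in the counter")
    return (1 << width_bits) - period


def count_samples(total_events, period):
    """Number of overflow interrupts (samples) for a run of events."""
    return total_events // period


preset = overflow_preset(10_000)
print(hex(preset))                        # value loaded into the counter
print(count_samples(1_234_567, 10_000))   # 123
```

The sketch does not model the accuracy problem itself: the interrupt is delivered some time after the overflowing event, so the recorded instruction pointer may not be the instruction that caused it.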

4.3.10 Other observation sources

MSR: model-specific registers

ptrace: a system call used by gdb for debugging and by strace for tracing

netfilter conntrack: the netfilter connection tracking mechanism

4.4 sar

# How to enable sysstat data collection (Ubuntu):
# vi /etc/default/sysstat
ENABLED="true"
root@ubuntu:~# sar -u -n TCP 3 3
Linux 5.4.0-58-generic (ubuntu) 	2020-12-28 	_x86_64_	(2 CPU)

18:05:02        CPU     %user     %nice   %system   %iowait    %steal     %idle
18:05:05        all     11.59      0.00     38.10      0.21      0.00     50.10

18:05:02     active/s passive/s    iseg/s    oseg/s
18:05:05         3.67      0.00     15.00     19.33

18:05:05        CPU     %user     %nice   %system   %iowait    %steal     %idle
18:05:08        all      9.64      0.00     32.13      3.61      0.00     54.62

18:05:05     active/s passive/s    iseg/s    oseg/s
18:05:08         4.33      0.00     92.33     95.33

18:05:08        CPU     %user     %nice   %system   %iowait    %steal     %idle
18:05:11        all     10.82      0.00     48.12      0.22      0.00     40.84

18:05:08     active/s passive/s    iseg/s    oseg/s
18:05:11         0.33      0.00     13.67     14.33

Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all     10.67      0.00     39.19      1.39      0.00     48.74

Average:     active/s passive/s    iseg/s    oseg/s
Average:         2.78      0.00     40.33     43.00
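
To post-process output like this, each row can be split on whitespace and paired with the column names from its header. The Python sketch below does this for the Average CPU row shown above; the function name is hypothetical, and it assumes the timestamp column is a single whitespace-free token (a 12-hour format with a separate AM/PM field would need a different offset):

```python
def parse_sar_cpu_row(header, row):
    """Map sar -u column names to float values for one data row."""
    names = header.split()[2:]    # skip the timestamp and "CPU" columns
    values = row.split()[2:]      # skip the timestamp and "all"
    return dict(zip(names, map(float, values)))


header = "Average:        CPU     %user     %nice   %system   %iowait    %steal     %idle"
row = "Average:        all     10.67      0.00     39.19      1.39      0.00     48.74"
avg = parse_sar_cpu_row(header, row)
busy = 100 - avg["%idle"]
print(f"busy: {busy:.2f}%")    # busy: 51.26%
print(f"kernel share of busy time: {avg['%system'] / busy:.0%}")
```

In this sample most of the busy time is %system, so kernel-side activity is where further analysis would start.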
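
4.5 Tracing tools

perf: The official Linux profiler; excels at CPU profiling (sampling) and PMC analysis, and can also instrument other events
ftrace: The official Linux tracer; needs no extra dependencies to run (some kernel CONFIG options must be enabled)
BPF: Extended BPF tools: BCC and bpftrace
SystemTap: A high-level language and tracer with many tapsets (libraries) for tracing different targets. (I have not looked into the stapbpf tool yet.)
LTTng: A tracer optimized for black-box recording: optimally recording many events for later analysis

Rule of thumb: use perf for CPU analysis, ftrace for kernel code exploration, and BCC/bpftrace for everything else (memory, file systems, disks, networking, and application tracing).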
