An Introduction to Using bcc/eBPF

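This article introduces several practical tools from the Linux BCC toolkit, including memleak.py for memory leak detection, cachetop.py for cache hit rate statistics, and deadlock.py for deadlock detection. These tools help developers monitor system state in real time and locate problems quickly. Examples show how to use the tools and how they work.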

1. Introduction to bcc/eBPF

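eBPF is part of the Linux tracing framework; for background on tracing, see the article "linux tracers使用介绍". The tracing framework lets us insert hooks into kernel-space and user-space code, and it ships with a set of predefined hook functions that cover basic debugging needs. When more flexible processing is required, eBPF lets users write their own hook functions to do things such as filtering, aggregating, and computing over the collected information.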

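bcc is a toolkit that wraps eBPF in Python to make it more convenient to use, and it ships with many ready-made tools. Its GitHub repository is https://github.com/iovisor/bcc
[Figure: bcc's built-in tools and the modules they belong to]
The figure above shows bcc's built-in tools and the modules they belong to, covering block I/O size and latency analysis, memory leak checking, network statistics, and more. See the documentation on GitHub for details.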
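
To make the idea of a user-defined hook more concrete, here is a minimal sketch of a bcc script. It is not one of the built-in tools: the names count_clone, counts, and trace_clone are made up for illustration, and the choice of the clone syscall is an assumption. The hook runs in the kernel and counts calls per process, and the Python side prints the totals afterwards.

```python
import time

# eBPF program (restricted C). bcc compiles it when BPF(text=...) is created.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(counts, u32, u64);   /* pid -> number of clone() calls */

int count_clone(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    counts.increment(pid);
    return 0;
}
"""


def trace_clone(seconds=5):
    """Attach the hook, wait, then print per-process counts (needs root)."""
    from bcc import BPF  # imported here so the module loads without bcc

    b = BPF(text=BPF_PROGRAM)
    # get_syscall_fnname resolves the arch-specific symbol for the syscall.
    b.attach_kprobe(event=b.get_syscall_fnname("clone"),
                    fn_name="count_clone")
    time.sleep(seconds)
    rows = sorted(b["counts"].items(),
                  key=lambda kv: kv[1].value, reverse=True)
    for pid, count in rows:
        print("pid %d called clone %d times" % (pid.value, count.value))
```

Calling trace_clone(5) as root in an environment where bcc is installed should print one line per process that called clone during those five seconds.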

2. Building and installing

The INSTALL.md file on GitHub describes how to build and install bcc. My environment is Ubuntu 18.04.1:

Linux xxx 5.4.0-107-generic #121~18.04.1-Ubuntu SMP Thu Mar 24 17:21:33 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

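The steps below use this environment as an example of building and installing from source.

2.1 Installing dependencies

First install the packages needed to build and run bcc. You may hit some errors along the way; search online for the specific message and resolve them as they come up: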

# For Bionic (18.04 LTS)
sudo apt-get -y install bison build-essential cmake flex git libedit-dev \
  libllvm6.0 llvm-6.0-dev libclang-6.0-dev python zlib1g-dev libelf-dev libfl-dev python3-distutils

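2.2 Downloading the source and building bcc

Next, download the source. It also has some submodules, which are fetched when cmake runs:

git clone https://github.com/iovisor/bcc.git
# At the time of writing, the latest code failed to build in my environment, so check out this tag
git -C bcc checkout v0.24.0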

Then build bcc:

mkdir bcc/build; cd bcc/build
cmake ..
make
sudo make install

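When cmake runs it downloads the submodules from GitHub. Depending on your network, the download may occasionally fail. If it does, delete the partial download and fetch it again; otherwise the build may fail with errors such as missing files.

The build produces the following output, which includes bcc's header files, libraries, and the bundled examples and tools: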

$ tree
.
├── include
│   └── bcc
│       ├── bcc_common.h
│       ├── bcc_elf.h
│       ├── bcc_exception.h
│       ├── bcc_proc.h
│       ├── bcc_syms.h
│       ├── bcc_usdt.h
│       ├── bcc_version.h
│       ├── BPF.h
│       ├── bpf_module.h
│       ├── BPFTable.h
│       ├── compat
│       │   └── linux
│       │       ├── bpf_common.h
│       │       ├── bpf.h
│       │       ├── btf.h
│       │       ├── if_link.h
│       │       ├── if_xdp.h
│       │       ├── netlink.h
│       │       ├── perf_event.h
│       │       ├── pkt_cls.h
│       │       └── pkt_sched.h
│       ├── file_desc.h
│       ├── libbpf.h
│       ├── perf_reader.h
│       ├── table_desc.h
│       └── table_storage.h
├── lib
│   └── x86_64-linux-gnu
│       ├── libbcc.a
│       ├── libbcc_bpf.a
│       ├── libbcc_bpf.so -> libbcc_bpf.so.0
│       ├── libbcc_bpf.so.0 -> libbcc_bpf.so.0.24.0
│       ├── libbcc_bpf.so.0.24.0
│       ├── libbcc-loader-static.a
│       ├── libbcc.so -> libbcc.so.0
│       ├── libbcc.so.0 -> libbcc.so.0.24.0
│       ├── libbcc.so.0.24.0
│       └── pkgconfig
│           └── libbcc.pc
└── share
    └── bcc
        ├── examples
        │   ├── hello_world.py
        │   ├── lua
        │   │   ├── bashreadline.c
        │   │   ├── ......
        │   ├── networking
        │   │   ├── distributed_bridge
        │   │   │   ├── main.py
        │   │   │   ├── simulation.py -> ../simulation.py
        │   │   │   ├── ......
        │   └── tracing
        │       ├── biolatpcts_example.txt
        │       ├── biolatpcts.py
        │       ├── ......
        ├── introspection
        │   └── bps
        ├── man
        │   └── man8
        │       ├── argdist.8.gz
        │       ├── bashreadline.8.gz
        │       ├── ......
        └── tools
            ├── argdist
            ├── bashreadline
            ├── ......

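You can keep these artifacts. To use bcc on a similar platform, copy them over directly instead of rebuilding.

2.3 Building and installing the Python module

Still in the build directory created earlier, run: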

cmake -DPYTHON_CMD=python3 .. # build python3 binding
pushd src/python/
make
sudo make install
popd

The output looks like this; the module is installed into the directory where Python modules live:

Built target bcc_py_python3
Install the project...
-- Install configuration: "Release"
running install
running build
running build_py
running install_lib
copying build/lib/bcc/containers.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/tcp.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/__init__.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/syscall.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/perf.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/usdt.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/libbcc.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/version.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/table.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/utils.py -> /usr/lib/python3/dist-packages/bcc
copying build/lib/bcc/disassembler.py -> /usr/lib/python3/dist-packages/bcc
byte-compiling /usr/lib/python3/dist-packages/bcc/containers.py to containers.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/tcp.py to tcp.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/__init__.py to __init__.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/syscall.py to syscall.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/perf.py to perf.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/usdt.py to usdt.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/libbcc.py to libbcc.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/version.py to version.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/table.py to table.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/utils.py to utils.cpython-38.pyc
byte-compiling /usr/lib/python3/dist-packages/bcc/disassembler.py to disassembler.cpython-38.pyc
running install_egg_info
Removing /usr/lib/python3/dist-packages/bcc-0.24.0_8f40d6f5.egg-info
Writing /usr/lib/python3/dist-packages/bcc-0.24.0_8f40d6f5.egg-info

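The build and installation are now complete.

2.4 Kernel dependencies

Ubuntu kernels generally have the required options enabled already. If you build your own kernel, remember to enable the following options (quoted from INSTALL.md):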

In general, to use these features, a Linux kernel version 4.1 or newer is required. In addition, the kernel should have been compiled with the following flags set:

CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# [optional, for tc filters]
CONFIG_NET_CLS_BPF=m
# [optional, for tc actions]
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
# [for Linux kernel versions 4.1 through 4.6]
CONFIG_HAVE_BPF_JIT=y
# [for Linux kernel versions 4.7 and later]
CONFIG_HAVE_EBPF_JIT=y
# [optional, for kprobes]
CONFIG_BPF_EVENTS=y
# Need kernel headers through /sys/kernel/kheaders.tar.xz
CONFIG_IKHEADERS=y

There are a few optional kernel flags needed for running bcc networking examples on vanilla kernel:

CONFIG_NET_SCH_SFQ=m
CONFIG_NET_ACT_POLICE=m
CONFIG_NET_ACT_GACT=m
CONFIG_DUMMY=m
CONFIG_VXLAN=m

Kernel compile flags can usually be checked by looking at /proc/config.gz or /boot/config-<kernel-version>.
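
As a convenience, the check described above can be scripted. The sketch below is an illustration, not part of bcc: the function names and the REQUIRED list (a subset of the flags quoted above) are my own choices. It parses a kernel config and reports which of the listed options are not set to y or m.

```python
import gzip
import os
import platform

# Subset of the flags listed above; extend as needed (illustrative choice).
REQUIRED = ["CONFIG_BPF", "CONFIG_BPF_SYSCALL",
            "CONFIG_BPF_JIT", "CONFIG_BPF_EVENTS"]


def parse_config(text):
    """Map option name -> value for every 'CONFIG_X=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as '# CONFIG_X is not set' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_options(text, required=REQUIRED):
    """Return the required options that are neither built in nor modules."""
    options = parse_config(text)
    return [name for name in required
            if options.get(name) not in ("y", "m")]


def read_kernel_config():
    """Read the running kernel's config from the two usual locations."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return f.read()
    with open("/boot/config-" + platform.release()) as f:
        return f.read()
```

On a real system, print(missing_options(read_kernel_config())) lists whatever is missing; an empty list means all the checked options are enabled.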

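3. Usage examples

Here I pick a few built-in tools that are likely to be useful. Everyone needs different tools, so if you do not want to miss one that might help you, skim the short descriptions on GitHub.

3.1 Memory leaks

Memory leak detection uses memleak.py. From the moment it starts, it tracks memory allocations and frees, and at a regular interval it prints the memory that has been allocated but not yet freed, including the allocating call stack, the total size, and the number of allocations.
Compared with common leak detectors such as valgrind and asan, it may have the following advantages:

  • No need to rebuild or restart the software; once bcc is installed you can run it against a live process.
  • It is fairly flexible and has a number of built-in options. For example, if CPU is tight you can set a sampling rate to reduce overhead. You can also modify the script directly for additional needs.
  • Its overhead should be much lower than valgrind's; I have not compared it with asan.

For usage and examples see memleak_example.txt. Here is a sample of the output: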

# ./memleak -p $(pidof allocs) -a
Attaching to pid 5193, Ctrl+C to quit.
[11:16:33] Top 2 stacks with outstanding allocations:
        addr = 948cd0 size = 16
        addr = 948d10 size = 16
        addr = 948d30 size = 16
        addr = 948cf0 size = 16
        64 bytes in 4 allocations from stack
                 main+0x6d [allocs]
                 __libc_start_main+0xf0 [libc-2.21.so]

[11:16:34] Top 2 stacks with outstanding allocations:
        addr = 948d50 size = 16
        addr = 948cd0 size = 16
        addr = 948d10 size = 16
        addr = 948d30 size = 16
        addr = 948cf0 size = 16
        addr = 948dd0 size = 16
        addr = 948d90 size = 16
        addr = 948db0 size = 16
        addr = 948d70 size = 16
        addr = 948df0 size = 16
        160 bytes in 10 allocations from stack
                 main+0x6d [allocs]
                 __libc_start_main+0xf0 [libc-2.21.so]

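Note that the script reports memory that has been allocated but not yet freed. This is evidence to weigh when judging whether there is a leak, not proof that the memory has leaked.

The script can track both user-space and kernel memory. For user-space memory, it attaches uprobes to allocation functions such as malloc, calloc, and posix_memalign, and runs the accounting functions defined in the script from those probes.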
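
The bookkeeping behind this kind of report can be modeled in a few lines of plain Python. This is a simplified sketch of the idea, not memleak.py's actual code: the class and method names are hypothetical, and the real tool does this accounting in eBPF maps fed by the uprobes.

```python
from collections import defaultdict


class AllocationTracker:
    """Toy model: remember live allocations, forget them when freed."""

    def __init__(self):
        self.live = {}  # address -> (size, call stack)

    def on_alloc(self, addr, size, stack):
        # Called where the real tool would see malloc/calloc/... return.
        self.live[addr] = (size, tuple(stack))

    def on_free(self, addr):
        # Called where the real tool would see free(); unknown addresses
        # (allocated before tracing started) are ignored.
        self.live.pop(addr, None)

    def outstanding(self):
        """Return (stack, total_bytes, count) rows, largest total first."""
        totals = defaultdict(lambda: [0, 0])
        for size, stack in self.live.values():
            totals[stack][0] += size
            totals[stack][1] += 1
        rows = [(stack, nbytes, count)
                for stack, (nbytes, count) in totals.items()]
        return sorted(rows, key=lambda row: row[1], reverse=True)


tracker = AllocationTracker()
leaky_stack = ["main+0x6d", "__libc_start_main+0xf0"]
for addr in (0x948cd0, 0x948d10, 0x948d30, 0x948cf0):
    tracker.on_alloc(addr, 16, leaky_stack)
tracker.on_alloc(0x1000, 32, ["helper+0x10"])
tracker.on_free(0x1000)  # freed, so it does not show up in the report

for stack, nbytes, count in tracker.outstanding():
    print("%d bytes in %d allocations from stack %s"
          % (nbytes, count, " <- ".join(stack)))
```

The model also shows why the report is only a hint: an allocation that is simply still in use looks exactly like a leaked one until it is freed.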

You may also hit an error about bpf_probe_read_user. Changing that call in the script to bpf_probe_read may resolve it.

3.2 Cache hit rate statistics

cachetop.py reports the cache hit rate of each process:

# ./cachetop 5
13:01:01 Buffers MB: 76 / Cached MB: 114 / Sort: HITS / Order: ascending
PID      UID      CMD              HITS     MISSES   DIRTIES  READ_HIT%  WRITE_HIT%
       1 root     systemd                 2        0        0     100.0%       0.0%
     680 root     vminfo                  3        4        2      14.3%      42.9%
     567 syslog   rs:main Q:Reg          10        4        2      57.1%      21.4%
     986 root     kworker/u2:2           10     2457        4       0.2%      99.5%
     988 root     kworker/u2:2           10        9        4      31.6%      36.8%
     877 vagrant  systemd                18        4        2      72.7%      13.6%
     983 root     python                148        3      143       3.3%       1.3%
     981 root     strace                419        3      143      65.4%       0.5%
     544 messageb dbus-daemon           455      371      454       0.1%       0.4%
     243 root     jbd2/dm-0-8           457      371      454       0.4%       0.4%
     985 root     (mount)               560     2457        4      18.4%      81.4%
     987 root     systemd-udevd         566        9        4      97.7%       1.2%
     988 root     systemd-cgroups       569        9        4      97.8%       1.2%
     986 root     modprobe              578        9        4      97.8%       1.2%
     287 root     systemd-journal       598      371      454      14.9%       0.3%
     985 root     mount                 692     2457        4      21.8%      78.0%
     984 vagrant  find                 9529     2457        4      79.5%      20.5%

It works by attaching kprobes to several cache-related kernel functions and counting how often they are called:

    b.attach_kprobe(event="add_to_page_cache_lru", fn_name="do_count")
    b.attach_kprobe(event="mark_page_accessed", fn_name="do_count")
    b.attach_kprobe(event="mark_buffer_dirty", fn_name="do_count")

    # Function account_page_dirtied() is changed to folio_account_dirtied() in 5.15.
    if BPF.get_kprobe_functions(b'folio_account_dirtied'):
        b.attach_kprobe(event="folio_account_dirtied", fn_name="do_count")
    elif BPF.get_kprobe_functions(b'account_page_dirtied'):
        b.attach_kprobe(event="account_page_dirtied", fn_name="do_count")
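
Once the four counters are collected, a hit ratio has to be derived from them. The sketch below shows one plausible way to do that arithmetic. It is an assumption for illustration, not necessarily the formula cachetop uses for its READ_HIT% and WRITE_HIT% columns; the function and parameter names are hypothetical.

```python
def cache_hit_ratio(mpa, mbd, apcl, apd):
    """Estimate page cache hits from four probe counters.

    mpa  : calls to mark_page_accessed    (page cache accesses)
    mbd  : calls to mark_buffer_dirty     (buffer writes)
    apcl : calls to add_to_page_cache_lru (pages added, i.e. misses)
    apd  : calls to account_page_dirtied  (pages dirtied)
    """
    total = max(mpa - mbd, 0)     # accesses, not counting dirties
    misses = max(apcl - apd, 0)   # additions caused by read misses
    hits = total - misses
    if hits < 0:
        # Misses were overestimated; clamp so the numbers stay consistent.
        misses = total
        hits = 0
    ratio = hits / total if total > 0 else 0.0
    return hits, misses, ratio


hits, misses, ratio = cache_hit_ratio(mpa=110, mbd=10, apcl=30, apd=5)
print("hits=%d misses=%d ratio=%.1f%%" % (hits, misses, 100 * ratio))
```

Because the counters come from different functions that are not perfectly paired, the result is an estimate, which is why the clamping steps are needed.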

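3.3 Deadlock detection

Deadlock detection uses the deadlock.py script. It works on a graph of lock relationships: each lock is a node, and when a thread acquires lock A and then lock B, in that order, the graph gains an edge A->B. A cycle in the graph indicates a potential deadlock.

Example: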

# ./deadlock.py 181
Tracing... Hit Ctrl-C to end.
----------------
Potential Deadlock Detected!

Cycle in lock order graph: Mutex M0 (main::static_mutex3 0x0000000000473c60) => Mutex M1 (0x00007fff6d738400) => Mutex M2 (global_mutex1 0x0000000000473be0) => Mutex M3 (global_mutex2 0x0000000000473c20) => Mutex M0 (main::static_mutex3 0x0000000000473c60)

Mutex M1 (0x00007fff6d738400) acquired here while holding Mutex M0 (main::static_mutex3 0x0000000000473c60) in Thread 357250 (lockinversion):
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402e38 main::{lambda()#3}::operator()() const
@ 0000000000406ba8 void std::_Bind_simple<main::{lambda()#3} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406951 std::_Bind_simple<main::{lambda()#3} ()>::operator()()
@ 000000000040673a std::thread::_Impl<std::_Bind_simple<main::{lambda()#3} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M0 (main::static_mutex3 0x0000000000473c60) previously acquired by the same Thread 357250 (lockinversion) here:
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402e22 main::{lambda()#3}::operator()() const
@ 0000000000406ba8 void std::_Bind_simple<main::{lambda()#3} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406951 std::_Bind_simple<main::{lambda()#3} ()>::operator()()
@ 000000000040673a std::thread::_Impl<std::_Bind_simple<main::{lambda()#3} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M2 (global_mutex1 0x0000000000473be0) acquired here while holding Mutex M1 (0x00007fff6d738400) in Thread 357251 (lockinversion):
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402ea8 main::{lambda()#4}::operator()() const
@ 0000000000406b46 void std::_Bind_simple<main::{lambda()#4} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 000000000040692d std::_Bind_simple<main::{lambda()#4} ()>::operator()()
@ 000000000040671c std::thread::_Impl<std::_Bind_simple<main::{lambda()#4} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M1 (0x00007fff6d738400) previously acquired by the same Thread 357251 (lockinversion) here:
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402e97 main::{lambda()#4}::operator()() const
@ 0000000000406b46 void std::_Bind_simple<main::{lambda()#4} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 000000000040692d std::_Bind_simple<main::{lambda()#4} ()>::operator()()
@ 000000000040671c std::thread::_Impl<std::_Bind_simple<main::{lambda()#4} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M3 (global_mutex2 0x0000000000473c20) acquired here while holding Mutex M2 (global_mutex1 0x0000000000473be0) in Thread 357247 (lockinversion):
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402d5f main::{lambda()#1}::operator()() const
@ 0000000000406c6c void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406999 std::_Bind_simple<main::{lambda()#1} ()>::operator()()
@ 0000000000406776 std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M2 (global_mutex1 0x0000000000473be0) previously acquired by the same Thread 357247 (lockinversion) here:
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402d4e main::{lambda()#1}::operator()() const
@ 0000000000406c6c void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406999 std::_Bind_simple<main::{lambda()#1} ()>::operator()()
@ 0000000000406776 std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M0 (main::static_mutex3 0x0000000000473c60) acquired here while holding Mutex M3 (global_mutex2 0x0000000000473c20) in Thread 357248 (lockinversion):
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402dc9 main::{lambda()#2}::operator()() const
@ 0000000000406c0a void std::_Bind_simple<main::{lambda()#2} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406975 std::_Bind_simple<main::{lambda()#2} ()>::operator()()
@ 0000000000406758 std::thread::_Impl<std::_Bind_simple<main::{lambda()#2} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Mutex M3 (global_mutex2 0x0000000000473c20) previously acquired by the same Thread 357248 (lockinversion) here:
@ 00000000004024d0 pthread_mutex_lock
@ 0000000000406dd0 std::mutex::lock()
@ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
@ 0000000000402db8 main::{lambda()#2}::operator()() const
@ 0000000000406c0a void std::_Bind_simple<main::{lambda()#2} ()>::_M_invoke<>(std::_Index_tuple<>)
@ 0000000000406975 std::_Bind_simple<main::{lambda()#2} ()>::operator()()
@ 0000000000406758 std::thread::_Impl<std::_Bind_simple<main::{lambda()#2} ()> >::_M_run()
@ 00007fd4496564e1 execute_native_thread_routine
@ 00007fd449dd57f1 start_thread
@ 00007fd44909746d __clone

Thread 357248 created by Thread 350692 (lockinversion) here:
@ 00007fd449097431 __clone
@ 00007fd449dd5ef5 pthread_create
@ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>)
@ 00000000004033ac std::thread::thread<main::{lambda()#2}>(main::{lambda()#2}&&)
@ 000000000040308f main
@ 00007fd448faa0f6 __libc_start_main
@ 0000000000402ad8 [unknown]

Thread 357250 created by Thread 350692 (lockinversion) here:
@ 00007fd449097431 __clone
@ 00007fd449dd5ef5 pthread_create
@ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>)
@ 00000000004034b2 std::thread::thread<main::{lambda()#3}>(main::{lambda()#3}&&)
@ 00000000004030b9 main
@ 00007fd448faa0f6 __libc_start_main
@ 0000000000402ad8 [unknown]

Thread 357251 created by Thread 350692 (lockinversion) here:
@ 00007fd449097431 __clone
@ 00007fd449dd5ef5 pthread_create
@ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>)
@ 00000000004035b8 std::thread::thread<main::{lambda()#4}>(main::{lambda()#4}&&)
@ 00000000004030e6 main
@ 00007fd448faa0f6 __libc_start_main
@ 0000000000402ad8 [unknown]

Thread 357247 created by Thread 350692 (lockinversion) here:
@ 00007fd449097431 __clone
@ 00007fd449dd5ef5 pthread_create
@ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>)
@ 00000000004032a6 std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&)
@ 0000000000403070 main
@ 00007fd448faa0f6 __libc_start_main
@ 0000000000402ad8 [unknown]

This is output from a process that has a potential deadlock involving 4 mutexes
and 4 threads:

- Thread 357250 acquired M1 while holding M0 (edge M0 -> M1)
- Thread 357251 acquired M2 while holding M1 (edge M1 -> M2)
- Thread 357247 acquired M3 while holding M2 (edge M2 -> M3)
- Thread 357248 acquired M0 while holding M3 (edge M3 -> M0)

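This deadlock involves four threads and would be fairly complicated to analyze without a tool, but the relationships printed at the end make it clear. The stack traces printed earlier also make it easy to find the code.

However, the tool must be running and attached before the deadlock occurs; starting it after the deadlock has happened is useless. And because it has to build the graph and check for cycles, the overhead can be significant when there are many locks and threads.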
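
The lock-order graph described in this section can be sketched in plain Python. This is a simplified illustration of the technique (a depth-first search for a cycle), not the code in deadlock.py; the function names are hypothetical, and the edges are the four reported in the example output above.

```python
from collections import defaultdict


def build_lock_graph(acquisitions):
    """acquisitions: (held_lock, newly_acquired_lock) pairs, one per edge."""
    graph = defaultdict(set)
    for held, acquired in acquisitions:
        graph[held].add(acquired)
    return graph


def find_cycle(graph):
    """Return one cycle as a list of locks, or None if the graph has none."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = defaultdict(int)
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color[nxt] == GRAY:
                # Back edge: nxt is already on the current path.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


edges = [("M0", "M1"),   # thread 357250
         ("M1", "M2"),   # thread 357251
         ("M2", "M3"),   # thread 357247
         ("M3", "M0")]   # thread 357248
cycle = find_cycle(build_lock_graph(edges))
print(" => ".join(cycle) if cycle else "no cycle")
```

A cycle only means the acquisition orders are inconsistent; whether the threads ever actually block each other depends on timing, which is why the tool reports a "potential" deadlock.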

3.4 bio

There are several tools related to block I/O (bio):

  • biolatency: shows the latency distribution of bios
# ./biolatency
Tracing block device I/O... Hit Ctrl-C to end.
^C
     usecs           : count     distribution
       0 -> 1        : 0        |                                      |
       2 -> 3        : 0        |                                      |
       4 -> 7        : 0        |                                      |
       8 -> 15       : 0        |                                      |
      16 -> 31       : 0        |                                      |
      32 -> 63       : 0        |                                      |
      64 -> 127      : 1        |                                      |
     128 -> 255      : 12       |********                              |
     256 -> 511      : 15       |**********                            |
     512 -> 1023     : 43       |*******************************       |
    1024 -> 2047     : 52       |**************************************|
    2048 -> 4095     : 47       |**********************************    |
    4096 -> 8191     : 52       |**************************************|
    8192 -> 16383    : 36       |**************************            |
   16384 -> 32767    : 15       |**********                            |
   32768 -> 65535    : 2        |*                                     |
   65536 -> 131071   : 2        |*                                     |
  • biotop: shows the amount of bio data per process
# ./biotop
Tracing... Output every 1 secs. Hit Ctrl-C to end

08:04:11 loadavg: 1.48 0.87 0.45 1/287 14547

PID    COMM             D MAJ MIN DISK       I/O  Kbytes  AVGms
14501  cksum            R 202 1   xvda1      361   28832   3.39
6961   dd               R 202 1   xvda1     1628   13024   0.59
13855  dd               R 202 1   xvda1     1627   13016   0.59
326    jbd2/xvda1-8     W 202 1   xvda1        3     168   3.00
1880   supervise        W 202 1   xvda1        2       8   6.71
1873   supervise        W 202 1   xvda1        2       8   2.51
1871   supervise        W 202 1   xvda1        2       8   1.57
1876   supervise        W 202 1   xvda1        2       8   1.22
1892   supervise        W 202 1   xvda1        2       8   0.62
1878   supervise        W 202 1   xvda1        2       8   0.78
1886   supervise        W 202 1   xvda1        2       8   1.30
1894   supervise        W 202 1   xvda1        2       8   3.46
1869   supervise        W 202 1   xvda1        2       8   0.73
1888   supervise        W 202 1   xvda1        2       8   1.48
  • biopattern: reports the ratio of random to sequential I/O; presumably it works by checking whether consecutive I/Os are contiguous
# ./biopattern.py
TIME      DISK     %RND  %SEQ    COUNT     KBYTES
22:03:51  vdb         0    99      788       3184
22:03:51  Unknown     0   100        4          0
22:03:51  vda        85    14       21        488
[...]
  • biosnoop: reports the process, size, latency, and other details of each bio
# ./biosnoop
TIME(s)     COMM           PID    DISK    T SECTOR     BYTES   LAT(ms)
0.000004    supervise      1950   xvda1   W 13092560   4096       0.74
0.000178    supervise      1950   xvda1   W 13092432   4096       0.61
0.001469    supervise      1956   xvda1   W 13092440   4096       1.24
0.001588    supervise      1956   xvda1   W 13115128   4096       1.09
1.022346    supervise      1950   xvda1   W 13115272   4096       0.98
1.022568    supervise      1950   xvda1   W 13188496   4096       0.93
1.023534    supervise      1956   xvda1   W 13188520   4096       0.79
1.023585    supervise      1956   xvda1   W 13189512   4096       0.60
2.003920    xfsaild/md0    456    xvdc    W 62901512   8192       0.23
2.003931    xfsaild/md0    456    xvdb    W 62901513   512        0.25
2.004034    xfsaild/md0    456    xvdb    W 62901520   8192       0.35
2.004042    xfsaild/md0    456    xvdb    W 63542016   4096       0.36
2.004204    kworker/0:3    26040  xvdb    W 41950344   65536      0.34
2.044352    supervise      1950   xvda1   W 13192672   4096       0.65
2.044574    supervise      1950   xvda1   W 13189072   4096       0.58
  • bitesize: shows the distribution of bio sizes per process
# ./bitesize.py
Tracing... Hit Ctrl-C to end.
^C

Process Name = 'kworker/u128:1'
    Kbytes              : count     distribution
        0 -> 1          : 1        |********************                    |
        2 -> 3          : 0        |                                        |
        4 -> 7          : 2        |****************************************|

Process Name = 'bitesize.py'
    Kbytes              : count     distribution
        0 -> 1          : 0        |                                        |
        2 -> 3          : 0        |                                        |
        4 -> 7          : 0        |                                        |
        8 -> 15         : 0        |                                        |
       16 -> 31         : 0        |                                        |
       32 -> 63         : 0        |                                        |
       64 -> 127        : 0        |                                        |
      128 -> 255        : 1        |****************************************|

Process Name = 'dd'
    Kbytes              : count     distribution
        0 -> 1          : 3        |                                        |
        2 -> 3          : 0        |                                        |
        4 -> 7          : 6        |                                        |
        8 -> 15         : 0        |                                        |
       16 -> 31         : 1        |                                        |
       32 -> 63         : 1        |                                        |
       64 -> 127        : 0        |                                        |
      128 -> 255        : 0        |                                        |
      256 -> 511        : 1        |                                        |
      512 -> 1023       : 0        |                                        |
     1024 -> 2047       : 488      |****************************************|
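
The guess made above about biopattern (that it compares each I/O with the one before it) can be sketched as follows. This is an assumption-based model, not the tool's source: the function names are hypothetical, and the input is a made-up list of (start sector, sector count) pairs for a single disk.

```python
def classify_io(requests):
    """Count random vs sequential requests for one disk.

    requests: iterable of (sector, nr_sectors) in completion order.
    A request is sequential if it starts where the previous one ended.
    The first request has no predecessor and is not classified.
    """
    random_count = 0
    sequential_count = 0
    next_expected = None
    for sector, nr_sectors in requests:
        if next_expected is not None:
            if sector == next_expected:
                sequential_count += 1
            else:
                random_count += 1
        next_expected = sector + nr_sectors
    return random_count, sequential_count


def percentages(random_count, sequential_count):
    """Integer percentages, like the %RND and %SEQ columns."""
    total = random_count + sequential_count
    if total == 0:
        return 0, 0
    return 100 * random_count // total, 100 * sequential_count // total


# Three contiguous 8-sector reads, one jump, then one more contiguous read.
sample = [(0, 8), (8, 8), (16, 8), (100, 8), (108, 8)]
rnd, seq = classify_io(sample)
print("%%RND=%d %%SEQ=%d" % percentages(rnd, seq))
```

Integer division also explains why the two percentage columns in the sample output above do not always add up to exactly 100.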

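3.5 bio flame graph

This is not a built-in bcc tool; see https://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.html#IO

The script collects its data by attaching kprobes to two functions, blk_account_io_start and blk_account_io_completion, which mark the start and the end of a block I/O respectively. These function names may change between kernel versions, so the script may need some modification. For example, on the kernel I use the completion function is named blk_account_io_done.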
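
One way to cope with the renamed function is to pick whichever candidate name exists on the running kernel before attaching the probe. The sketch below is an illustration under assumptions: the helper names are hypothetical, and it reads /proc/kallsyms, which lists kernel symbols but does not guarantee that a symbol can be probed. Inside a bcc script, BPF.get_kprobe_functions (used by cachetop above) is the more direct check.

```python
def available_kernel_functions(path="/proc/kallsyms"):
    """Return the set of symbol names listed in kallsyms."""
    names = set()
    with open(path) as f:
        for line in f:
            parts = line.split()
            # Each line is: address, type, name, optionally [module].
            if len(parts) >= 3:
                names.add(parts[2])
    return names


def pick_probe(candidates, available):
    """Return the first candidate present in 'available', else None."""
    for name in candidates:
        if name in available:
            return name
    return None


# Candidate names for the end of a block I/O, in order of preference.
COMPLETION_CANDIDATES = ["blk_account_io_completion", "blk_account_io_done"]

# Example with a hand-written symbol set standing in for a real kernel.
example_symbols = {"blk_account_io_start", "blk_account_io_done"}
print(pick_probe(COMPLETION_CANDIDATES, example_symbols))
```

On a real system, pick_probe(COMPLETION_CANDIDATES, available_kernel_functions()) gives the name to pass to attach_kprobe, or None if neither exists and the script needs a different hook.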

See the link above for usage. The result looks like this:
[Figure: bio flame graph]
