一款分析linux上应用程序分析的工具----0x.tools

0x.tools是一个轻量级的开源工具,用于分析Linux应用程序性能,无需升级系统或安装额外组件。通过简单的`git clone`、`make`和`make install`即可部署。示例展示了如何使用`xcapture`进行进程状态采样和写入CSV,以及如何使用`psn`进行系统调用和等待通道的详细查看。这些工具帮助用户深入理解进程的运行状态,尤其是对于数据库如MySQL的性能瓶颈定位。
摘要由CSDN通过智能技术生成

0x.tools是一个用于分析Linux上应用程序性能的开源实用程序。部署简单且依赖性小。无需升级操作系统、安装内核模块、繁重的监控框架、Java代理或数据库。这款工具还适用于十多年前的Linux内核,如14年前的2.6.18版本。

包含的工具:
在这里插入图片描述

安装和使用:

$ git clone https://github.com/tanelpoder/0xtools
$ make
$ make install

具体使用示例请参考:
https://0x.tools/#installation–usage

$ xcapture

0xTools xcapture v1.0 by Tanel Poder [https://0x.tools]

Sampling /proc...

DATE       TIME             PID     TID USERNAME        ST COMMAND                   SYSCALL                   WCHAN                    
2020-10-17 12:01:50.583    6404    7524 mysql           R  (mysqld)                  fsync                     wait_on_page_bit          
2020-10-17 12:01:50.583    6404    8944 mysql           D  (mysqld)                  fsync                     wait_on_page_bit          
2020-10-17 12:01:50.583    6404    8946 mysql           D  (mysqld)                  fsync                     wait_on_page_bit          
2020-10-17 12:01:50.583    6404   76046 mysql           D  (mysqld)                  fsync                     wait_on_page_bit          
2020-10-17 12:01:50.583    6404   76811 mysql           D  (mysqld)                  fdatasync                 xfs_log_force_lsn         
2020-10-17 12:01:50.583    6404   76815 mysql           D  (mysqld)                  fsync                     blkdev_issue_flush        
2020-10-17 12:01:50.583    8803    8803 root            R  (md10_resync)             [running]                 0   

采样所有线程状态并写到csv文件

$ xcapture -a -o /data/xcap &

$ head 2020-10-16.21.csv
TS,PID,TID,USERNAME,ST,COMMAND,SYSCALL,WCHAN,EXE,CMDLINE,KSTACK
2020-10-16 21:00:00.001,5335,5335,root,R,(collectl),[running],0,perl,/usr/bin/perl,
2020-10-16 21:00:00.001,8803,8803,root,D,(md10_resync),[no_syscall],msleep,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()->msleep()
2020-10-16 21:00:01.038,8803,8803,root,R,(md10_resync),[no_syscall],md_do_sync,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()
2020-10-16 21:00:02.075,8803,8803,root,D,(md10_resync),[no_syscall],md_do_sync,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()
2020-10-16 21:00:02.075,16762,16762,oracle,R,(ora_m000_lin19c),[running],0,oracle,ora_m000_LIN19C,->do_blockdev_direct_IO()->dio_complete()
2020-10-16 21:00:03.112,8803,8803,root,R,(md10_resync),[no_syscall],md_do_sync,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()
2020-10-16 21:00:04.149,8803,8803,root,D,(md10_resync),[no_syscall],msleep,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()->msleep()
2020-10-16 21:00:05.186,8803,8803,root,D,(md10_resync),[no_syscall],md_do_sync,-,-,->ret_from_fork_nospec_begin()->kthread()->md_thread()->md_do_sync()
2020-10-16 21:00:05.186,65913,65913,oracle,D,(ora_ckpt_lin122),pwrite64,blkdev_issue_flush,oracle,ora_ckpt_LIN122,->system_call_fastpath()->SyS_pwrite64()->vfs_write()->do_sync_write()->xfs_file_aio_write()->generic_write_sync()->xfs_file_fsync()->xfs_blkdev_issue_flush()->blkdev_issue_flush()

使用linux文本处理工具过滤查看

$ cat 2020-10-13.01.csv | awk -F, '{ printf("%2s %-20s %-20s %s\n",$5,$4,$7,$10) }' | sort | uniq -c | sort -nbr | head -20
   2303  D root                 read                 -
   1761  R tanel                [running]            stress
   1384  D postgres             pread64              postgres: tanel pgbench [local] UPDATE
    894  R root                 [running]            -
    229  R root                 read                 -
    229  D mysql                fsync                /usr/sbin/mysqld
    144  R tanel                [running]            -
    115  - -                    -                    -
    110  D oracle               io_submit            ora_ckpt_LINPRD
    101  D root                 [running]            -
     73  D root                 read                 dd
     58  R root                 [running]            /opt/oracle.ahf/jre/bin/java
     55  R mysql                [running]            /usr/sbin/mysqld
     52  D tanel                [no_syscall]         stress
     51  R oracle               [running]            oracleLIN19C
     50  R root                 [running]            dd
     35  R oracle               [running]            xe_mz01_XE
     32  R tanel                [running]            /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el7_8.x86_64/jre/bin/java
     29  R oracle               [running]            pidstat
     27  D oracle               pread64              oracleLIN19C

xcapture使用:

$ xcapture

0x.Tools xcapture v1.0 by Tanel Poder [https://0x.tools]

Usage:
  xcapture [options]

  By default, sample all /proc tasks in states R, D every second and print to stdout

  Options:
    -a             capture tasks in additional states, even the ones Sleeping (S)
    -A             capture tasks in All states, including Zombie (Z), Exiting (X), Idle (I)
    -c <c1,c2>     print additional columns (for example: -c exe,cmdline,kstack)
    -d <N>         seconds to sleep between samples (default: 1)
    -E <string>    custom task state Exclusion filter (default: XZIS)
    -h             display this help message
    -o <dirname>   write wide output into hourly CSV files in this directory instead of stdout


$ xcapture -c exe,kstack
$ xcapture -o .
$ xcapture -o /data/perf_archive/xcap

psn是使用python编写的脚本,可以查看当前相关状态

$ psn -p 18286 -G syscall,filename

Linux Process Snapper v0.14 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall for 5 seconds... finished.


=== Active Threads ==================================================================================

 samples | avg_threads | comm | state                  | syscall   | filename                        
-----------------------------------------------------------------------------------------------------
      79 |        0.79 | (dd) | Disk (Uninterruptible) | write     | /backup/tanel/test (stdout)
       7 |        0.07 | (dd) | Disk (Uninterruptible) | [running] |                                 
       5 |        0.05 | (dd) | Running (ON CPU)       | write     | /backup/tanel/test (stdout)     
       4 |        0.04 | (dd) | Disk (Uninterruptible) | read      | /reco/fio/mmapfile.0.0 (stdin)  
       3 |        0.03 | (dd) | Running (ON CPU)       | [running] |                                 
       2 |        0.02 | (dd) | Running (ON CPU)       | read      | /reco/fio/mmapfile.0.0 (stdin)  ```
$ sudo psn -p "mysqld|kwork" -G syscall,wchan

Linux Process Snapper v0.14 by Tanel Poder [https://0x.tools]
Sampling /proc/syscall, stat, wchan for 5 seconds... finished.


=== Active Threads ========================================================================================

 samples | avg_threads | comm          | state                  | syscall   | wchan                        
-----------------------------------------------------------------------------------------------------------
      25 |        3.12 | (mysqld)      | Disk (Uninterruptible) | fsync     | _xfs_log_force_lsn
      16 |        2.00 | (mysqld)      | Running (ON CPU)       | [running] | 0                            
      14 |        1.75 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | call_rwsem_down_write_failed
       8 |        1.00 | (mysqld)      | Disk (Uninterruptible) | fsync     | submit_bio_wait              
       4 |        0.50 | (mysqld)      | Disk (Uninterruptible) | pread64   | io_schedule                  
       4 |        0.50 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | io_schedule                  
       3 |        0.38 | (mysqld)      | Disk (Uninterruptible) | pread64   | 0                            
       3 |        0.38 | (mysqld)      | Running (ON CPU)       | [running] | io_schedule                  
       3 |        0.38 | (mysqld)      | Running (ON CPU)       | pread64   | 0                            
       2 |        0.25 | (mysqld)      | Disk (Uninterruptible) | [running] | 0                            
       1 |        0.12 | (kworker/*:*) | Running (ON CPU)       | read      | worker_thread                
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | fsync     | io_schedule                  
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | futex     | call_rwsem_down_write_failed 
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | poll      | 0                            
       1 |        0.12 | (mysqld)      | Disk (Uninterruptible) | pwrite64  | _xfs_log_force_lsn           
       1 |        0.12 | (mysqld)      | Running (ON CPU)       | fsync     | submit_bio_wait              
       1 |        0.12 | (mysqld)      | Running (ON CPU)       | futex     | futex_wait_queue_me     

更多相关示例可以参考Linux性能诊断

SchedLat

$ ./schedlat.py 29801
SchedLat by Tanel Poder [https://0x.tools]

PID=29801 COMM=oracle_29801_li

TIMESTAMP              %CPU   %LAT   %SLP
2020-02-26 23:17:35   100.0    0.0    0.0   <<-- no CPU shortage, process 100% on CPU
2020-02-26 23:17:36   100.0    0.0    0.0
2020-02-26 23:17:37   100.0    0.0    0.0
2020-02-26 23:17:38   100.0    0.0    0.0   <<-- %SLP = 100-(%CPU+%LAT), when Linux reports slightly
2020-02-26 23:17:39    98.0    0.0    2.0        more than "100%" of CPU+LAT, then the derived
2020-02-26 23:17:40     0.0    0.0  100.0        "remaining time" SLP% may show a negative value
2020-02-26 23:17:41     0.0    0.0  100.0
2020-02-26 23:17:42     0.0    0.0  100.0   <<-- no CPU shortage, process sleeping
2020-02-26 23:17:43     0.4    0.0   99.6
2020-02-26 23:17:44    33.5    0.2   66.3   <<-- no CPU shortage, process doing synchronous I/Os 
2020-02-26 23:17:45    55.5    0.2   44.2        in a loop (thus taken off CPU frequently by scheduler)
2020-02-26 23:17:46    53.9    0.2   45.9
2020-02-26 23:17:47    54.5    0.2   45.3
2020-02-26 23:17:48    59.1    0.2   40.7
2020-02-26 23:17:49     4.4    0.0   95.6
2020-02-26 23:17:50    58.5    0.1   41.4
2020-02-26 23:17:51    95.7    0.0    4.3
2020-02-26 23:17:52     0.3    0.0   99.7 
2020-02-26 23:17:53     0.1    0.0   99.9
2020-02-26 23:17:54     0.1    0.0   99.9
2020-02-26 23:17:55     0.3    1.1   98.6
2020-02-26 23:17:56     0.1    6.0   93.9
2020-02-26 23:17:57     0.1   15.0   84.9
2020-02-26 23:17:58     0.1   13.8   86.1
2020-02-26 23:17:59     9.6   61.4   29.0   <<-- CPU shortage + process doing synchronous I/Os in a loop
2020-02-26 23:18:00    14.6   83.9    1.5   <<-- and spending more time in CPU runqueue after every I/O
2020-02-26 23:18:01    31.4   59.7    8.9
2020-02-26 23:18:02    13.0   13.9   73.1
2020-02-26 23:18:03     0.3    5.3   94.4

更多示例请参考度量cpu调度延迟

CPU profiling
实际使用perf record命令生成相关数据

#cat run_xcpu.sh

FREQUENCY=1 # 1 Hz sampling
SUDO=sudo # change to empty string if running without sudo
NICE=-5
SLEEP=60
PERF=/usr/bin/perf

if [ $# -ne 1 ]; then
  echo "Usage: $0 output_dir"
  exit 1
fi

logger "$0 Starting up outdir=$1 nice=$NICE"

while true ; do
    $SUDO nice -n $NICE $PERF record -g -F $FREQUENCY -a \
                --switch-output=1m \
                --timestamp-filename \
                --timestamp \
                -o $1/xcpu
    
    # we only get here should perf be terminated, try to restart
    logger "$0 terminated with $?, attempting to restart in $SLEEP seconds"
    sleep $SLEEP
done

使用以上命令进行cpu栈跟踪,然后通过以下命令查看

$ perf report -s sym,dso -i xcpu.2020101619323791
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值