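There are several practical tools for monitoring disk I/O reads and writes. Here is a summary.
1: iotop
As the name suggests, this is top with io in front of it. It is easy to install straight from the package manager, and just as easy to run: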
~# iotop -o
Total DISK READ: 0.00 B/s | Total DISK WRITE: 664.62 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
8410 be/4 root 0.00 B/s 0.00 B/s 0.00 % 77.37 % dd if=/dev/zero of=/root/1Gb.file bs=1M count=1000
1998 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.15 % [flush-254:0]
2230 be/4 root 0.00 B/s 456.74 K/s 0.00 % 0.00 % rsyslogd -c5
The -o option lists only the processes that are actually doing I/O.
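For logging rather than watching, iotop can also run non-interactively with -b (batch mode) and -n (number of iterations). The helper below is a minimal sketch that filters such output; the name busy_io_tasks is made up for illustration, and it assumes the column layout shown above (no -t timestamp column).

```sh
#!/bin/sh
# busy_io_tasks THRESHOLD
# Hypothetical helper: reads iotop batch output on stdin and prints
# "TID IO% COMMAND" for every task whose IO> column exceeds THRESHOLD percent.
# Assumed columns: TID PRIO USER READ unit WRITE unit SWAPIN % IO % COMMAND...
busy_io_tasks() {
    awk -v thr="$1" '
        # Data rows start with a numeric TID; field 10 is the IO> percentage.
        $1 ~ /^[0-9]+$/ && $10 + 0 > thr + 0 { print $1, $10, $12 }
    '
}
```

A possible invocation (as root) would be: iotop -o -b -n 3 | busy_io_tasks 10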
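2: iostat
On Debian 7 this one is a little harder to find. apt-cache search turns it up, and the package to install should be this one:
sysstat - system performance tools for Linux
Unlike iotop, iostat reports I/O per device. In the run below, -d selects the device report, -m prints megabytes, and "1 5" means five reports at one-second intervals. The first report is the average since boot; the remaining ones cover the preceding second.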
~# iostat -d -m 1 5
Linux 3.2.0-4-amd64 (haitao-47) 11/07/2015 _x86_64_ (32 CPU)
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
vda 122.72 0.00 60.25 359 7841638
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
vda 359.00 0.00 178.71 0 178
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
vda 619.00 0.00 308.50 0 308
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
vda 473.00 0.00 236.50 0 236
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
vda 805.00 0.00 402.50 0 402
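Since the first report is a since-boot average, it is usually left out when summarizing a run. The sketch below averages the MB_wrtn/s column over the interval reports only; the function name avg_write_mbps is made up, and it assumes the column order shown above (device, tps, MB_read/s, MB_wrtn/s).

```sh
#!/bin/sh
# avg_write_mbps DEVICE
# Hypothetical helper: reads `iostat -d -m INTERVAL COUNT` output on stdin and
# prints the mean MB_wrtn/s for DEVICE, skipping the first (since-boot) report.
avg_write_mbps() {
    awk -v dev="$1" '
        $1 == dev {
            n++
            if (n > 1) sum += $4    # field 4 is MB_wrtn/s in this layout
        }
        END { if (n > 1) printf "%.2f\n", sum / (n - 1) }
    '
}
```

A possible invocation would be: iostat -d -m 1 5 | avg_write_mbps vda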
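3: iodump
This is a script written in Perl, and its output is a little unconventional. First prepare the system: clear the dmesg buffer, stop klogd, and turn on the block_dump switch. Once that switch is on, the kernel logs a message for every I/O operation, and the Perl script then analyzes those messages.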
#!/usr/bin/env perl

=pod

=head1 NAME

iodump - Compute per-PID I/O stats for Linux when iotop/pidstat/iopp are not available.

=head1 SYNOPSIS

Prepare the system:

  dmesg -c
  /etc/init.d/klogd stop
  echo 1 > /proc/sys/vm/block_dump

Start the reporting:

  while true; do sleep 1; dmesg -c; done | perl iodump
  CTRL-C

Stop the system from dumping these messages:

  echo 0 > /proc/sys/vm/block_dump
  /etc/init.d/klogd start

=head1 AUTHOR

Baron Schwartz, inspired by L

=head1 LICENSE

This software is released to the public domain, with no guarantees whatsoever.

=cut

use strict;
use warnings FATAL => 'all';
use English qw(-no_match_vars);
use sigtrap qw(handler finish untrapped normal-signals);

my %tasks;          # per-PID counters, keyed by PID
my $oktorun = 1;    # cleared by the signal handler to end the read loop
my $line;

# Read kernel log lines until EOF or until a signal asks us to stop.
while ( $oktorun && (defined ($line = <>)) ) {
   my ( $task, $pid, $activity, $where, $device );
   ( $task, $pid, $activity, $where, $device )
      = $line =~ m/(\S+)\((\d+)\): (READ|WRITE) block (\d+) on (\S+)/;
   if ( !$task ) {
      ( $task, $pid, $activity, $where, $device )
         = $line =~ m/(\S+)\((\d+)\): (dirtied) inode \(.*?\) (\d+) on (\S+)/;
   }
   if ( $task ) {
      my $s = $tasks{$pid} ||= { pid => $pid, task => $task };
      ++$s->{lc $activity};         # per-type counter: read, write or dirtied
      ++$s->{activity};             # total events for this PID
      ++$s->{devices}->{$device};   # devices this PID touched
   }
}

printf("%-15s %10s %10s %10s %10s %10s %s\n",
   qw(TASK PID TOTAL READ WRITE DIRTY DEVICES));

# Report the busiest tasks first.
foreach my $task (
   reverse sort { $a->{activity} <=> $b->{activity} } values %tasks
) {
   printf("%-15s %10d %10d %10d %10d %10d %s\n",
      $task->{task}, $task->{pid},
      ($task->{'activity'} || 0),
      ($task->{'read'}     || 0),
      ($task->{'write'}    || 0),
      ($task->{'dirtied'}  || 0),   # stored above under lc "dirtied", not "dirty"
      join(', ', keys %{$task->{devices}}));
}

# First signal: stop reading and print the report. Second signal: exit at once.
sub finish {
   my ( $signal ) = @_;
   if ( $oktorun ) {
      print STDERR "# Caught SIG$signal.\n";
      $oktorun = 0;
   }
   else {
      print STDERR "# Exiting on SIG$signal.\n";
      exit(1);
   }
}
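The heart of the Perl script is a per-PID tally of log lines that match its READ/WRITE pattern. As a rough illustration of that idea only (not a replacement for the script), here is the same tally in awk. The function name count_block_io is made up, the "dirtied inode" lines are ignored, and the input format is simply whatever the Perl regex above accepts.

```sh
#!/bin/sh
# count_block_io
# Hypothetical helper: reads kernel log lines on stdin and prints
# "TASK PID TOTAL READ WRITE" per PID, busiest first.
# Matches lines shaped like: dd(8410): WRITE block 123456 on vda1
count_block_io() {
    awk '
        match($0, /[^ ]+\([0-9]+\): (READ|WRITE) block [0-9]+ on [^ ]+/) {
            split(substr($0, RSTART, RLENGTH), f, " ")
            p    = index(f[1], "(")                    # f[1] looks like "dd(8410):"
            name = substr(f[1], 1, p - 1)
            pid  = substr(f[1], p + 1, length(f[1]) - p - 2)
            tname[pid] = name
            total[pid]++
            if (f[2] == "READ") reads[pid]++; else writes[pid]++
        }
        END {
            for (pid in total)
                printf "%s %s %d %d %d\n", tname[pid], pid, total[pid], reads[pid], writes[pid]
        }
    ' | sort -k3,3nr -k2,2n    # highest total first, then by PID
}
```

It would be fed the same way as the Perl script, for example: dmesg | count_block_io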
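The Perl script only prints its report once it catches a signal, which is a little awkward. Still, since so many people strongly recommend it, let's run it the way its documentation says:
while true; do sleep 1; dmesg -c; done | perl iodump.pl
The results appear after pressing Ctrl+C: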
~# while true; do sleep 1; dmesg -c; done | perl iodump.pl
^C# Caught SIGINT.
TASK PID TOTAL READ WRITE DIRTY DEVICES
dd 10679 2001 0 2001 0 vda1
dd 10550 2001 0 2001 0 vda1
dd 10711 2001 0 2001 0 vda1
dd 10560 2001 0 2001 0 vda1
dd 10567 2001 0 2001 0 vda1
dd 10655 2001 0 2001 0 vda1
dd 10625 2001 0 2001 0 vda1
dd 10645 2001 0 2001 0 vda1
dd 10634 2001 0 2001 0 vda1
dd 10548 1428 0 1428 0 vda1
flush-254:0 1998 613 0 613 0 vda1
dd 10731 602 0 602 0 vda1
jbd2/vda1-8 370 331 0 331 0 vda1
sendmail 10595 2 0 2 0 vda1
exim4 10614 1 0 1 0 vda1
python 10605 1 0 1 0 vda1
exim4 10616 1 0 1 0 vda1
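Note that these figures are not bytes. The script adds one to a counter for every kernel log line it matches, so each value counts logged block operations. The block size depends on how the file system was created and can be checked with stat: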
~# stat /boot
File: `/boot'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fe01h/65025d Inode: 652812 Links: 3
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2015-11-06 17:49:27.866159150 +0800
Modify: 2015-11-07 14:20:21.374693644 +0800
Change: 2015-11-07 14:20:21.374693644 +0800
Birth: -
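Rather than picking the value out of the full stat output, the block size can be requested directly. This is a small sketch assuming GNU coreutils stat; the two function names are made up for illustration.

```sh
#!/bin/sh
# io_block_size PATH
# Prints the "IO Block" value for PATH (GNU stat %o, the optimal I/O size hint).
io_block_size() {
    stat -c %o "$1"
}

# fs_block_size PATH
# Prints the fundamental block size of the file system that holds PATH
# (GNU stat -f with %S).
fs_block_size() {
    stat -f -c %S "$1"
}
```

For example, io_block_size /boot would report the same number as the IO Block field above.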
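4: dstat
This one is probably the most pleasant to use, and it shows network traffic as well. It is easy to install straight from the package manager, and just as easy to run: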
~# dstat
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 1 98 0 0 0|2862B 63M| 0 0 | 0 0 | 11k 17k
0 1 97 2 0 0| 0 371M| 652M 965k| 0 0 | 10k 15k
0 1 97 1 0 0| 0 114M| 668M 1011k| 0 0 |9942 16k
0 1 98 1 0 0| 0 115M| 680M 1036k| 0 0 |9914 17k
0 2 97 1 0 0| 0 152M| 652M 979k| 0 0 | 10k 16k
0 2 97 0 0 1| 0 0 | 701M 1089k| 0 0 | 13k 18k
0 2 97 0 0 1| 0 0 | 700M 1077k| 0 0 | 13k 18k
0 2 97 1 0 0| 0 293M| 641M 962k| 0 0 | 10k 16k
0 1 97 1 0 0| 0 355M| 599M 891k| 0 0 |8955 14k
0 1 97 2 0 0| 0 344M| 634M 946k| 0 0 | 11k 16k
0 4 96 0 0 0| 0 8196k| 688M 1042k| 0 0 | 11k 17k
0 2 97 1 0 0| 0 203M| 642M 962k| 0 0 | 10k 16k
0 1 97 1 0 0| 0 391M| 608M 878k| 0 0 |9231 14k
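As the output shows, the machine is not only writing to disk at this point; it is also receiving roughly 600 to 700 MB per second over the network.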
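To compare the disk and network columns numerically, the abbreviated values have to be expanded first. The sketch below does that under the assumption that dstat's B, k, M and G suffixes are powers of 1024; the function name dstat_to_bytes is made up for illustration.

```sh
#!/bin/sh
# dstat_to_bytes VALUE
# Hypothetical helper: expands a dstat-style value such as 2862B, 8196k or 371M
# into a plain byte count. Assumption: suffixes are base 1024.
dstat_to_bytes() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                      # leading numeric part
        u = substr(v, length(v), 1)    # trailing unit letter, if any
        if (u == "k")      n *= 1024
        else if (u == "M") n *= 1024 * 1024
        else if (u == "G") n *= 1024 * 1024 * 1024
        printf "%.0f\n", n
    }'
}
```

For instance, dstat_to_bytes 371M expands the disk write figure from the second sample row above.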