1 Monitoring I/O performance
- iostat -x: disk usage (iostat is provided by the sysstat package)
[root@worker1 ~]# iostat -x
Linux 3.10.0-327.el7.x86_64 (worker1) 10/15/2018 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.02 0.00 0.12 0.10 0.00 99.77
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
scd0 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 42.45 42.45 0.00 42.45 0.00
sdb 0.00 0.00 0.01 0.00 0.07 0.00 12.54 0.00 2.07 2.07 0.00 1.54 0.00
sda 0.00 0.01 0.24 0.20 5.17 0.53 25.94 0.00 8.76 12.54 4.06 4.51 0.20
dm-0 0.00 0.00 0.00 0.00 0.02 0.00 15.46 0.00 2.78 2.78 0.00 2.78 0.00
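Watch the %util column: it is the percentage of time the device was busy servicing I/O requests, so a value that stays close to 100% means the disk is saturated. Note that iostat run without an interval reports averages since boot.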
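Checking %util by eye gets tedious with many devices. The following is a minimal sketch, not part of the original notes: the function name busy_devices and the threshold are illustrative, and it assumes the header line starts with "Device" as in the output above.

```shell
# busy_devices THRESHOLD: read `iostat -x` output on stdin and print
# "device util" for every device whose %util is >= THRESHOLD.
# (Hypothetical helper name; the threshold is whatever you consider busy.)
busy_devices() {
    awk -v limit="$1" '
        /^Device/ {
            # Find the %util column by name, since its position
            # can differ between sysstat versions.
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # Only device rows are long enough to reach the %util column.
        col && NF >= col && ($col + 0) >= (limit + 0) { print $1, $col }
    '
}

# Typical use: take two 1-second samples; the first report is the
# since-boot average, the second reflects the last second.
# iostat -x 1 2 | busy_devices 80
```

Fed the sample output above with a threshold of 0.1, only sda (0.20) is reported.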
- iotop: disk I/O usage per process
[root@worker1 ~]# yum install -y iotop
[root@worker1 ~]# iotop
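2 The free command
- The free command shows memory usage
- total: total amount of memory
- used: memory actually in use
- free: physical memory left over (not allocated to anything, purely unused)
- shared: shared memory; it can usually be ignored
- buff/cache: total memory given over to buffers and cache. The two terms look alike and are easy to confuse. Both are regions of memory used to bridge the speed gap between the CPU and I/O devices such as disks. A simple way to tell them apart: data that the CPU has processed and that is about to be written to disk sits in a buffer; data read from disk and held in memory for the CPU to use sits in the cache. This is a rule of thumb; strictly, free(1) defines buffers as memory used by kernel buffers and cache as memory used by the page cache and slabs.
- available: an estimate of how much memory applications can still use; it includes free. To make the system faster, Linux keeps recently used disk data in buff/cache. That memory counts as allocated, but much of it can be reclaimed when a service needs more memory. So available is roughly free plus the reclaimable part of buff/cache.
The output of free hides a formula: total = used + free + buff/cache (the displayed values can be off by 1 because of rounding). available consists of free plus the part of buff/cache that can still be reclaimed. used is a separate figure and does not include buff/cache.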
[root@worker1 ~]# free -m
total used free shared buff/cache available
Mem: 1832 112 1419 8 300 1569
Swap: 1906 0 1906
[root@worker1 ~]# free -g
total used free shared buff/cache available
Mem: 1 0 1 0 0 1
Swap: 1 0 1
[root@worker1 ~]# free -h
total used free shared buff/cache available
Mem: 1.8G 112M 1.4G 8.5M 300M 1.5G
Swap: 1.9G 0B 1.9G
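The formula total = used + free + buff/cache can be checked against the free -m output. This is a minimal sketch: the function name check_free_sum is illustrative, and it assumes the column order shown above (total, used, free, shared, buff/cache, available).

```shell
# check_free_sum: read `free -m` output on stdin and compare total
# with used + free + buff/cache on the "Mem:" line.
# Assumes the column layout shown in this section.
check_free_sum() {
    awk '$1 == "Mem:" {
        sum = $3 + $4 + $6          # used + free + buff/cache
        printf "total=%d sum=%d diff=%d\n", $2, sum, $2 - sum
    }'
}

# Typical use:
# free -m | check_free_sum
```

For the free -m sample above this prints total=1832 sum=1831 diff=1; the difference of 1 comes from rounding each column to whole MiB.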
3 The ps command
- The ps command lists the processes on the system
[root@worker1 ~]# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.2 41560 3932 ? Ss 09:23 0:03 /usr/lib/systemd/systemd --switched-root --sy
root 2 0.0 0.0 0 0 ? S 09:23 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 09:23 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< 09:23 0:00 [kworker/0:0H]
root 7 0.0 0.0 0 0 ? S 09:23 0:01 [migration/0]
root 8 0.0 0.0 0 0 ? S 09:23 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/0]
root 10 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/1]
root 11 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/2]
root 12 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/3]
root 13 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/4]
root 14 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/5]
root 15 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/6]
root 16 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/7]
root 17 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/8]
root 18 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/9]
root 19 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/10]
root 20 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/11]
root 21 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/12]
root 22 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/13]
root 23 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/14]
root 24 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/15]
root 25 0.0 0.0 0 0 ? S 09:23 0:00 [rcuob/16]
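PID: the process ID. It matters because the kernel identifies and manages each process by its PID. To terminate a process, run "kill PID", which sends SIGTERM. If that does not work, "kill -9 PID" sends SIGKILL, which the process cannot catch or ignore. That gives it no chance to clean up and can lose data, so use it only as a last resort.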
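The advice to try a plain kill first and use -9 only as a last resort can be wrapped in a small helper. This is a sketch under assumptions: the function name stop_gracefully and the 10-second default are illustrative, not from the original notes.

```shell
# stop_gracefully PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds
# (default 10), and send SIGKILL only if the process is still there.
# (Hypothetical helper; pick a timeout that suits the service.)
stop_gracefully() {
    pid=$1
    timeout=${2:-10}
    # If the signal cannot be delivered, the process is already gone
    # (or is not ours to signal); nothing more to do.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$timeout" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the PID exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -KILL "$pid" 2>/dev/null    # last resort, same as kill -9
    return 0
}

# Typical use:
# stop_gracefully 12345 5
```

Note that kill -0 also succeeds for a zombie that has not been reaped yet, so the loop may wait the full timeout in that case.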
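STAT: the process state. The possible values are listed below (no need to memorize them, but you should recognize them)
D: uninterruptible sleep (usually waiting on I/O)
R (running): running or runnable, which includes processes waiting for a CPU time slice
S (sleeping): interruptible sleep, waiting for an event to complete. Most processes on a system are normally in this state
T: stopped or suspended. For example, if you run sleep 10 and press Ctrl+Z to suspend it, ps shows it in state T
W: paging, not enough memory pages to allocate (not valid since kernel 2.6.xx)
X: dead (should never be seen)
Z: zombie, a process that has exited but whose parent has not yet reaped it. It cannot be killed because it is already dead; it holds only a process-table entry, which is harmless in small numbers. A large number of zombies (rare) deserves attention
<: high priority
N: low priority
L: has pages locked into memory
s: session leader
l: multi-threaded
+: in the foreground process group; for example, ps aux run in the current terminal is a foreground process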
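To see at a glance how many processes are in each state, the first character of STAT can be tallied. A minimal sketch, with an illustrative function name count_states; it expects one STAT value per line, which `ps -eo stat=` produces (the trailing = suppresses the header).

```shell
# count_states: read one STAT value per line on stdin and count
# processes by the first character, i.e. the main state
# (D, R, S, T, Z, ...), ignoring modifiers such as s, l, + or <.
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Typical use:
# ps -eo stat= | count_states
```

A non-zero Z line in the result is a quick way to spot zombies.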
4 Checking network status
netstat shows network status
- netstat -lnp: list listening ports
[root@worker1 ~]# netstat -lnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1276/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 2405/master
tcp 0 0 127.0.0.1:6010 0.0.0.0:* LISTEN 2440/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 1276/sshd
tcp6 0 0 ::1:25 :::* LISTEN 2405/master
tcp6 0 0 ::1:6010 :::* LISTEN 2440/sshd: root@pts
raw6 0 0 :::58 :::* 7 720/NetworkManager
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node PID/Program name Path
unix 2 [ ACC ] STREAM LISTENING 15690 720/NetworkManager /var/run/NetworkManager/private
unix 2 [ ACC ] STREAM LISTENING 18438 2405/master private/tlsmgr
unix 2 [ ACC ] STREAM LISTENING 12113 1/systemd /run/lvm/lvmetad.socket
unix 2 [ ACC ] STREAM LISTENING 18179 2405/master public/pickup
unix 2 [ ACC ] STREAM LISTENING 17407 2405/master public/cleanup
unix 2 [ ACC ] STREAM LISTENING 18434 2405/master public/qmgr
unix 2 [ ACC ] STREAM LISTENING 18456 2405/master public/flush
unix 2 [ ACC ] STREAM LISTENING 18471 2405/master public/showq
- netstat -an: show all network connections on the system
[root@worker1 ~]# netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:6010 0.0.0.0:* LISTEN
tcp 0 52 192.168.139.100:22 192.168.139.1:2323 ESTABLISHED
tcp6 0 0 :::22 :::* LISTEN
tcp6 0 0 ::1:25 :::* LISTEN
tcp6 0 0 ::1:6010 :::* LISTEN
raw6 0 0 :::58 :::* 7
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 15690 /var/run/NetworkManager/private
unix 2 [ ACC ] STREAM LISTENING 18438 private/tlsmgr
unix 2 [ ACC ] STREAM LISTENING 12113 /run/lvm/lvmetad.socket
unix 2 [ ACC ] STREAM LISTENING 18179 public/pickup
unix 2 [ ACC ] STREAM LISTENING 17407 public/cleanup
unix 2 [ ACC ] STREAM LISTENING 18434 public/qmgr
unix 2 [ ACC ] STREAM LISTENING 18456 public/flush
- netstat -lntp: show only TCP listening ports, without the UNIX domain sockets
- ss -an: does much the same job as netstat
[root@worker1 ~]# ss -an
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port
nl UNCONN 0 0 0:-432012592 *
nl UNCONN 0 0 0:-1149238576 *
nl UNCONN 0 0 0:0 *
nl UNCONN 0 0 0:-432012592 *
nl UNCONN 768 0 4:0 *
nl UNCONN 4352 0 4:3668 *
nl UNCONN 0 0 6:0 *
nl UNCONN 0 0 7:0 *
nl UNCONN 0 0 9:0 *
nl UNCONN 0 0 9:1 *
nl UNCONN 0 0 9:615 *
nl UNCONN 0 0 10:0 *
nl UNCONN 0 0 11:0 *
nl UNCONN 0 0 15:1277 *
nl UNCONN 0 0 15:-4121 *
- A handy trick: count TCP connections by state
[root@worker1 ~]# netstat -an | awk '/^tcp/{sta[$NF]++} END {for(key in sta) print key,"\t",sta[key]}'
LISTEN 6
ESTABLISHED 1
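The same count can be taken from ss, whose output puts the state in a different place. This is a sketch under assumptions: the function name count_tcp_states is illustrative, and it expects `ss -ant` output, where State is the first column and the first line is a header. ss abbreviates some state names, for example ESTAB instead of ESTABLISHED.

```shell
# count_tcp_states: read `ss -ant` output on stdin and count
# connections per TCP state. Assumes State is the first column
# and that line 1 is the header.
count_tcp_states() {
    awk 'NR > 1 { sta[$1]++ }
         END { for (key in sta) print key, sta[key] }' | sort
}

# Typical use:
# ss -ant | count_tcp_states
```

The sort at the end only makes the order stable, since awk's for-in loop visits keys in no particular order.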
5 Capturing packets on Linux
The packet capture tool is tcpdump. Usage: tcpdump -nn
[root@worker1 ~]# yum install -y tcpdump
- Capture packets on the network interface eno16777736
[root@worker1 ~]# tcpdump -nn -i eno16777736
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno16777736, link-type EN10MB (Ethernet), capture size 262144 bytes
16:41:04.054243 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 565002767:565002979, ack 1018064717, win 318, length 212
16:41:04.054863 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 212:408, ack 1, win 318, length 196
16:41:04.054962 IP 192.168.139.1.2323 > 192.168.139.100.22: Flags [.], ack 212, win 251, length 0
16:41:04.055302 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 408:684, ack 1, win 318, length 276
16:41:04.056905 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 684:848, ack 1, win 318, length 164
16:41:04.057209 IP 192.168.139.1.2323 > 192.168.139.100.22: Flags [.], ack 684, win 256, length 0
16:41:04.057239 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 848:1012, ack 1, win 318, length 164
16:41:04.057561 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 1012:1288, ack 1, win 318, length 276
16:41:04.057693 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 1288:1468, ack 1, win 318, length 180
16:41:04.057766 IP 192.168.139.1.2323 > 192.168.139.100.22: Flags [.], ack 1012, win 255, length 0
16:41:04.057837 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 1468:1744, ack 1, win 318, length 276
16:41:04.057951 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 1744:1924, ack 1, win 318, length 180
16:41:04.058731 IP 192.168.139.1.2323 > 192.168.139.100.22: Flags [.], ack 1924, win 251, length 0
16:41:04.058869 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 1924:2104, ack 1, win 318, length 180
16:41:04.059736 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 2104:2380, ack 1, win 318, length 276
16:41:04.060880 IP 192.168.139.1.2323 > 192.168.139.100.22: Flags [.], ack 2380, win 256, length 0
16:41:04.061539 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 2380:2656, ack 1, win 318, length 276
16:41:04.061672 IP 192.168.139.100.22 > 192.168.139.1.2323: Flags [P.], seq 2656:2836, ack 1, win 318, length 180
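The -i option is followed by the interface name; to capture on another interface, give its name instead. The -nn option prints the source and destination as numeric "IP address.port"; without it, tcpdump shows "hostname.service name".
- Capture packets on port 80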
[root@worker1 ~]# tcpdump -nn -i eno16777736 port 80
- Capture packets to or from 192.168.0.100 that are not on port 22
[root@worker1 ~]# tcpdump -nn not port 22 and host 192.168.0.100
- Capture 100 packets and save them to 1.cap
[root@worker1 ~]# tcpdump -nn -c 100 -w 1.cap
- Read the capture file
[root@worker1 ~]# tcpdump -r 1.cap
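Once a capture is saved, a common next step is to see which hosts sent the most packets. A minimal sketch, with an illustrative function name top_sources; it assumes IPv4 lines in the text format shown earlier, where the third field is "source IP.port".

```shell
# top_sources: read tcpdump -nn text output on stdin and print
# "count source-ip", busiest source first. Only IPv4 lines
# (second field "IP") are counted.
top_sources() {
    awk '$2 == "IP" {
        src = $3
        # TCP/UDP lines show "a.b.c.d.port"; strip the port only when
        # five numeric parts are present, so port-less lines stay intact.
        if (src ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
            sub(/\.[0-9]+$/, "", src)
        count[src]++
    }
    END { for (s in count) print count[s], s }' | sort -k1,1nr -k2,2
}

# Typical use:
# tcpdump -nn -r 1.cap | top_sources
```

Reading with -nn matters here: without it the third field would hold host and service names, and the pattern would not match.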
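The wireshark tool
[root@worker1 ~]# yum install -y wireshark
- A commonly used tshark command, meant for web servers: it prints the time, client IP, host, method and URI of each HTTP request. Newer tshark versions deprecate -R for single-pass filtering; use -Y http.request there.
[root@worker1 ~]# tshark -n -t a -R http.request -T fields -e "frame.time" -e "ip.src" -e "http.host" -e "http.request.method" -e "http.request.uri"
Further reading on the TCP three-way handshake and four-way teardown: http://www.doc88.com/p-9913773324388.html
More tshark usage: http://www.aminglinux.com/bbs/thread-995-1-1.html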