Chapter 2: Service Performance Monitoring and Analysis
2.1 Linux Server Performance Monitoring and Analysis
2.1.4 What lsof Can Show You
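lsof (list open files) is a commonly used Linux command for inspecting open files. It can list which files are currently open on the system, which files a given process has open, which network ports are open, and so on.
1. The lsof command
Running lsof with no arguments lists every file currently open on the system. Run it as root: lsof reads kernel memory and kernel files, and a non-root user generally sees only the files of its own processes.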
[root@1a01vlb9935zzzz ~]# lsof
COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root cwd DIR 253,0 4096 128 /
systemd 1 root rtd DIR 253,0 4096 128 /
systemd 1 root txt REG 253,0 1612152 201494383 /usr/lib/systemd/systemd
systemd 1 root mem REG 253,0 20112 134813502 /usr/lib64/libuuid.so.1.3.0
systemd 1 root mem REG 253,0 261456 134813159 /usr/lib64/libblkid.so.1.1.0
systemd 1 root mem REG 253,0 90664 134322465 /usr/lib64/libz.so.1.2.7
systemd 1 root mem REG 253,0 157424 134322466 /usr/lib64/liblzma.so.5.2.2
systemd 1 root 74u netlink 0t0 13430 KOBJECT_UEVENT
kthreadd 2 root cwd DIR 253,0 4096 128 /
kthreadd 2 root rtd DIR 253,0 4096 128 /
kthreadd 2 root txt unknown /proc/2/exe
ksoftirqd 3 root cwd DIR 253,0 4096 128 /
ksoftirqd 3 root rtd DIR 253,0 4096 128 /
ksoftirqd 3 root txt unknown /proc/3/exe
kworker/0 5 root cwd DIR 253,0 4096 128 /
kworker/0 5 root rtd DIR 253,0 4096 128 /
kworker/0 5 root txt unknown /proc/5/exe
lsof 46413 root 0u CHR 136,0 0t0 3 /dev/pts/0
lsof 46413 root 1u CHR 136,0 0t0 3 /dev/pts/0
lsof 46413 root 2u CHR 136,0 0t0 3 /dev/pts/0
lsof 46413 root 3r DIR 0,3 0 1 /proc
lsof 46413 root 4r DIR 0,3 0 42763103 /proc/46413/fd
sshd 71065 root 1u unix 0xffff8db5b1f71000 0t0 7033487 socket
sshd 71065 root 2u unix 0xffff8db5b1f71000 0t0 7033487 socket
sshd 71065 root 3u IPv4 7034332 0t0 TCP *:ssh (LISTEN)
...
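- COMMAND: name of the process
- PID: process ID
- TID: thread ID within the process; the column is blank when the row describes a process rather than a thread
- USER: owner of the process, that is, the Linux user the process runs as
- FD: file descriptor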
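FD | Full name | Meaning |
cwd | current working directory | the current working directory of the process |
mem | memory-mapped file | a disk file mapped into memory |
txt | program text | the program's code and data; for example, the nginx binary /usr/sbin/nginx appears as txt |
rtd | root directory | the root directory of the process |
ltx | shared library text | code and data of a shared library |
m86 | DOS Merge mapped file | a DOS Merge mapped file |
mmap | memory-mapped device | a device mapped into memory |
err | FD information error | an error occurred while reading the descriptor information |
tr | kernel trace file | a kernel trace file |
DEL | a Linux map file that has been deleted | a mapped file that has since been deleted |
number + letter, e.g. 0u, 1w, 2w | | The number is the descriptor: 0 is standard input, 1 is standard output, 2 is standard error. The letter is the access mode: u means the file is open for reading and writing, r means read-only, w means write-only |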
- TYPE: type of the open file
Type | Full name | Meaning |
DIR | directory | a directory |
CHR | character special file | a character device file |
LINK | symbolic link file | a symbolic link |
IPv4 | IPv4 socket | an IPv4 socket |
IPv6 | IPv6 network file | an IPv6 network file |
REG | regular file | a regular file |
FIFO | FIFO special file | a first-in, first-out queue (named pipe) |
unix | UNIX domain socket | a UNIX domain socket |
MPB | multiplexed block file | a multiplexed block file |
MPC | multiplexed character file | a multiplexed character file |
inet | Internet domain socket | an Internet domain socket |
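- DEVICE: device numbers (major,minor)
- SIZE/OFF: size of the file, when a size is available; values such as 0t0 are file offsets instead
- NODE: inode number of a local file, the inode number of an NFS file on its server host, or the protocol type (for example TCP)
- NAME: absolute path of the file; for a network connection, its addresses, ports and state; or a mount point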
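The FD column described above packs two pieces of information into one value. The sketch below splits such a value into descriptor number and access mode; `describe_fd` is a hypothetical helper name introduced only for illustration, and values without a leading digit (cwd, txt, mem and so on) are passed through unchanged.

```shell
# Split an lsof FD value such as "0u" or "12w" into descriptor number and access mode.
# describe_fd is a hypothetical helper, not part of lsof.
describe_fd() {
    fd=$1
    case $fd in
        [0-9]*) ;;                      # numeric descriptor: decode it below
        *) echo "$fd"; return 0 ;;      # cwd, txt, mem, ...: nothing to decode
    esac
    num=$(printf '%s' "$fd" | sed 's/[^0-9].*$//')
    mode=$(printf '%s' "$fd" | sed 's/^[0-9]*//' | cut -c1)
    case $mode in
        r) access="read-only" ;;
        w) access="write-only" ;;
        u) access="read/write" ;;
        *) access="unknown" ;;
    esac
    echo "fd=$num access=$access"
}

describe_fd 0u
describe_fd 2w
describe_fd cwd
```

Only the first character after the digits is read as the access mode, so a trailing lock character, if lsof prints one, is ignored.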
2. Other lsof options
1. lsof -c name: list the files opened by processes whose command name begins with the given string
[root@1a01vlb9508zzzz ~]# lsof -c nginx
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 34399 root cwd DIR 253,0 4096 269137208 /etc/nginx
nginx 34399 root rtd DIR 253,0 4096 128 /
nginx 34399 root txt REG 253,0 1132976 152725815 /usr/sbin/nginx
nginx 34399 root mem REG 253,0 31824 134347207 /usr/lib64/libnss_dns-2.17.so
nginx 34399 root mem REG 253,0 62184 134347289 /usr/lib64/libnss_files-2.17.so
nginx 34399 root mem REG 253,0 155784 134813212 /usr/lib64/libselinux.so.1
nginx 34399 root mem REG 253,0 106848 134398458 /usr/lib64/libresolv-2.17.so
nginx 34399 root mem REG 253,0 15688 134347434 /usr/lib64/libkeyutils.so.1.5
nginx 34399 root mem REG 253,0 58728 135286936 /usr/lib64/libkrb5support.so.0.1
nginx 34399 root mem REG 253,0 88720 134322078 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
nginx 34399 root mem REG 253,0 1139680 134347089 /usr/lib64/libm-2.17.so
nginx 34399 root mem REG 253,0 995840 134322438 /usr/lib64/libstdc++.so.6.0.19
nginx 34399 root mem REG 253,0 210776 134398450 /usr/lib64/libk5crypto.so.3.1
nginx 34399 root mem REG 253,0 15848 134322460 /usr/lib64/libcom_err.so.2.1
nginx 34399 root mem REG 253,0 963536 134398462 /usr/lib64/libkrb5.so.3.3
nginx 34399 root mem REG 253,0 320840 134398447 /usr/lib64/libgssapi_krb5.so.2.2
nginx 34399 root mem REG 253,0 11464 134322055 /usr/lib64/libfreebl3.so
nginx 34399 root mem REG 253,0 2173512 134322407 /usr/lib64/libc-2.17.so
nginx 34399 root mem REG 253,0 65904 152725797 /usr/lib64/libprofiler.so.0.4.14
nginx 34399 root mem REG 253,0 90664 134322465 /usr/lib64/libz.so.1.2.7
nginx 34399 root mem REG 253,0 2512832 134814137 /usr/lib64/libcrypto.so.1.0.2k
nginx 34399 root mem REG 253,0 470360 135286913 /usr/lib64/libssl.so.1.0.2k
nginx 34399 root mem REG 253,0 402384 134322442 /usr/lib64/libpcre.so.1.2.0
nginx 34399 root mem REG 253,0 41080 134347083 /usr/lib64/libcrypt-2.17.so
nginx 34399 root mem REG 253,0 144792 134347467 /usr/lib64/libpthread-2.17.so
nginx 34399 root mem REG 253,0 19776 134347086 /usr/lib64/libdl-2.17.so
nginx 34399 root mem REG 253,0 164240 134322447 /usr/lib64/ld-2.17.so
nginx 34399 root DEL REG 0,4 11910135 /dev/zero
nginx 34399 root 0u CHR 1,3 0t0 1028 /dev/null
nginx 34399 root 1u CHR 1,3 0t0 1028 /dev/null
nginx 34399 root 2w REG 253,0 0 346533344 /var/log/nginx/error.log
nginx 34399 root 4u unix 0xffff88f59ddd2800 0t0 25256164 socket
nginx 34399 root 5u unix 0xffff88f59ddd0c00 0t0 25256165 socket
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 34399 root 7w REG 253,0 99534760 152725845 /usr/share/nginx/logs/access.log
nginx 34399 root 8w REG 253,0 0 346533344 /var/log/nginx/error.log
nginx 34399 root 10u unix 0xffff88f59ddd0400 0t0 25256166 socket
nginx 34399 root 11u unix 0xffff88f59ddd4400 0t0 25256167 socket
nginx 34399 root 12u unix 0xffff88f59ddd6800 0t0 25256168 socket
nginx 34399 root 13u unix 0xffff88f59ddd2c00 0t0 25256169 socket
nginx 34399 root 14u unix 0xffff88f59ddd5c00 0t0 25256170 socket
nginx 34399 root 15u unix 0xffff88f59ddd7400 0t0 25256171 socket
nginx 34399 root 16u unix 0xffff88f7316c2000 0t0 25256172 socket
nginx 34399 root 17u unix 0xffff88f7316c6400 0t0 25256173 socket
nginx 34399 root 18u unix 0xffff88f7316c6000 0t0 25256174 socket
nginx 34399 root 19u unix 0xffff88f7316c7c00 0t0 25256175 socket
nginx 34399 root 20u unix 0xffff88f7316c0000 0t0 25256176 socket
nginx 34399 root 21u unix 0xffff88f7316c2400 0t0 25256177 socket
nginx 36454 nginx cwd DIR 253,0 4096 269137208 /etc/nginx
nginx 36454 nginx rtd DIR 253,0 4096 128 /
nginx 36454 nginx txt REG 253,0 1132976 152725815 /usr/sbin/nginx
nginx 36454 nginx mem REG 253,0 31824 134347207 /usr/lib64/libnss_dns-2.17.so
nginx 36454 nginx mem REG 253,0 62184 134347289 /usr/lib64/libnss_files-2.17.so
nginx 36454 nginx mem REG 253,0 155784 134813212 /usr/lib64/libselinux.so.1
nginx 36454 nginx mem REG 253,0 106848 134398458 /usr/lib64/libresolv-2.17.so
2. lsof -p pid: list the files opened by the process with the given PID
[root@1a01vlb9508zzzz ~]# lsof -p 1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root cwd DIR 253,0 4096 128 /
systemd 1 root rtd DIR 253,0 4096 128 /
systemd 1 root txt REG 253,0 1612152 201494383 /usr/lib/systemd/systemd
systemd 1 root mem REG 253,0 20112 134813502 /usr/lib64/libuuid.so.1.3.0
systemd 1 root mem REG 253,0 261456 134813159 /usr/lib64/libblkid.so.1.1.0
systemd 1 root mem REG 253,0 90664 134322465 /usr/lib64/libz.so.1.2.7
systemd 1 root mem REG 253,0 157424 134322466 /usr/lib64/liblzma.so.5.2.2
systemd 1 root mem REG 253,0 23968 134347216 /usr/lib64/libcap-ng.so.0.0.0
systemd 1 root mem REG 253,0 19896 134347172 /usr/lib64/libattr.so.1.1.0
systemd 1 root mem REG 253,0 19776 134347086 /usr/lib64/libdl-2.17.so
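The column layout of lsof is awkward to parse in scripts (the TID column, for instance, is only sometimes filled in). lsof also offers a field output mode, -F, in which every line starts with a one-letter field identifier: p for the PID, f for the FD, t for the file type. The sketch below counts open files per type from such output. The helper name `count_file_types` and the sample input are made up for illustration.

```shell
# Count open files per TYPE from lsof field output (-F) read on stdin.
# Lines starting with "t" carry the file type.
count_file_types() {
    awk '/^t/ { n[substr($0, 2)]++ } END { for (t in n) print t, n[t] }' | sort
}

# Illustrative sample of -F output; on a live host one might run:
#   lsof -p 1 -F ft | count_file_types
printf 'p1\nfcwd\ntDIR\nfrtd\ntDIR\nftxt\ntREG\nfmem\ntREG\nfmem\ntREG\n' | count_file_types
```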
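3. lsof -i: list open IPv4 and IPv6 network files. Most entries are network connections: sockets on which a server is listening (LISTEN) and established connections between clients and servers.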
[root@1a01vlb9508zzzz ~]# lsof -i
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 1338 chrony 1u IPv4 19932 0t0 UDP localhost:323
YDService 1453 root 6u IPv4 22892 0t0 TCP 1a01vlb9508zzzz:46825->10.203.194.187:lsi-bobcat (ESTABLISHED)
python 1833 root 12u IPv4 24662 0t0 TCP 1a01vlb9508zzzz:33858->100.80.96.43:netware-csp (ESTABLISHED)
oracle 3997 oracle 14u IPv4 66575413 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60305 (ESTABLISHED)
oracle 4011 oracle 14u IPv4 66573659 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60307 (ESTABLISHED)
oracle 7346 oracle 14u IPv4 66577879 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.28:57113 (ESTABLISHED)
oracle 8711 oracle 14u IPv4 66580772 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:52038 (ESTABLISHED)
oracle 23409 oracle 14u IPv4 66599907 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:51500 (ESTABLISHED)
oracle 30009 oracle 14u IPv4 65437538 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:51708 (ESTABLISHED)
oracle 30121 oracle 14u IPv4 66611082 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:60106 (ESTABLISHED)
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
sshd 36171 root 3u IPv4 66621082 0t0 TCP 1a01vlb9508zzzz:ssh->100.119.92.19:12931 (ESTABLISHED)
sshd 36175 root 3u IPv4 66621114 0t0 TCP 1a01vlb9508zzzz:ssh->100.119.92.19:invision (ESTABLISHED)
oracle 36217 oracle 14u IPv4 66622182 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.179:49176 (ESTABLISHED)
nginx 36454 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36455 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36456 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36457 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36458 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36459 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36460 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
oracle 39608 oracle 14u IPv4 66624042 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:54785 (ESTABLISHED)
epmd 41708 rabbitmq 3u IPv4 92828 0t0 TCP *:epmd (LISTEN)
epmd 41708 rabbitmq 4u IPv4 92861 0t0 TCP localhost:epmd->localhost:36877 (ESTABLISHED)
beam.smp 41783 rabbitmq 44u IPv4 92032 0t0 TCP *:25672 (LISTEN)
beam.smp 41783 rabbitmq 45u IPv4 92034 0t0 TCP localhost:36877->localhost:epmd (ESTABLISHED)
beam.smp 41783 rabbitmq 53u IPv4 92137 0t0 TCP *:amqp (LISTEN)
beam.smp 41783 rabbitmq 55u IPv4 93641 0t0 TCP *:15672 (LISTEN)
oracle 60932 oracle 17u IPv4 5113619 0t0 UDP localhost:36852
oracle 60932 oracle 19u IPv4 12062302 0t0 TCP localhost:24383->localhost:ncube-lm (ESTABLISHED)
oracle 60966 oracle 20u IPv4 5118451 0t0 UDP *:47387
oracle 60970 oracle 30u IPv4 6087940 0t0 UDP *:10766
oracle 60986 oracle 24u IPv4 5120364 0t0 UDP *:15418
oracle 60994 oracle 16u IPv4 5114025 0t0 UDP localhost:52066
oracle 60994 oracle 17u IPv4 5114030 0t0 TCP *:12161 (LISTEN)
oracle 60998 oracle 16u IPv4 5110705 0t0 UDP localhost:30091
tnslsnr 65415 oracle 11u IPv4 12062270 0t0 TCP *:ncube-lm (LISTEN)
tnslsnr 65415 oracle 13u IPv4 12063042 0t0 TCP localhost:ncube-lm->localhost:24383 (ESTABLISHED)
oracle 70036 oracle 14u IPv4 65105564 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.183:50708 (ESTABLISHED)
sshd 71039 root 3u IPv4 54214531 0t0 TCP 1a01vlb9508zzzz:ssh->100.80.96.18:15956 (ESTABLISHED)
oracle 81973 oracle 14u IPv4 66495018 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:57654 (ESTABLISHED)
sshd 87974 root 3u IPv4 36798266 0t0 TCP *:ssh (LISTEN)
oracle 101537 oracle 14u IPv4 66524595 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:47708 (ESTABLISHED)
oracle 120853 oracle 14u IPv4 66552372 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:37334 (ESTABLISHED)
Adding a port number, lsof -i:<port number>, restricts the output to the network connections on that port.
[root@1a01vlb9508zzzz ~]# lsof -i:1521
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
oracle 3997 oracle 14u IPv4 66575413 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60305 (ESTABLISHED)
oracle 4011 oracle 14u IPv4 66573659 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60307 (ESTABLISHED)
oracle 7346 oracle 14u IPv4 66577879 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.28:57113 (ESTABLISHED)
oracle 8711 oracle 14u IPv4 66580772 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:52038 (ESTABLISHED)
oracle 23409 oracle 14u IPv4 66599907 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:51500 (ESTABLISHED)
oracle 30009 oracle 14u IPv4 65437538 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:51708 (ESTABLISHED)
oracle 30121 oracle 14u IPv4 66611082 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:60106 (ESTABLISHED)
oracle 36217 oracle 14u IPv4 66622182 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.179:49176 (ESTABLISHED)
oracle 43840 oracle 14u IPv4 66633996 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:65392 (ESTABLISHED)
oracle 60932 oracle 19u IPv4 12062302 0t0 TCP localhost:24383->localhost:ncube-lm (ESTABLISHED)
tnslsnr 65415 oracle 11u IPv4 12062270 0t0 TCP *:ncube-lm (LISTEN)
tnslsnr 65415 oracle 13u IPv4 12063042 0t0 TCP localhost:ncube-lm->localhost:24383 (ESTABLISHED)
oracle 70036 oracle 14u IPv4 65105564 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.183:50708 (ESTABLISHED)
oracle 81973 oracle 14u IPv4 66495018 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:57654 (ESTABLISHED)
oracle 101537 oracle 14u IPv4 66524595 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:47708 (ESTABLISHED)
oracle 120853 oracle 14u IPv4 66552372 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.182:37334 (ESTABLISHED)
[root@1a01vlb9508zzzz ~]# lsof -i:80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36454 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36455 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36456 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36457 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36458 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36459 nginx 3u IPv4 66641671 0t0 TCP 1a01vlb9508zzzz:http->100.119.92.19:mainsoft-lm (ESTABLISHED)
nginx 36459 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36460 nginx 3u IPv4 66643223 0t0 TCP 1a01vlb9508zzzz:http->100.119.92.19:10919 (ESTABLISHED)
nginx 36460 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
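This machine runs Oracle and nginx, so querying ports 1521 and 80 shows their TCP connections. Hosts such as 100.119.92.17 and 100.119.92.28 have connected to port 1521, and those connections are in the ESTABLISHED state. The port numbers to the right of ->, such as 60305 and 57113, are the ephemeral ports in use on the client hosts; they are chosen at random. lsof prints well-known ports by service name, which is why 1521 appears as ncube-lm and 80 as http. For TCP connection states and TCP tuning, see the section on the TCP network protocol (to be added later).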
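To see at a glance which clients hold the most connections, the NAME column can be aggregated by remote address. This is a minimal sketch that assumes the `lsof -i` layout shown above, where the last field is the state and the field before it is local->remote; `established_by_peer` is a hypothetical helper name.

```shell
# Count ESTABLISHED connections per remote address from `lsof -i` output on stdin.
established_by_peer() {
    awk 'NF >= 2 && $NF == "(ESTABLISHED)" {
             split($(NF - 1), ends, "->")      # ends[1] = local end, ends[2] = remote end
             peer = ends[2]
             sub(/:[^:]*$/, "", peer)          # drop the trailing :port
             count[peer]++
         }
         END { for (p in count) print count[p], p }' | sort -k1,1nr -k2,2
}

# Sample rows taken from the output above.
established_by_peer <<'EOF'
oracle 3997 oracle 14u IPv4 66575413 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60305 (ESTABLISHED)
oracle 4011 oracle 14u IPv4 66573659 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60307 (ESTABLISHED)
oracle 7346 oracle 14u IPv4 66577879 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.28:57113 (ESTABLISHED)
tnslsnr 65415 oracle 11u IPv4 12062270 0t0 TCP *:ncube-lm (LISTEN)
EOF
```

On a live host the input would come from something like `lsof -nP -i:1521`, where -n and -P keep addresses and ports numeric.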
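4. lsof +d path: list the open files in the given directory. lsof +D path: the uppercase D recurses into every subdirectory and lists the open files there as well.
5. lsof absolute-path: list the processes that currently have that file open
6. lsof -i tcp: show all TCP connections. lsof -i tcp:80: show the TCP connections on port 80. tcp can be replaced with udp.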
[root@1a01vlb9508zzzz nginx]# lsof -i tcp
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
YDService 1453 root 6u IPv4 22892 0t0 TCP 1a01vlb9508zzzz:46825->10.203.194.187:lsi-bobcat (ESTABLISHED)
python 1833 root 12u IPv4 24662 0t0 TCP 1a01vlb9508zzzz:33858->100.80.96.43:netware-csp (ESTABLISHED)
oracle 3997 oracle 14u IPv4 66575413 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60305 (ESTABLISHED)
oracle 4011 oracle 14u IPv4 66573659 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:60307 (ESTABLISHED)
oracle 8711 oracle 14u IPv4 66580772 0t0 TCP 1a01vlb9508zzzz:ncube-lm->100.119.92.17:52038 (ESTABLISHED)
oracle 22129 oracle 14u IPv4 66796351 0t0 TCP 1a01vlb9508zzzz:ncube-lm->10.206.9.179:57280 (ESTABLISHED)
sshd 32119 root 3u IPv4 66812200 0t0 TCP 1a01vlb9508zzzz:ssh->100.119.92.19:5592 (ESTABLISHED)
sshd 32126 root 3u IPv4 66812237 0t0 TCP 1a01vlb9508zzzz:ssh->100.119.92.19:5596 (ESTABLISHED)
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36454 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36455 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36456 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36457 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
.....
[root@1a01vlb9508zzzz nginx]# lsof -i udp
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
chronyd 1338 chrony 1u IPv4 19932 0t0 UDP localhost:323
oracle 60932 oracle 17u IPv4 5113619 0t0 UDP localhost:36852
oracle 60966 oracle 20u IPv4 5118451 0t0 UDP *:47387
oracle 60970 oracle 30u IPv4 6087940 0t0 UDP *:10766
oracle 60986 oracle 24u IPv4 5120364 0t0 UDP *:15418
oracle 60994 oracle 16u IPv4 5114025 0t0 UDP localhost:52066
oracle 60998 oracle 16u IPv4 5110705 0t0 UDP localhost:30091
[root@1a01vlb9508zzzz nginx]# lsof -i tcp:80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36454 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36455 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36456 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36457 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36458 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36459 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36460 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
[root@1a01vlb9508zzzz nginx]#
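When the TCP listing is long, a per-state summary is often more useful than the raw rows. The sketch below tallies the state shown in parentheses at the end of each TCP row. It assumes the layout shown above; `tcp_state_counts` is a hypothetical helper name.

```shell
# Tally TCP connection states from `lsof -i tcp` output on stdin.
tcp_state_counts() {
    awk 'NF >= 3 && $(NF - 2) == "TCP" && $NF ~ /^\(.*\)$/ {
             state = $NF
             gsub(/[()]/, "", state)           # "(LISTEN)" -> "LISTEN"
             count[state]++
         }
         END { for (s in count) print s, count[s] }' | sort
}

# Sample rows taken from the outputs above.
tcp_state_counts <<'EOF'
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 34399 root 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36454 nginx 6u IPv4 11910134 0t0 TCP *:http (LISTEN)
nginx 36460 nginx 3u IPv4 66643223 0t0 TCP 1a01vlb9508zzzz:http->100.119.92.19:10919 (ESTABLISHED)
EOF
```

lsof can also filter by state itself, for example `lsof -i tcp -s TCP:LISTEN`.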
2.1.5 Understanding Memory Usage with free
free is the standard Linux command for viewing and monitoring memory usage.
[root@1a01vlb9508zzzz nginx]# free
total used free shared buff/cache available
Mem: 7907328 515544 1382092 1889256 6009692 5159096
Swap: 4063228 350976 3712252
[root@1a01vlb9508zzzz nginx]# free -h
total used free shared buff/cache available
Mem: 7.5G 502M 1.3G 1.8G 5.7G 4.9G
Swap: 3.9G 342M 3.5G
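Mem: usage of physical memory (DRAM). Swap: usage of swap space, the disk area that memory pages are swapped out to.
total is the total amount, used the amount in use, free the amount not used at all, and shared the physical memory used by shared memory (mostly tmpfs).
buff/cache: the physical memory used by kernel buffers and the page cache combined. free reports sizes in KiB by default; free -h prints human-readable units.
- buff: kernel buffers, which cache block-device reads and writes and occupy physical memory directly
- cache: the page cache, the disk cache implemented by the Linux kernel. It keeps data from disk in physical memory, so that repeated reads and writes hit memory instead of the disk; this reduces disk I/O and speeds up file access.
available: an estimate of how much physical memory can be handed to applications without swapping. It is roughly free plus the reclaimable part of buff/cache, not their plain sum: in the output above free + buff/cache is 7391784 KiB, yet available is only 5159096 KiB, because part of the cache (shared memory, for instance) cannot be reclaimed. free is memory that is truly unused; when applications need more, the kernel reclaims pages from buff/cache to satisfy them.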
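Because available, not free, reflects what applications can still obtain, a usage percentage is better derived from it. The sketch below computes (total - available) / total from the Mem: line. It assumes the column layout shown above, in which available is the seventh field; `mem_used_pct` is a hypothetical helper name.

```shell
# Percentage of physical memory that is not available, from `free` output on stdin.
# Assumes available is the 7th field of the "Mem:" line, as in the output above.
mem_used_pct() {
    awk '$1 == "Mem:" { printf "%.1f\n", ($2 - $7) * 100 / $2 }'
}

# Sample taken from the free output above; on a live host: free | mem_used_pct
mem_used_pct <<'EOF'
              total        used        free      shared  buff/cache   available
Mem:        7907328      515544     1382092     1889256     6009692     5159096
Swap:       4063228      350976     3712252
EOF
```

For the sample this prints 34.8, whereas a calculation based on the free column alone would report over 80% in use.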
2.1.6 Finding Problems with top
The top command helps find and locate what is consuming resources on a server.
top - 15:49:26 up 4 days, 20:45, 1 user, load average: 10.75, 12.52, 13.49
Tasks: 10 total, 1 running, 9 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.6%us, 8.0%sy, 0.0%ni, 80.8%id, 0.0%wa, 0.0%hi, 1.6%si, 0.0%st
Mem: 791222028k total, 622245640k used, 168976388k free, 2104k buffers
Swap: 0k total, 0k used, 0k free, 343987268k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17 appdeplo 20 0 30.3g 3.9g 19m S 17.6 0.5 290:23.69 java
1 appdeplo 20 0 105m 2996 2748 S 0.0 0.0 0:00.12 sh
18 appdeplo 20 0 98.6m 1592 1504 S 0.0 0.0 0:00.00 tailf
1667 appdeplo 20 0 105m 3096 2716 S 0.0 0.0 0:00.00 bash
1690 appdeplo 20 0 98.6m 536 464 S 0.0 0.0 0:05.56 tail
1895 appdeplo 20 0 105m 3112 2720 S 0.0 0.0 0:00.02 bash
2142 appdeplo 20 0 105m 3148 2772 S 0.0 0.0 0:00.02 bash
2165 appdeplo 20 0 98.6m 500 424 S 0.0 0.0 0:02.45 tail
2310 appdeplo 20 0 105m 3196 2820 S 0.0 0.0 0:00.01 bash
2334 appdeplo 20 0 14960 1972 1756 R 0.0 0.0 0:00.37 top
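- Line 1: top - 15:49:26 up 4 days, 20:45, 1 user, load average: 10.75, 12.52, 13.49. System summary: the current time is 15:49:26, the system has been up for 4 days, 20 hours and 45 minutes, 1 user is logged in, and the load averages are 10.75 (1 minute), 12.52 (5 minutes) and 13.49 (15 minutes). The kernel samples the number of active processes (runnable or in uninterruptible sleep) every 5 seconds and derives the load averages from those samples with a decaying average. As a rule of thumb, when the load average divided by the number of CPU cores exceeds 3 to 5, the system is heavily overloaded.
- Line 2: Tasks: 10 total, 1 running, 9 sleeping, 0 stopped, 0 zombie. Task summary: 10 processes in total, 1 running on a CPU, 9 sleeping, 0 stopped and 0 zombie.
- Line 3: Cpu(s): 9.6%us, 8.0%sy, 0.0%ni, 80.8%id, 0.0%wa, 0.0%hi, 1.6%si, 0.0%st. CPU usage: 9.6%us is time in user mode, 8.0%sy time in kernel (system) mode, 0.0%ni user-mode time of processes whose priority was changed with nice, 80.8%id idle time, 0.0%wa time spent waiting for I/O, 0.0%hi time servicing hardware interrupts, 1.6%si time servicing software interrupts, and 0.0%st steal time, that is, time the CPU spent waiting for the hypervisor to schedule it. st is only non-zero in virtual machines and stays at 0 on physical hosts.
- Line 4: Mem: 791222028k total, 622245640k used, 168976388k free, 2104k buffers. Physical memory: total is the total physical memory, used the amount in use, free the amount unused, and buffers the physical memory used for kernel buffers.
- Line 5: Swap: 0k total, 0k used, 0k free, 343987268k cached. Swap usage: total is the size of the swap space, used the amount in use, and free the amount unused. In this older top layout the last field, cached, is the size of the page cache (physical memory, not swap); newer versions of top print avail Mem in this position, the memory available to applications.
- Line 6: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND. Column headers of the per-process resource table: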
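- PID: process ID
- USER: user that owns the process
- PR: scheduling priority of the process; the smaller the value, the higher the priority and the sooner the process gets the CPU
- NI: nice value of the process, through which its priority can be adjusted; the lower the nice value, the higher the priority (for ordinary processes top shows PR = 20 + NI)
- VIRT: virtual memory used by the process, in KiB unless a unit suffix (m, g) is shown
- RES: physical memory used by the process that has not been swapped out, usually called resident memory, in KiB unless a unit suffix is shown
- SHR: shared memory used by the process
- S: current state of the process: D uninterruptible sleep, R running, S sleeping, T traced or stopped, Z zombie
- %CPU: share of CPU time used by the process
- %MEM: share of physical memory used by the process
- TIME+: total CPU time the process has consumed
- COMMAND: command that started the process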
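The rule of thumb for load average compares the load with the number of CPU cores, so the ratio is worth computing explicitly. A minimal sketch follows; the core count of 4 is an assumed value for illustration (the top output above does not state it), and `load_per_core` is a hypothetical helper name.

```shell
# Load average divided by the number of CPU cores.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# 1-minute load from the sample above, with an assumed 4 cores.
load_per_core 10.75 4

# On a live Linux host the inputs could come from /proc/loadavg and nproc:
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```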
Other uses of top: top -p pid shows only the given process; top -H -p pid shows all threads of the given process.
[appdeploy@x5-server-web-uat-77f6568cf5-xgx4k deploy]$ top -H -p 17
top - 16:46:09 up 73 days, 3:33, 1 user, load average: 7.78, 12.23, 37.14
Tasks: 374 total, 0 running, 374 sleeping, 0 stopped, 0 zombie
Cpu(s): 13.7%us, 3.2%sy, 0.0%ni, 82.7%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 528253632k total, 269537640k used, 258715992k free, 100k buffers
Swap: 0k total, 0k used, 0k free, 227043484k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
73 appdeplo 20 0 30.2g 1.5g 17m S 5.6 0.3 0:24.85 java
74 appdeplo 20 0 30.2g 1.5g 17m S 5.3 0.3 0:20.44 java
71 appdeplo 20 0 30.2g 1.5g 17m S 3.7 0.3 0:17.64 java
68 appdeplo 20 0 30.2g 1.5g 17m S 2.7 0.3 0:22.56 java
70 appdeplo 20 0 30.2g 1.5g 17m S 1.3 0.3 0:21.91 java
78 appdeplo 20 0 30.2g 1.5g 17m S 1.3 0.3 0:02.78 java
79 appdeplo 20 0 30.2g 1.5g 17m S 1.3 0.3 0:02.70 java
82 appdeplo 20 0 30.2g 1.5g 17m S 1.3 0.3 0:02.72 java
69 appdeplo 20 0 30.2g 1.5g 17m S 1.0 0.3 0:25.87 java
75 appdeplo 20 0 30.2g 1.5g 17m S 1.0 0.3 0:21.27 java
76 appdeplo 20 0 30.2g 1.5g 17m S 1.0 0.3 0:20.59 java
80 appdeplo 20 0 30.2g 1.5g 17m S 1.0 0.3 0:02.77 java
...
In this view the PID column holds the TID (thread ID).
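A busy thread found this way usually has to be matched against a thread dump. For a Java process this sketch assumes the dump tool (jstack, for example) prints native thread IDs in hexadecimal as nid=0x..., so the decimal TID from top is converted first. `tid_to_hex` is a hypothetical helper name.

```shell
# Convert a decimal thread ID from `top -H` to the hexadecimal form used in thread dumps.
tid_to_hex() {
    printf '0x%x\n' "$1"
}

tid_to_hex 73    # TID of the busiest thread in the sample above
```

The result can then be searched for in the thread dump to find the stack of that thread.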
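2.1.7 Network Traffic Monitoring with iftop
With no options iftop monitors the first network interface it finds (typically eth0). -i selects the interface; -P shows port numbers in addition to hosts.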
12.5Kb 25.0Kb 37.5Kb 50.0Kb 62.5Kb
└─────────────────────────────────────────┴─────────────────────────────────────────┴─────────────────────────────────────────┴─────────────────────────────────────────┴──────────────────────────────────────────
1a01vlb9508zzzz => 100.119.92.19 1.03Kb 1.27Kb 1.18Kb
<= 160b 225b 241b
1a01vlb9508zzzz => 10.206.9.180 432b 259b 288b
<= 744b 446b 496b
1a01vlb9508zzzz => 100.80.96.100 0b 173b 572b
<= 0b 300b 1.08Kb
1a01vlb9508zzzz => 10.206.9.149 320b 192b 213b
<= 416b 250b 277b
1a01vlb9508zzzz => 10.203.194.187 0b 194b 162b
<= 0b 164b 137b
1a01vlb9508zzzz => 100.80.96.43 0b 190b 159b
<= 0b 150b 125b
10.206.56.122 => vrrp.mcast.net 160b 160b 160b
<= 0b 0b 0b
10.206.56.119 => vrrp.mcast.net 160b 160b 160b
<= 0b 0b 0b
10.206.56.85 => vrrp.mcast.net 160b 160b 160b
<= 0b 0b 0b
10.206.56.145 => vrrp.mcast.net 160b 160b 160b
<= 0b 0b 0b
1a01vlb9508zzzz => 100.119.92.17 0b 83b 69b
<= 0b 66b 55b
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
TX: cumm: 3.91KB peak: 3.99Kb rates: 1.77Kb 2.33Kb 2.61Kb
RX: 4.51KB 7.09Kb 1.91Kb 2.19Kb 3.01Kb
TOTAL: 8.42KB 11.1Kb 3.68Kb 4.52Kb 5.62Kb
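TX: traffic sent. RX: traffic received. TOTAL: the sum of both. cumm: cumulative traffic since iftop started. peak: peak rate per second. rates: average rates over the last 2 s, 10 s and 40 s; the three columns to the right of each host pair show the same three averages for that pair.
iftop has many more options; see its documentation.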
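The averaged rates can be cross-checked against the kernel's interface byte counters. The sketch below is not how iftop works internally; it only computes an average rate in bits per second from two counter readings. `rate_bps` is a hypothetical helper name and the interface name eth0 in the comment is an assumption.

```shell
# Average rate in bits per second between two byte-counter readings.
# Arguments: first reading, second reading, seconds between them.
rate_bps() {
    echo $(( ($2 - $1) * 8 / $3 ))
}

rate_bps 1000 3000 2

# Live sampling on Linux (interface name is an assumption):
#   a=$(cat /sys/class/net/eth0/statistics/tx_bytes); sleep 2
#   b=$(cat /sys/class/net/eth0/statistics/tx_bytes); rate_bps "$a" "$b" 2
```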
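2.1.8 Monitoring Overall Linux Server Performance with nmon
nmon is a free tool for monitoring Linux server performance. The data it can monitor includes CPU usage, memory usage, kernel statistics, run queue information, disk I/O rates, transfer and read/write rates, network I/O rates, the processes consuming the most resources, and virtual memory usage. Together with nmon_analyser, the data recorded by nmon can be converted into Excel reports. Running nmon opens the interactive view below, where single keys toggle the individual panels.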
┌nmon─16g─────────────────────Hostname=1a01vlb9508zzRefresh= 2secs ───17:25.10────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ │
│ ------------------------------ │
│ _ __ _ __ ___ ___ _ __ For help type H or ... │
│ | '_ \| '_ ` _ \ / _ \| '_ \ nmon -? - hint │
│ | | | | | | | | | (_) | | | | nmon -h - full details │
│ |_| |_|_| |_| |_|\___/|_| |_| │
│ To stop nmon type q to Quit │
│ ------------------------------ │
│ │
│ CentOS Linux release 7.5.1804 (Core) VERSION="7 (Core)" │
│ Vendor=GenuineIntel Model=Intel Core Processor (Skylake) │
│ MHz=2197.454 bogomips=4394.90 lscpu:CPU=4 Little Endian │
│ ProcessorChips=1 PhyscalCores=1 Sockets=4 Cores=1 Thrds=1 │
│ VirtualCPUs =4 MHz=2197 max=0 min=0 │
│ │
│ Use these keys to toggle statistics on/off: │
│ c = CPU l = CPU Long-term - = Faster screen updates │
│ C = " WideView U = Utilisation + = Slower screen updates │
│ m = Memory V = Virtual memory j = File Systems │
│ d = Disks n = Network . = only busy disks/procs │
│ r = Resource N = NFS h = more options │
│ k = Kernel t = Top-processes q = Quit │
│─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────│
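q : stop and quit nmon
h : show help
c : CPU statistics
m : memory statistics
d : disk statistics
k : kernel statistics
n : network statistics
N : NFS statistics
j : file system statistics
t : top processes (highest resource usage)
V : virtual memory statistics
v : verbose mode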
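The data can also be exported from the command line and then charted for performance analysis; a later document covers this in detail.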
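As a rough preview of the command-line export, nmon can record to a file with -f, taking a snapshot every -s seconds for -c snapshots. The interval and count below are illustrative values, and `capture_seconds` is a hypothetical helper that only works out how long such a recording runs.

```shell
interval=10   # seconds between snapshots (illustrative value)
count=360     # number of snapshots (illustrative value)

# Wall-clock time covered by a recording: interval * count.
capture_seconds() {
    echo $(( $1 * $2 ))
}

echo "capture covers $(capture_seconds "$interval" "$count") seconds"

# On a host with nmon installed, the recording itself would be started with:
#   nmon -f -s "$interval" -c "$count"
```

The resulting .nmon file is the input that nmon_analyser turns into a report.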
Kernel statistics:
RunQueue: size of the operating system's run queue
Blocked: number of processes that are blocked, typically waiting for I/O to complete
Context Switch: context switches per second
Forks: fork calls per second
Interrupts: CPU interrupts per second
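The same kind of counters can be read directly from /proc/stat on Linux. The mapping assumed here (ctxt for context switches, processes for forks, intr for interrupts, procs_running and procs_blocked for the run queue and blocked processes) is this sketch's reading of the kernel counters, not a statement about how nmon is implemented. `stat_field` and `counter_rate` are hypothetical helper names, and the sample values are made up.

```shell
# Print the first value of a named line from /proc/stat-style input on stdin.
stat_field() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Per-second rate between two readings of a cumulative counter.
counter_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Made-up sample in /proc/stat format; on a live host: stat_field ctxt < /proc/stat
sample='ctxt 123456
processes 7890
procs_running 2
procs_blocked 0'

printf '%s\n' "$sample" | stat_field ctxt
printf '%s\n' "$sample" | stat_field procs_blocked
counter_rate 123456 125456 2
```

ctxt, processes and intr are cumulative since boot, so a per-second figure needs two readings a known interval apart, which is what `counter_rate` does.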