Troubleshooting Programs on Linux: A Summary of Commonly Used Diagnostic Commands

1: Check CPU load -- mpstat

mpstat -P ALL [interval [count]]

The parameters are:

-P ALL    monitor all CPUs

interval  time in seconds between two consecutive samples

count     number of samples to take

mpstat reads its data from /proc/stat.

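The output columns mean the following, where Δ denotes the change in a counter over the interval:

CPU     processor ID

user    percentage of CPU time spent in user mode during the interval, excluding niced processes: Δuser / Δtotal × 100

nice    percentage of CPU time spent in user mode by niced processes (those running at lowered priority, i.e. with a positive nice value): Δnice / Δtotal × 100

system  percentage of CPU time spent in kernel mode: Δsystem / Δtotal × 100

iowait  percentage of time the CPU was idle while waiting for disk I/O: Δiowait / Δtotal × 100

irq     percentage of time spent servicing hardware interrupts: Δirq / Δtotal × 100

soft    percentage of time spent servicing software interrupts: Δsoftirq / Δtotal × 100

idle    percentage of time the CPU was idle for any reason other than waiting for disk I/O: Δidle / Δtotal × 100

intr/s  number of interrupts the CPU received per second during the interval: Δintr / interval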

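The cumulative CPU time counters are combined as follows:

total_cur = user_cur + system_cur + nice_cur + idle_cur + iowait_cur + irq_cur + softirq_cur

total_pre = user_pre + system_pre + nice_pre + idle_pre + iowait_pre + irq_pre + softirq_pre

Δuser = user_cur - user_pre (and likewise for the other fields)

Δtotal = total_cur - total_pre

Here _cur denotes the current value and _pre the value one interval earlier. All of the percentages above are reported to two decimal places.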
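The calculation above can be sketched in a few lines of Python. This is only an illustration, not mpstat's actual code: the function name cpu_percentages and the two sample lines are made up, and only the seven counters named above are used. Newer kernels append further fields such as steal to the cpu line, and the sketch ignores them.

```python
def cpu_percentages(prev_line, cur_line):
    """Return CPU usage (%) per field between two 'cpu' lines from /proc/stat."""
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
    # A line looks like: "cpu  user nice system idle iowait irq softirq ..."
    prev = [int(v) for v in prev_line.split()[1:8]]
    cur = [int(v) for v in cur_line.split()[1:8]]
    deltas = [c - p for c, p in zip(cur, prev)]
    total = sum(deltas)
    if total <= 0:
        # No time elapsed between the samples; avoid dividing by zero.
        return {name: 0.0 for name in names}
    return {name: round(d * 100 / total, 2) for name, d in zip(names, deltas)}


# Hypothetical counter values, chosen so that the total delta is 1000 ticks.
prev_sample = "cpu  1000 50 300 8000 100 10 40"
cur_sample = "cpu  1200 50 400 8600 150 20 80"
usage = cpu_percentages(prev_sample, cur_sample)
print(usage)
```

On a live system the two lines would come from reading the first line of /proc/stat twice, one interval apart.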

2: Check disk I/O and CPU load -- vmstat

    usage: vmstat [-V] [-n] [delay [count]]
      -V         prints version.
      -n         causes the headers not to be reprinted regularly.
      -a         print inactive/active page stats.
      -d         prints disk statistics
      -D         prints disk table
      -p         prints disk partition statistics
      -s         prints vm table
      -m         prints slabinfo
      -S         unit size
      delay      is the delay between updates in seconds.
      unit size  k:1000 K:1024 m:1000000 M:1048576 (default is K)
      count      is the number of updates.

vmstat reads its data from /proc/stat (its man page also lists /proc/meminfo and /proc/*/stat).

The output fields mean:

FIELD DESCRIPTION FOR VM MODE

Procs

r: The number of processes waiting for run time.

b: The number of processes in uninterruptible sleep.

Memory

swpd: the amount of virtual memory used.

free: the amount of idle memory.

buff: the amount of memory used as buffers.

cache: the amount of memory used as cache.

inact: the amount of inactive memory. (-a option)

active: the amount of active memory. (-a option)

Swap

si: Amount of memory swapped in from disk (/s).

so: Amount of memory swapped to disk (/s).

IO

bi: Blocks received from a block device (blocks/s).

bo: Blocks sent to a block device (blocks/s).

System

in: The number of interrupts per second, including the clock.

cs: The number of context switches per second.

CPU

These are percentages of total CPU time.

us: Time spent running non-kernel code. (user time, including nice time)

sy: Time spent running kernel code. (system time)

id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.

wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.

st: Time spent in involuntary wait. Prior to Linux 2.6.11, shown as zero.
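To show how these fields might be read together, here is a rough Python sketch that turns one vmstat row into a dictionary and derives a few hints from it. Everything in it is an assumption made for illustration: the helper names parse_vmstat and diagnose, the sample row, and the thresholds (run queue longer than the CPU count, wa above 10%) are rules of thumb, not values taken from vmstat or its documentation.

```python
def parse_vmstat(header_line, data_line):
    """Pair each column name of a vmstat header with the value below it."""
    keys = header_line.split()
    values = [int(v) for v in data_line.split()]
    return dict(zip(keys, values))


def diagnose(sample, cpu_count=1, wa_threshold=10):
    """Return coarse hints; the thresholds are illustrative assumptions."""
    hints = []
    if sample["r"] > cpu_count:
        hints.append("run queue longer than CPU count")
    if sample["si"] > 0 or sample["so"] > 0:
        hints.append("swapping in progress")
    if sample["wa"] > wa_threshold:
        hints.append("high I/O wait")
    return hints


# Hypothetical header and data row in the default (VM mode) layout.
header = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
row = "3 1 0 16936 85540 126384 0 0 120 40 1050 2300 35 10 40 15 0"
sample = parse_vmstat(header, row)
hints = diagnose(sample, cpu_count=2)
print(hints)
```

The first row vmstat prints reports averages since boot, so in practice the second and later rows are usually the ones worth checking.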

3: Check memory usage -- free

    usage: free [-b|-k|-m|-g] [-l] [-o] [-t] [-s delay] [-c count] [-V]
      -b,-k,-m,-g  show output in bytes, KB, MB, or GB
      -l           show detailed low and high memory statistics
      -o           use old format (no -/+buffers/cache line)
      -t           display total for RAM + swap
      -s           update every [delay] seconds
      -c           update [count] times
      -V           display version information and exit

    [root@Linux /tmp]# free
                 total       used       free     shared    buffers     cached
    Mem:        255268     238332      16936          0      85540     126384
    -/+ buffers/cache:      26408     228860
    Swap:       265000          0     265000

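Mem: physical memory statistics.

-/+ buffers/cache: physical memory statistics adjusted for buffers and cache.

Swap: usage of the swap partition on disk, which is not our concern here.

The system has 255268 KB of physical memory in total (a nominal 256 MB). However, the memory actually available right now is not the 16936 KB shown under free on the first line; that figure only counts memory that has not been allocated at all.

Line 1, Mem:

total: total physical memory.

used: total allocated memory, including buffers and cache, part of which may not actually be in use.

free: memory that has not been allocated.

shared: shared memory; it is rarely relevant here and is not discussed further.

buffers: buffers the system has allocated but that are not currently in use.

cached: cache the system has allocated but that is not currently in use. The difference between buffers and cache is explained below.

total = used + free

Line 2, -/+ buffers/cache:

used: used from line 1 minus buffers and cached; this is the memory actually in use.

free: unallocated memory plus buffers and cache; this is the memory actually available to the system right now.

free2 = buffers1 + cached1 + free1  (free2 is from line 2; buffers1, cached1 and free1 are from line 1)

Buffers versus cache

A buffer is something that has yet to be "written" to disk.

A cache is something that has been "read" from the disk and stored for later use.

Two points of view:

From the operating system's point of view (the Mem line), buffers and cache count as used, so it reports only 16936 KB free.

From an application's point of view (the -/+ buffers/cache line), buffers and cache are as good as free: they exist to speed up file access, and they are reclaimed quickly when an application needs memory.

So from an application's point of view, available memory = free + buffers + cached.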
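The arithmetic can be checked against the sample output above with a short Python sketch. The function name app_view is made up for illustration; the numbers are the ones from the Mem line of the sample.

```python
def app_view(used, free, buffers, cached):
    """Derive the '-/+ buffers/cache' line from the values on the Mem line."""
    app_used = used - buffers - cached   # memory really held by applications
    app_free = free + buffers + cached   # memory applications could still get
    return app_used, app_free


# Values (in KB) taken from the Mem line of the sample output.
total, used, free, buffers, cached = 255268, 238332, 16936, 85540, 126384
app_used, app_free = app_view(used, free, buffers, cached)
print(app_used, app_free)  # 26408 228860
```

Both results match the second line of the sample output. Newer releases of free no longer print the -/+ buffers/cache line and report an available column instead, so this applies to the old output format shown here.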

Swap

Swap is the virtual memory partition on Linux. Once physical memory is used up, disk space (the swap partition) is used as if it were memory.

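4: Check network interfaces -- sar

See the man page for details.

4.1: Check interface traffic: sar -n DEV delay count

The maximum traffic a server can carry is determined by its network card: 10 Mbit/s, 10/100 Mbit/s auto-negotiating, 100 Mbit/s and faster, or 1 Gbit/s. Ordinary servers typically use 100 Mbit/s cards, and some use gigabit.

Output fields: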

IFACE     Name of the network interface for which statistics are reported.

rxpck/s   Total number of packets received per second.

txpck/s   Total number of packets transmitted per second.

rxbyt/s   Total number of bytes received per second.

txbyt/s   Total number of bytes transmitted per second.

rxcmp/s   Number of compressed packets received per second (for cslip etc.).

txcmp/s   Number of compressed packets transmitted per second.

rxmcst/s  Number of multicast packets received per second.
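To relate rxbyt/s and txbyt/s to the capacity of the card, the byte rates can be converted to a percentage of the link speed. The sketch below is an illustration under stated assumptions: the function name link_utilization and the sample rates are hypothetical, the link is assumed to be full duplex (each direction is compared with the full link speed), and protocol and framing overhead are ignored.

```python
def link_utilization(rx_bytes_per_s, tx_bytes_per_s, link_mbit):
    """Return (rx %, tx %) of link capacity for the given byte rates."""
    # Link speeds are quoted in bits per second; sar reports bytes per second.
    capacity_bytes = link_mbit * 1_000_000 / 8
    rx_pct = round(rx_bytes_per_s * 100 / capacity_bytes, 2)
    tx_pct = round(tx_bytes_per_s * 100 / capacity_bytes, 2)
    return rx_pct, tx_pct


# Hypothetical rates on a 100 Mbit/s card (capacity 12,500,000 bytes/s).
rx_pct, tx_pct = link_utilization(6_250_000, 1_250_000, link_mbit=100)
print(rx_pct, tx_pct)
```

Some newer sysstat versions report rxkB/s and txkB/s instead of byte counts; in that case the values would need to be converted to bytes before calling the function.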

4.2: Check interface errors: sar -n EDEV delay count

Output fields:

IFACE     Name of the network interface for which statistics are reported.

rxerr/s   Total number of bad packets received per second.

txerr/s   Total number of errors that happened per second while transmitting packets.

coll/s    Number of collisions that happened per second while transmitting packets.

rxdrop/s  Number of received packets dropped per second because of a lack of space in linux buffers.

txdrop/s  Number of transmitted packets dropped per second because of a lack of space in linux buffers.

txcarr/s  Number of carrier-errors that happened per second while transmitting packets.

rxfram/s  Number of frame alignment errors that happened per second on received packets.

rxfifo/s  Number of FIFO overrun errors that happened per second on received packets.

txfifo/s  Number of FIFO overrun errors that happened per second on transmitted packets.

5: Find the offending process -- top, ps

top -d delay    see the man page for details

ps aux    show detailed information about processes

ps axf    show the process tree

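6: See which files a process is using -- lsof

Root privileges are needed to see everything; otherwise only what the logged-in user is permitted to see is shown.

lsof -p 77  # list the files opened by the process with PID 77

lsof -d 4  # list the processes using file descriptor 4

lsof abc.txt  # list the processes that have abc.txt open

lsof -i :22  # list the processes using port 22

lsof -i tcp  # list the processes using TCP

lsof -i tcp:22  # list the processes using TCP port 22

lsof +d /tmp  # list the files in /tmp that are opened by processes

lsof +D /tmp  # same as above, but also descends into subdirectories, which takes longer

lsof -u username  # list the files opened by processes owned by user username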

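7: Trace what a program is doing -- strace

    usage: strace [-dffhiqrtttTvVxx] [-a column] [-e expr] ... [-o file]
                  [-p pid] ... [-s strsize] [-u username] [-E var=val] ...
                  [command [arg ...]]
       or: strace -c [-e expr] ... [-O overhead] [-S sortby] [-E var=val] ...
                  [command [arg ...]]

Commonly used options:

-f: trace child processes in addition to the current process.

-c: report the time spent, the number of calls, and the number of errors for each system call.

-o file: write the output to file instead of standard error (stderr).

-p pid: attach to the running process with the given pid. This is often used to debug background processes.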

8: Check disk usage -- df

    test@wolf:~$ df
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/sda1        3945128   1810428   1934292  49% /
    udev              745568        80    745488   1% /dev
    /dev/sda3       12649960   1169412  10837948  10% /usr/local
    /dev/sda4       63991676  23179912  37561180  39% /data
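Note that Use% is not simply Used divided by 1K-blocks. In the sample above the figures are consistent with Used / (Used + Available), rounded up to the next whole percent, which is how GNU df appears to compute it; Used plus Available is typically smaller than 1K-blocks because some blocks are reserved. The Python sketch below reproduces the column from the sample rows. The function name use_percent is hypothetical.

```python
def use_percent(used, available):
    """Use% as in the sample: used / (used + available), rounded up."""
    total = used + available
    if total == 0:
        return 0
    # Integer ceiling division avoids floating-point rounding surprises.
    return (used * 100 + total - 1) // total


# (filesystem, used, available) in 1K blocks, copied from the sample output.
rows = [
    ("/dev/sda1", 1810428, 1934292),
    ("udev", 80, 745488),
    ("/dev/sda3", 1169412, 10837948),
    ("/dev/sda4", 23179912, 37561180),
]
percentages = {name: use_percent(used, avail) for name, used, avail in rows}
print(percentages)
```

The computed values agree with the Use% column of the sample: 49, 1, 10 and 39.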

9: Check network connections -- netstat

Commonly used: netstat -lpn

Options:

-p, --programs  display PID/Program name for sockets

-l, --listening  display listening server sockets

-n, --numeric  don't resolve names

-a, --all, --listening  display all sockets (default: connected)
