学习的时候看到的文章,虽然是英文的,但是还是蛮简单的,很容易读懂的,转载过来。留做记录
感觉作者,鄙视下自己,我又无耻的转载了
原文链接http://mylinuxbook.com/linux-strace-command-a-magnificent-troubleshooter/
Trying to run a program but facing weird errors? Well, most of the Linux users might have faced this situation. Many of us either spend hours searching on internet about the problem or wait for days on forums to get a reply. Sometimes we even give up and un-install the program. Very few of us know that there exists a Linux command line utility that can well be a good starting point for debugging these kind of problems. I am talking about the Linux strace command.
Linux strace command
What is strace
The Linux strace command is a utility that can be used to trace down the interaction between a program and Linux kernel. The interaction here means the Linux system calls that the program uses and the Linux signals that the program receives.
From the man page of Linux strace command :
In the simplest case strace runs the specified command/program until it exits. It intercepts and records the system calls which are called by a process and the signals which are received by a process. The name of each system call, its arguments and its return value are printed on standard error or to the file specified with the -o option.
The basic syntax of strace command is :
strace [command or program][arguments]
So we see that the strace command can be run by just typing it’s name along with the program and its arguments from the command line.
Some capabilities of strace command
Here are some of the capabilities of the Linux strace command :
1. A basic example
In its very basic form, the strace command can be used with an executable.
For example :
$ sudo strace ls execve("/bin/ls", ["ls"], [/* 18 vars */]) = 0 brk(0) = 0x8c8f000 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0 mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb774d000 close(3) = 0 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@A\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0644, st_size=120748, ...}) = 0 mmap2(NULL, 125852, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb772e000 mmap2(0xb774b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb774b000 close(3) = 0 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\30\0\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0644, st_size=30684, ...}) = 0 mmap2(NULL, 33360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7725000 mmap2(0xb772c000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb772c000 ... ... ...
In the output above, there is lots of information related to the system calls being used and their return values. You can see that the calls to access system call fail with No such file or directory error. Although this failure does not have much impact on the execution of ls command but this example should give you a good idea as to why Linux strace command is so helpful.
2. Produce a summary report
The strace command has an option -c. Through this option, the strace command counts time, calls, and errors for each system call and reports a summary on program exit.
For example :
$ strace -c ls Desktop Documents Downloads examples.desktop grof groff.txt Music Pictures practice Public Templates Ubuntu One Videos % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- -nan 0.000000 0 9 read -nan 0.000000 0 1 write -nan 0.000000 0 10 open -nan 0.000000 0 13 close -nan 0.000000 0 1 execve -nan 0.000000 0 9 9 access -nan 0.000000 0 3 brk -nan 0.000000 0 2 ioctl -nan 0.000000 0 3 munmap -nan 0.000000 0 1 uname -nan 0.000000 0 10 mprotect -nan 0.000000 0 2 rt_sigaction -nan 0.000000 0 1 rt_sigprocmask -nan 0.000000 0 1 getrlimit -nan 0.000000 0 25 mmap2 -nan 0.000000 0 11 fstat64 -nan 0.000000 0 2 getdents64 -nan 0.000000 0 1 1 futex -nan 0.000000 0 1 set_thread_area -nan 0.000000 0 1 set_tid_address -nan 0.000000 0 1 statfs64 -nan 0.000000 0 1 openat -nan 0.000000 0 1 set_robust_list ------ ----------- ----------- --------- --------- ---------------- 100.00 0.000000 110 10 total
So we see that a summary table is produced in the output that counts the number of times a system call is made, the number of times a particular system call returned error and the timing information for a system call.
3. Prefix each line with time of day
Each line of the strace output can be prefixed with the time of day. This can be done using the -t option.
For example:
~$ strace -t ls 14:05:49 execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0 14:05:49 brk(0) = 0x8b91000 14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) 14:05:49 mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb76f4000 14:05:49 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) 14:05:49 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0 14:05:49 mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb76e2000 14:05:49 close(3) = 0 14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) 14:05:49 open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 14:05:49 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0@A\0\0004\0\0\0"..., 512) = 512 14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=120748, ...}) = 0 14:05:49 mmap2(NULL, 125852, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76c3000 14:05:49 mmap2(0xb76e0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c) = 0xb76e0000 14:05:49 close(3) = 0 14:05:49 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) 14:05:49 open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3 14:05:49 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\30\0\0004\0\0\0"..., 512) = 512 14:05:49 fstat64(3, {st_mode=S_IFREG|0644, st_size=30684, ...}) = 0 14:05:49 mmap2(NULL, 33360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb76ba000 14:05:49 mmap2(0xb76c1000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb76c1000 14:05:49 close(3) ... ... ...
In the output above as you can see, all the lines were prefixed with time of the day. Similarly, the options -tt and -ttt can be used to prefix various other time units like microseconds etc.
4. Get the time spent in each system call
If you are a developer of a program and want to know the time spent in each system call, then the -T option can be used with Linux strace command.
For example :
$ strace -T ls execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0 <0.000399> brk(0) = 0x84f3000 <0.000036> access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000047> mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a6000 <0.000063> access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000066> ... ... ...
So we see that time spent in each system call is printed at the end of each line.
5. Trace only specified system calls
This is an extremely useful feature of this command. Through this one can trace only those sytem calls that are required. This makes the analysis fast and easy. This is be achived using the ‘-e trace’ option.
For example :
$ strace -e trace=open ls open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libacl.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3 open("/lib/i386-linux-gnu/libattr.so.1", O_RDONLY|O_CLOEXEC) = 3 open("/proc/filesystems", O_RDONLY|O_LARGEFILE) = 3 open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3 ... ... ...
So we see that all the open system calls were traced.
6. Redirect the output to a file
Usually the output of strace command is a bit messy and that is the reason why mostly it is preferred to have the output in a text file so that detailed analysis can be done easily. This is possible using the -o option.
For example :
$ strace -o strace.out -T ls Desktop Downloads grof Music practice strace.out Ubuntu One Documents examples.desktop groff.txt Pictures Public Templates Videos $ cat strace.out execve("/bin/ls", ["ls"], [/* 38 vars */]) = 0 brk(0) = 0x8115000 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77b1000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
So we see that all the strace output was redirected to the file strace.out and the ls command output was produced on stdout.
These are only some of the capabilities of strace command. There are lots and lots of other options in the man page of Linux strace command.
Can I use strace
Yes. Any one using Linux can use this command. It doesn’t really matter if you are a developer, system administrator, network administrator or just an end user. As an when you face a problem and think that strace can be helpful you must use this command.
System administrators can use this command to interrogate what a command or program is trying to access internally. This helps them to keep malicious programs at bay.
Developers can use this command to check the time spent by their program in system calls. This helps them to optimize their code. Also they can use Linux strace command to study the programs for which the source code is not available. For study purposes there is another friend of strace, the ltrace command, that can be used.
End users can use this command in case they are facing any issue related to a program which is not executing properly. Even if they are not able to understand the output of strace, at least they can share this output with the experienced people on forums and can get their answers fast.
A practical example
In this section, I will present a practical examples to give you an idea how magnificent troubleshooter the Linux strace command is.
Example
In this example, I will show you how one can use strace to debug a problem. Note that I have simplified the problem to give you an idea about how it can be used to solve a practical development related problem.
Consider the following code :
#include<stdio.h> int main(void) { if(NULL == fopen("MyLinuxBook","rw")) { printf("\n program failed\n"); } else { printf("\n program successful\n"); } return 0; }
The program above tries to open a file named ‘MyLinuxBook’. If file opening fails then program fails else program is successful. Now suppose that you do not have this code with you and you just try to execute the program binary :
$ ./test_strace program failed
Now you see the output says that the program failed but it does not give you an idea as to why the program failed?
Lets use strace command to debug this :
$ strace ./test_strace
execve("./test_strace", ["./test_strace"], [/* 39 vars */]) = 0
brk(0) = 0x8a8b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7768000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=71704, ...}) = 0
mmap2(NULL, 71704, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7756000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0
mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75ac000
mprotect(0xb774f000, 4096, PROT_NONE) = 0
mmap2(0xb7750000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7750000
mmap2(0xb7753000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7753000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75ab000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb75ab900, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7750000, 8192, PROT_READ) = 0
mprotect(0x8049000, 4096, PROT_READ) = 0
mprotect(0xb778b000, 4096, PROT_READ) = 0
munmap(0xb7756000, 71704) = 0
brk(0) = 0x8a8b000
brk(0x8aac000) = 0x8aac000
open("MyLinuxBook", O_RDONLY) = -1 ENOENT (No such file or directory)
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7767000
write(1, "\n", 1
) = 1
write(1, " program failed\n", 16 program failed
) = 16
exit_group(0)
When you analyse the output of the strace command, you will see that the open() system call failed to open file ‘MyLinuxBook’ with error ‘No such file or directory’. So now you check the file ‘MyLinuxBook’ in current directory and you will see that because this file does not exist so the program failed.
This was a small demo as to how strace helps to debug problems. Many complex problems can be sorted out using Linux strace command.