ioprofile

http://aspersa.googlecode.com/svn/html/ioprofile.html

 

The ioprofile tool captures a process's I/O activity through lsof and strace and summarizes it. The result is a tabular display that shows you where the process spent its time on I/O operations. It performs a cross-tabulation ("pivot table") on the I/O operations to summarize them.

You might need to be careful with this tool. strace is generally safe, but there is always a chance of a bug that could cause problems with the process you're profiling. It's also possible in some cases for strace to add a lot of overhead to the process you're tracing. The faster the I/O normally is, the larger the relative overhead can be, so you might see this more dramatically on a database on a FusionIO card, for example.

Command-Line Options and Environment Variables

The tool has the following command-line options, which must come first on the command-line, before any filenames:

-a FUNCTION
Specifies the aggregation function to perform on each cell of the tabular output. By default, it is 'sum', but 'avg' is also available.
-b BINARY
Specifies the name of a process to trace and summarize. The default value is 'mysqld'.
-c CELL
Specifies what value to place into the cells of the tabular output. By default, it is 'times', which means that the cells contain the timing information about I/O operations. You can specify 'count' for a simple count, and 'sizes' for the size of the operations, in bytes.
-g GROUPBY
Specifies the item by which the I/O operations are aggregated. By default, they are aggregated by 'filename'. You can aggregate them by 'pid' to get a per-thread view of the I/O, and by 'all' to get an overall view.
-k KEEPFILE
Specifies a file to hold the strace data. The specified file will not be removed when the program finishes, so you can re-analyze it if you wish.
-p PID
Specifies a process ID to trace and summarize. Causes -b to be ignored.
-s SLEEPTIME
Specifies how long to profile, in seconds. The default value is 30.

Any additional arguments on the command line are treated as the names of files containing previously gathered lsof and strace data to be summarized. In this case, the tool doesn't gather any traces; it only processes the ones you give it.

How it Works

The ioprofile tool begins by capturing a single sample of lsof output, which identifies the profiled process's file descriptors and the corresponding filenames. It then starts strace and waits the specified amount of time, after which it stops strace and processes the results.
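As a rough illustration of the first step, the sketch below turns lsof output into a map from file descriptor number to filename. It assumes lsof's default column layout (COMMAND, PID, USER, FD, ...) and filenames without spaces; the function name and the sample text are made up for this example and are not taken from ioprofile itself.

```python
import re


def fd_map_from_lsof(lsof_text):
    """Map numeric file descriptors to filenames from default lsof output."""
    fd_names = {}
    for line in lsof_text.splitlines()[1:]:        # first line is the header
        fields = line.split()
        if len(fields) < 5:
            continue
        fd = re.match(r"\d+", fields[3])           # FD column, e.g. "14u" or "cwd"
        if fd:
            # Assumption: NAME is the last column and contains no spaces.
            fd_names[int(fd.group())] = fields[-1]
    return fd_names


# Made-up sample in the shape of default lsof output.
SAMPLE = """\
COMMAND   PID  USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
mysqld  20974 mysql  cwd    DIR    8,1     4096 131073 /data/data
mysqld  20974 mysql    3uW  REG    8,1 10485760 131080 /data/data/ibdata1
mysqld  20974 mysql   14u   REG    8,1    98304 131090 /data/data/abd/9fvus.ibd
"""

if __name__ == "__main__":
    for fd, name in sorted(fd_map_from_lsof(SAMPLE).items()):
        print(fd, name)
```

Entries such as cwd or txt have no numeric descriptor, so they are skipped; only descriptors that can appear as the first argument of a system call are kept.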

It's important to know how strace is stopped, because processes that are being traced can be in a delicate state. For example, system calls can be interrupted when the trace is started, and traced processes have unusual signal-handling semantics. So ioprofile starts strace in the background, waits the specified time, and then stops strace by sending it first a SIGINT (which is what CTRL-C at the terminal would send, and how strace is normally stopped when it's run interactively) and then a SIGTERM. It then sends a SIGCONT to the process that was being traced, because in some cases that process may be left in a stopped state after strace exits. It is not clear whether this is fully reliable and safe. If you know a better way to do this, please share your knowledge.
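The stop sequence described above can be sketched as follows. This is only an approximation of the idea, not ioprofile's actual code: the function names are invented, and the one-second grace period between SIGINT and SIGTERM is an assumption, since the text does not say how long the tool waits.

```python
import os
import signal
import subprocess
import time


def stop_tracer(tracer, grace=1.0):
    """Stop the tracer with SIGINT, falling back to SIGTERM if it lingers."""
    tracer.send_signal(signal.SIGINT)          # same signal CTRL-C delivers
    try:
        tracer.wait(timeout=grace)             # assumed grace period
    except subprocess.TimeoutExpired:
        tracer.send_signal(signal.SIGTERM)     # second, firmer request
        tracer.wait()
    return tracer.returncode


def profile(tracer_cmd, traced_pid, sleeptime):
    """Run tracer_cmd in the background for sleeptime seconds, then clean up."""
    tracer = subprocess.Popen(tracer_cmd)
    time.sleep(sleeptime)
    status = stop_tracer(tracer)
    try:
        # Wake the traced process in case it was left in a stopped state.
        os.kill(traced_pid, signal.SIGCONT)
    except ProcessLookupError:
        pass                                   # the traced process already exited
    return status
```

With a real tracer, tracer_cmd might look like ["strace", "-f", "-T", "-p", str(pid), "-o", keepfile]; that command line is a guess for illustration, because the exact flags ioprofile passes to strace are not stated here.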

After the trace is complete, ioprofile parses the output and converts it into an intermediate format that's easier to manipulate: one line per function call captured by strace, with information such as the process ID, size, elapsed time, and filename of the call. The filenames are initially gathered from lsof and correlated with file descriptor numbers. Files that the process opens later are correlated by reading the filename from the arguments to the open system call.
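A minimal sketch of that conversion is shown below. It assumes each strace line has the shape "PID call(args) = RETURN <ELAPSED>", which is what strace writes with -f, -T and -o; the record layout, the function name, and the sample lines are invented for illustration and may differ from what ioprofile does internally.

```python
import re

# Assumed line shape: PID call(args) = RETURN <ELAPSED>
CALL_LINE = re.compile(r"^(\d+)\s+(\w+)\((.*)\)\s+=\s+(-?\d+).*<(\d+\.\d+)>$")


def to_records(lines, fd_names):
    """Yield (pid, call, size, elapsed, filename) for each parsed call."""
    fd_names = dict(fd_names)                  # fd -> filename, seeded from lsof
    for line in lines:
        m = CALL_LINE.match(line.strip())
        if m is None:
            continue                           # skip signals, exits, split calls
        pid, call, args, ret, elapsed = m.groups()
        ret = int(ret)
        if call == "open":
            quoted = re.match(r'"([^"]*)"', args)
            filename = quoted.group(1) if quoted else "?"
            if quoted and ret >= 0:
                fd_names[ret] = filename       # open returns the new descriptor
            size = 0
        else:
            fd = re.match(r"\d+", args)
            filename = fd_names.get(int(fd.group()), "?") if fd else "?"
            # For read/write-style calls the return value is the byte count.
            size = ret if ret > 0 and re.search(r"read|write", call) else 0
        yield int(pid), call, size, float(elapsed), filename


# Made-up trace lines; fd 14 is assumed to be known from lsof already.
SAMPLE = [
    '22782 pread(14, ""..., 16384, 82345984) = 16384 <0.004532>',
    '10013 open("/data/data/ibdata1", O_RDWR) = 15 <0.000020>',
    '10013 pwrite(15, ""..., 512, 0) = 512 <0.000141>',
    '10013 +++ exited with 0 +++',
]

if __name__ == "__main__":
    for record in to_records(SAMPLE, {14: "/data/data/abd/9fvus.ibd"}):
        print(*record, sep="\t")
```

The pwrite on descriptor 15 resolves to ibdata1 only because the earlier open call was seen in the same trace, which is the correlation step the paragraph describes.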

After converting the information into an easy-to-process format, ioprofile passes it through an aggregator, which is roughly the equivalent of an SQL GROUP BY query. By default, it aggregates the calls by filename, with one column per function, and the sum of the elapsed time in the cells.
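The aggregation step can be pictured with the sketch below, which mirrors the -g, -c and -a options on a list of records. The dictionary-based record format and the function name are assumptions made for this example; the real tool also prints 'all' in a different layout than this sketch produces.

```python
from collections import defaultdict


def aggregate(records, groupby="filename", cell="times", func="sum"):
    """Pivot records into (group, {call: value, 'total': value}) rows."""
    key = {"filename": lambda r: r["filename"],
           "pid": lambda r: r["pid"],
           "all": lambda r: "TOTAL"}[groupby]
    value = {"times": lambda r: r["elapsed"],
             "sizes": lambda r: r["size"],
             "count": lambda r: 1}[cell]
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for r in records:
        for column in (r["call"], "total"):    # every call also feeds the total
            sums[key(r)][column] += value(r)
            counts[key(r)][column] += 1
    table = {}
    for group, columns in sums.items():
        table[group] = {
            column: (v / counts[group][column] if func == "avg" else v)
            for column, v in columns.items()
        }
    # Sort in descending order by the total column, like the tool's output.
    return sorted(table.items(), key=lambda row: row[1]["total"], reverse=True)


# Made-up records in the assumed intermediate format.
RECORDS = [
    {"pid": 1, "call": "pread", "size": 16384, "elapsed": 0.5, "filename": "a"},
    {"pid": 1, "call": "pread", "size": 16384, "elapsed": 0.25, "filename": "a"},
    {"pid": 2, "call": "pwrite", "size": 512, "elapsed": 0.125, "filename": "a"},
    {"pid": 2, "call": "write", "size": 100, "elapsed": 0.25, "filename": "b"},
]

if __name__ == "__main__":
    for group, columns in aggregate(RECORDS):
        print(group, columns)
```

Grouping first and choosing the cell value separately keeps the three options independent, which is why any combination of group, cell, and function can be requested.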

ioprofile aggregates only a specific list of I/O calls. This list is hard-coded into the tool, and is currently any call whose name matches the regular expression /read|write|sync|open|close|getdents|seek/. If this list needs to be expanded, please file a bug report.
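To see which system calls such a pattern selects, here is a small sketch that applies the same regular expression to a few call names. It assumes the pattern is applied unanchored, as the /.../ notation suggests; the helper name is invented for this example.

```python
import re

# The pattern quoted in the text, applied as an unanchored search (assumption).
IO_CALL = re.compile(r"read|write|sync|open|close|getdents|seek")


def is_io_call(name):
    """Return True if the system call name would be kept by the pattern."""
    return IO_CALL.search(name) is not None


if __name__ == "__main__":
    for name in ("pread", "pwrite", "fsync", "fdatasync", "lseek",
                 "getdents64", "readlink", "futex", "select", "mmap"):
        print(f"{name:12} {'kept' if is_io_call(name) else 'ignored'}")
```

Because the match is by substring, variants such as pread, fdatasync, and lseek are covered without being listed individually, and a name like readlink is kept as well.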

Example Usage

Here is an example of the tool's default output, on a sample file that you can find in the Subversion repository:

    $ ioprofile t/samples/ioprofile-001.txt
         total      pread       read     pwrite      write filename
     10.094264  10.094264   0.000000   0.000000   0.000000 /data/data/abd_2dia/aia_227_228.ibd
      8.356632   8.356632   0.000000   0.000000   0.000000 /data/data/abd_2dia/aia_227_223.ibd
      0.048850   0.046989   0.000000   0.001861   0.000000 /data/data/abd/aia_instances.ibd
      0.035016   0.031001   0.000000   0.004015   0.000000 /data/data/abd/vo_difuus.ibd
      0.013360   0.000000   0.001723   0.000000   0.011637 /var/log/mysql/mysql-relay.002113
      0.008676   0.000000   0.000000   0.000000   0.008676 /data/data/master.info
      0.002060   0.000000   0.000000   0.002060   0.000000 /data/data/ibdata1
      0.001490   0.000000   0.000000   0.001490   0.000000 /data/data/ib_logfile1
      0.000555   0.000000   0.000000   0.000000   0.000555 /var/log/mysql/mysql-relay-log.info
      0.000141   0.000000   0.000000   0.000141   0.000000 /data/data/ib_logfile0
      0.000100   0.000000   0.000000   0.000100   0.000000 /data/data/abd/9fvus.ibd

This output is sorted in descending order by the leftmost column. It should be fairly self-explanatory. Let's see a few different ways we can process the same dataset. Let's aggregate by count of operations instead of by elapsed time, so we can see how many times each function was executed on each file:

    $ ioprofile -c count t/samples/ioprofile-001.txt
         total      pread       read     pwrite      write filename
          4282       4282          0          0          0 /data/data/abd_2dia/aia_227_223.ibd
          2713       2713          0          0          0 /data/data/abd_2dia/aia_227_228.ibd
           390          0         47          0        343 /var/log/mysql/mysql-relay.002113
           343          0          0          0        343 /data/data/master.info
            30          8          0         22          0 /data/data/abd/vo_difuus.ibd
            19          7          0         12          0 /data/data/abd/aia_instances.ibd
            16          0          0         16          0 /data/data/ib_logfile1
            16          0          0          0         16 /var/log/mysql/mysql-relay-log.info
             6          0          0          6          0 /data/data/ibdata1
             1          0          0          1          0 /data/data/ib_logfile0
             1          0          0          1          0 /data/data/abd/9fvus.ibd

Interesting that the #1 time consumer isn't the #1 in terms of number of operations, isn't it? We can re-examine the times, showing the average time per call instead of the sum of times:

    $ ioprofile -a avg t/samples/ioprofile-001.txt
         total      pread       read     pwrite      write filename
      0.003721   0.003721   0.000000   0.000000   0.000000 /data/data/abd_2dia/aia_227_228.ibd
      0.002571   0.006713   0.000000   0.000155   0.000000 /data/data/abd/aia_instances.ibd
      0.001952   0.001952   0.000000   0.000000   0.000000 /data/data/abd_2dia/aia_227_223.ibd
      0.001167   0.003875   0.000000   0.000182   0.000000 /data/data/abd/vo_difuus.ibd
      0.000343   0.000000   0.000000   0.000343   0.000000 /data/data/ibdata1
      0.000141   0.000000   0.000000   0.000141   0.000000 /data/data/ib_logfile0
      0.000100   0.000000   0.000000   0.000100   0.000000 /data/data/abd/9fvus.ibd
      0.000093   0.000000   0.000000   0.000093   0.000000 /data/data/ib_logfile1
      0.000035   0.000000   0.000000   0.000000   0.000035 /var/log/mysql/mysql-relay-log.info
      0.000034   0.000000   0.000037   0.000000   0.000034 /var/log/mysql/mysql-relay.002113
      0.000025   0.000000   0.000000   0.000000   0.000025 /data/data/master.info
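The averages can be cross-checked against the default output and the -c count output: each average is the summed time divided by the number of calls. The short sketch below redoes that division for three files, using totals and counts copied from those two outputs; the helper name is made up for this example.

```python
def avg_per_call(total_seconds, calls):
    """Average time per call, formatted with six decimals like the tool's output."""
    return f"{total_seconds / calls:.6f}"


# (summed time, number of calls) copied from the default and -c count outputs.
TOTALS = {
    "/data/data/abd_2dia/aia_227_228.ibd": (10.094264, 2713),
    "/data/data/abd_2dia/aia_227_223.ibd": (8.356632, 4282),
    "/data/data/abd/aia_instances.ibd": (0.048850, 19),
}

if __name__ == "__main__":
    for name, (seconds, calls) in TOTALS.items():
        print(avg_per_call(seconds, calls), name)
    # prints 0.003721, 0.001952 and 0.002571, matching the total column above
```

This also explains the reordering: aia_227_223.ibd received more calls than aia_227_228.ibd, but each call was faster on average, so it accumulated less total time.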

Another way to aggregate the data is to look at the size of the operations (in bytes), rather than the elapsed time:

    $ ioprofile -c sizes t/samples/ioprofile-001.txt
         total      pread       read     pwrite      write filename
      90800128   90800128          0          0          0 /data/data/abd_2dia/aia_227_223.ibd
      52150272   52150272          0          0          0 /data/data/abd_2dia/aia_227_228.ibd
        999424          0          0     999424          0 /data/data/ibdata1
        638976     131072          0     507904          0 /data/data/abd/vo_difuus.ibd
        327680     114688          0     212992          0 /data/data/abd/aia_instances.ibd
        305263          0     149662          0     155601 /var/log/mysql/mysql-relay.002113
        217088          0          0     217088          0 /data/data/ib_logfile1
         22638          0          0          0      22638 /data/data/master.info
         16384          0          0      16384          0 /data/data/abd/9fvus.ibd
          1088          0          0          0       1088 /var/log/mysql/mysql-relay-log.info
           512          0          0        512          0 /data/data/ib_logfile0

It's also possible to report by process ID (thread ID), instead of by filename:

    $ ioprofile -g pid t/samples/ioprofile-001.txt
         total      pread       read     pwrite      write pid
      9.580759   9.580759   0.000000   0.000000   0.000000 22782
      8.187935   8.187935   0.000000   0.000000   0.000000 20974
      0.300581   0.300581   0.000000   0.000000   0.000000 2370
      0.181209   0.181209   0.000000   0.000000   0.000000 2369
      0.088197   0.088197   0.000000   0.000000   0.000000 2366
      0.081061   0.077990   0.001723   0.000793   0.000555 10013
      0.038928   0.038928   0.000000   0.000000   0.000000 2373
      0.036679   0.036679   0.000000   0.000000   0.000000 2372
      0.020577   0.020577   0.000000   0.000000   0.000000 2371
      0.020313   0.000000   0.000000   0.000000   0.020313 10012
      0.010502   0.010502   0.000000   0.000000   0.000000 2368
      0.005529   0.005529   0.000000   0.000000   0.000000 2367
      0.002172   0.000000   0.000000   0.002172   0.000000 2375
      0.002020   0.000000   0.000000   0.002020   0.000000 2374
      0.001923   0.000000   0.000000   0.001923   0.000000 2385
      0.001636   0.000000   0.000000   0.001636   0.000000 2377
      0.000982   0.000000   0.000000   0.000982   0.000000 2378
      0.000141   0.000000   0.000000   0.000141   0.000000 2365

Finally, you can aggregate by the entire dataset, so you simply get the function calls and the desired statistic, not broken out by filename or thread ID:

    $ ioprofile -g all t/samples/ioprofile-001.txt
     18.561144 TOTAL
     18.528886 pread
      0.020868 write
      0.009667 pwrite
      0.001723 read
