Using strace to find out how many bytes cp copies at a time

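This post uses strace to observe the system calls cp makes while copying a file. It finds that cp copies 131072 bytes at a time by default, and that the cp source code fixes IO_BUFSIZE at 128 KB. An attempt to speed things up with the buffer command shows no clear benefit. Finally, a newer cp built from source turns out to use copy_file_range and to be slightly faster.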


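Preface

After finishing my previous post, "All Kinds of Customer Problems (Bugs): Making Good Use of strace", the question "how many bytes does cp copy at a time?" suddenly occurred to me. strace is a convenient way to find out.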

Experiment

[mzhai@qasdevmvasmin02 tmp]$ ls -l output_file
-rw-r-----. 1 mzhai abp 1095319552 Dec 10 02:07 output_file
[root@qasdevmvasmin02 tmp]# strace cp output_file output_file2
...
read(3, "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1"..., 131072) = 131072
write(4, "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1"..., 131072) = 131072
...

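The trace shows a large number of read and write calls. Every one of them transfers 131072 (0x20000) bytes, except the last, which transfers whatever is left.

So cp copies 0x20000 bytes at a time until the copy is complete.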
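The read/write pairs in the trace correspond to an ordinary userspace copy loop. Below is a minimal Python sketch of that pattern, assuming a fixed 128 KiB buffer like the one seen above. The function name `copy_in_chunks` is made up for illustration, and this is not cp's actual code.

```python
import os

BUF_SIZE = 128 * 1024  # 131072 bytes, the size seen in the strace output


def copy_in_chunks(src_path, dst_path, buf_size=BUF_SIZE):
    """Copy src_path to dst_path with one read() per chunk.

    Returns the number of non-empty chunks read from the source.
    """
    chunks = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                data = os.read(src, buf_size)  # returns at most buf_size bytes
                if not data:  # an empty result means end of file
                    break
                chunks += 1
                view = memoryview(data)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst, view)
                    view = view[written:]
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return chunks
```

Running this under strace should show the same alternating read/write pattern, with the buffer size appearing as the third argument of each read.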

[mzhai]$ strace -c cp output_file output_file2
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.15    0.496419          59      8357           write
 41.42    0.394261          47      8376           read
[mzhai]$ ls -l output_file
-rw-r-----. 1 mzhai abp 1095319552 Dec 10 02:07 output_file
[mzhai@qasdevmvasmin02 tmp]$ python
Python 3.6.8 (default, Jun 22 2023, 07:44:04)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1095319552/131072
8356.625

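The -c summary counts 8357 write calls at 131072 bytes each, which matches the file size: 8356 full chunks plus one final partial chunk. The read count is slightly higher at 8376, presumably because it also includes reads made during program startup and the final read that returns 0 at end of file.

This value is in fact hard-coded in the cp source: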
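The arithmetic behind that match can be spelled out. The short sketch below only uses the file size and chunk size shown above; the helper name `expected_calls` is made up for illustration.

```python
FILE_SIZE = 1095319552  # bytes, from `ls -l output_file`
BUF_SIZE = 131072       # bytes per read/write seen in the trace


def expected_calls(file_size, buf_size):
    """Return (number of data-carrying calls, size of the final call)."""
    full, remainder = divmod(file_size, buf_size)
    calls = full + (1 if remainder else 0)
    # If the size is an exact multiple, the last call is a full chunk.
    last = remainder if remainder else min(buf_size, file_size)
    return calls, last


print(expected_calls(FILE_SIZE, BUF_SIZE))  # (8357, 81920)
```

So 8356 full chunks plus one final chunk of 81920 bytes gives the 8357 writes reported by strace -c.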

// https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ioblksize.h
enum { IO_BUFSIZE = 128 * 1024 };

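Speeding up the copy

Can the number of bytes per copy be tuned?

I looked through cp's options and found no such parameter, so there is no way to pass a value larger than 128*1024.

However, there is a command called buffer that can serve this purpose: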

BUFFER(1)                                        General Commands Manual                                        BUFFER(1)

NAME
       buffer - very fast reblocking program

SYNTAX
       buffer  [-S  size]  [-b blocks] [-s size] [-z size] [-m size] [-p percentage] [-u microseconds] [-B] [-t] [-Z] [-i
       filename] [-o filename] [-d]

OPTIONS
       -i filename
            Use the given file as the input file.  The default is stdin.

       -o filename
            Use the given file as the output file.  The default is stdout.

       -S size
            After every chunk of this size has been written, print out how much has been written so far. Also prints  the
            total throughput.  By default this is not set.

       -s size
            Size in bytes of each block.  The default blocksize is 10k to match the normal output of the tar(1) program.

       -z size
            Combines the -S and -s flags.

       -b blocks
            Number  of  blocks  to allocate to shared memory circular buffer.  Defaults to the number required to fill up
            the shared memory requested.

In actual testing, however, it made no difference:

mzhai$ time cp output_file  output_file2

real    0m3.217s
user    0m0.000s
sys     0m1.042s
mzhai$ time cp output_file  output_file2

real    0m2.904s
user    0m0.000s
sys     0m1.027s
mzhai$ time cp output_file  output_file2

real    0m2.087s
user    0m0.000s
sys     0m0.898s
mzhai$ time cp output_file  output_file2

real    0m2.135s
user    0m0.004s
sys     0m0.919s
mzhai$ time cp output_file  output_file2

real    0m1.669s
user    0m0.000s
sys     0m0.881s
mzhai$ time cp output_file  output_file2

real    0m1.618s
user    0m0.004s
sys     0m0.851s
mzhai$ time cp output_file  output_file2

real    0m1.615s
user    0m0.000s
sys     0m0.861s
mzhai$ time cp output_file  output_file2

real    0m1.643s
user    0m0.008s
sys     0m0.850s
mzhai$ time cp output_file  output_file2

real    0m1.567s
user    0m0.004s
sys     0m0.856s
mzhai$ time cp output_file  output_file2

real    0m1.623s
user    0m0.009s
sys     0m0.837s
mzhai$ time cp output_file  output_file2

real    0m1.622s
user    0m0.000s
sys     0m0.850s
mzhai$ buffer -s 128k -m 1M < input_file > output_file
^C
mzhai$ time cp output_file  outpu^C
mzhai$ buffer -s 128k -m 1M < output_file > output_file2
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2

real    0m1.678s
user    0m0.012s
sys     0m0.893s
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2

real    0m1.690s
user    0m0.000s
sys     0m0.908s
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2

real    0m1.749s
user    0m0.012s
sys     0m1.011s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2

real    0m1.538s
user    0m0.008s
sys     0m0.802s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2

real    0m1.722s
user    0m0.000s
sys     0m0.972s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2

real    0m1.570s
user    0m0.000s
sys     0m0.826s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2

real    0m1.613s
user    0m0.000s
sys     0m0.914s

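Maybe cp's 128K buffer is already fast enough (other people's blogs have data on this), or maybe the amount of data copied was too small to show what buffer can do? I will look into it later.

Building cp from source

It took me quite some effort to build coreutils on Ubuntu. I then ran strace on the result and found that the latest cp calls copy_file_range directly: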
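One way to follow up on this question is to time the same read/write loop with several buffer sizes. The sketch below is only an illustration under assumptions: the 16 MiB test file, the list of buffer sizes, and the name `timed_copy` are all made up, and results depend heavily on the page cache and the storage underneath. Taking the best of several runs reduces the warm-up effect that also seems visible in the first few cp timings above.

```python
import os
import tempfile
import time


def timed_copy(src_path, dst_path, buf_size):
    """Copy with a read/write loop of buf_size bytes; return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 gives raw file objects, so each read/write is one syscall.
    with open(src_path, "rb", buffering=0) as src, \
            open(dst_path, "wb", buffering=0) as dst:
        while True:
            data = src.read(buf_size)
            if not data:  # end of file
                break
            view = memoryview(data)
            while view:  # a raw write may be partial
                view = view[dst.write(view):]
    return time.perf_counter() - start


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # hypothetical test size; use a larger file for real tests
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(b"\x01" * size)
        for kib in (16, 128, 512, 4096):
            best = min(timed_copy(src, dst, kib * 1024) for _ in range(3))
            print(f"{kib:>5} KiB buffer: {best:.3f} s")
```

If the timings flatten out beyond 128 KiB, that would be consistent with the buffer results above; no particular outcome is claimed here.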

mzhai:/dev/coreutils$ strace ./src/cp ~/output_file ~/output_file2
...
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
uname({sysname="Linux", nodename="qasdevmvasmin03", ...}) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 1073741824
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 0
close(4)                                = 0
close(3)                                = 0
...

mzhai:/dev/coreutils$ man copy_file_range 
COPY_FILE_RANGE(2)                              Linux Programmer's Manual                              COPY_FILE_RANGE(2)

NAME
       copy_file_range - Copy a range of data from one file to another

SYNOPSIS
       #define _GNU_SOURCE
       #include <unistd.h>

       ssize_t copy_file_range(int fd_in, loff_t *off_in,
                               int fd_out, loff_t *off_out,
                               size_t len, unsigned int flags);

DESCRIPTION
       The  copy_file_range()  system call performs an in-kernel copy between two file descriptors without the additional
       cost of transferring data from the kernel to user space and then back into the kernel.  It copies up to len  bytes
       of  data from the source file descriptor fd_in to the target file descriptor fd_out, overwriting any data that ex‐
       ists within the requested range of the target file.
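The same system call can be tried from Python, which exposes it as `os.copy_file_range` (Python 3.8 or later, Linux only). The sketch below is an illustration, not cp's implementation: the function name `copy_with_cfr`, the 1 GiB request size, and the fallback to a 128 KiB read/write loop are all assumptions made for this example.

```python
import os

CHUNK = 1 << 30  # request up to 1 GiB per call; arbitrary value for this sketch


def _write_all(fd, data):
    """Write every byte of data, since os.write may write only part of it."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def copy_with_cfr(src_path, dst_path):
    """Copy src_path to dst_path; return the byte count of each copy step."""
    counts = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            try:
                while True:
                    # In-kernel copy: the data never enters this process.
                    n = os.copy_file_range(src, dst, CHUNK)
                    if n == 0:  # 0 means end of the source file
                        break
                    counts.append(n)
            except (AttributeError, OSError):
                # Call missing or refused: restart with a read/write loop.
                counts.clear()
                os.lseek(src, 0, os.SEEK_SET)
                os.lseek(dst, 0, os.SEEK_SET)
                os.ftruncate(dst, 0)
                while True:
                    data = os.read(src, 128 * 1024)
                    if not data:
                        break
                    _write_all(dst, data)
                    counts.append(len(data))
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return counts
```

When the in-kernel path is taken, the returned list is typically very short, similar to the trace above where one call copies the data and a second call returns 0.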

Comparing performance, the new cp is just slightly faster:

mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2

real    0m1.620s
user    0m0.000s
sys     0m0.864s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2

real    0m1.560s
user    0m0.000s
sys     0m0.784s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2

real    0m1.492s
user    0m0.000s
sys     0m0.744s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2

real    0m1.526s
user    0m0.000s
sys     0m0.742s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2

real    0m1.643s
user    0m0.000s
sys     0m0.906s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2

real    0m1.600s
user    0m0.000s
sys     0m0.851s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2

real    0m1.564s
user    0m0.000s
sys     0m0.853s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2

real    0m1.580s
user    0m0.000s
sys     0m0.848s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2

real    0m1.533s
user    0m0.000s
sys     0m0.835s
mzhai:/dev/coreutils$
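To put a number on "slightly faster", the timings above can be averaged. The sketch below uses only the real and sys values printed in this section; the helper name `percent_lower` is made up, and with so few runs the result is only a rough indication.

```python
from statistics import mean

# "real" and "sys" seconds copied from the runs above
new_cp = {
    "real": [1.620, 1.560, 1.492, 1.526],
    "sys": [0.864, 0.784, 0.744, 0.742],
}
system_cp = {
    "real": [1.643, 1.600, 1.564, 1.580, 1.533],
    "sys": [0.906, 0.851, 0.853, 0.848, 0.835],
}


def percent_lower(old_values, new_values):
    """Return how much lower the mean of new_values is, in percent."""
    old_avg, new_avg = mean(old_values), mean(new_values)
    return (old_avg - new_avg) / old_avg * 100


for key in ("real", "sys"):
    saving = percent_lower(system_cp[key], new_cp[key])
    print(f"{key}: {saving:.1f}% lower with the newly built cp")
```

On these numbers the wall-clock time is roughly 2% lower and the sys time roughly 9% lower, which fits the idea that copy_file_range mainly saves kernel/user transfer work.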
