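Preface
After finishing 《五花八门客户问题(BUG) - 用好strace2》 ("All Kinds of Customer Problems (BUG) - Making Good Use of strace 2"), I suddenly wondered how many bytes the cp command copies at a time. strace is a convenient way to find out.
Experiment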
[mzhai@qasdevmvasmin02 tmp]$ ls -l output_file
-rw-r-----. 1 mzhai abp 1095319552 Dec 10 02:07 output_file
[root@qasdevmvasmin02 tmp]# strace cp output_file output_file2
...
read(3, "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1"..., 131072) = 131072
write(4, "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1"..., 131072) = 131072
...
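The trace shows a long run of read and write calls. Every one of them transfers 131072 (0x20000) bytes, except the last.
So cp copies 0x20000 bytes per call until the whole file has been copied.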
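The loop that strace reveals can be sketched in a few lines of Python. This is only an illustration of the read/write pattern, not cp's actual implementation; the function name copy_in_blocks and the returned write count are my own additions for the example.

```python
import os

BLOCK_SIZE = 128 * 1024  # 131072 bytes, the size seen in the strace output


def copy_in_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src_path to dst_path with a read/write loop.

    Returns the number of write calls issued (hypothetical helper,
    only meant to mirror what the strace output shows).
    """
    writes = 0
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src_fd, block_size)
                if not chunk:  # read() returned 0 bytes: end of file
                    break
                view = memoryview(chunk)
                while view:  # write() may accept fewer bytes than requested
                    written = os.write(dst_fd, view)
                    view = view[written:]
                    writes += 1
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return writes
```

For a regular file, every read returns a full block except the last one, which is why only the final read/write pair in the trace has a different size.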
[mzhai]$ strace -c cp output_file output_file2
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.15    0.496419          59      8357           write
 41.42    0.394261          47      8376           read
[mzhai]$ ls -l output_file
-rw-r-----. 1 mzhai abp 1095319552 Dec 10 02:07 output_file
[mzhai@qasdevmvasmin02 tmp]$ python
Python 3.6.8 (default, Jun 22 2023, 07:44:04)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1095319552/131072
8356.625
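The -c summary reports 8357 write calls and 8376 read calls. At 131072 bytes per call, 8356 full blocks plus one final partial block makes 8357 writes, which matches the file size. The few extra reads are presumably ones outside the copy loop, such as the final read that returns 0 at end of file.
This value is actually hard-coded in the cp source: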
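The same check can be done with integer arithmetic, which also gives the size of the last, shorter block. The two constants come from the ls -l and strace output above; the variable names are just for this example.

```python
FILE_SIZE = 1095319552   # bytes, from `ls -l output_file`
BLOCK_SIZE = 131072      # bytes per read/write, from the strace output

# Number of full blocks and the size of the trailing partial block
full_blocks, last_block = divmod(FILE_SIZE, BLOCK_SIZE)
total_writes = full_blocks + (1 if last_block else 0)

print(full_blocks)   # 8356
print(last_block)    # 81920
print(total_writes)  # 8357
```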
// https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ioblksize.h
enum { IO_BUFSIZE = 128 * 1024 };
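Speeding up the copy
Can the number of bytes copied per call be tuned?
I looked through cp's options and found no such parameter, so there is no way to pass a value larger than 128*1024.
There is, however, a command called buffer that can do this: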
BUFFER(1) General Commands Manual BUFFER(1)
NAME
buffer - very fast reblocking program
SYNTAX
buffer [-S size] [-b blocks] [-s size] [-z size] [-m size] [-p percentage] [-u microseconds] [-B] [-t] [-Z] [-i filename] [-o filename] [-d]
OPTIONS
-i filename
Use the given file as the input file. The default is stdin.
-o filename
Use the given file as the output file. The default is stdout.
-S size
After every chunk of this size has been written, print out how much has been written so far. Also prints the
total throughput. By default this is not set.
-s size
Size in bytes of each block. The default blocksize is 10k to match the normal output of the tar(1) program.
-z size
Combines the -S and -s flags.
-b blocks
Number of blocks to allocate to shared memory circular buffer. Defaults to the number required to fill up
the shared memory requested.
In actual testing, though, it made no noticeable difference:
mzhai$ time cp output_file output_file2
real 0m3.217s
user 0m0.000s
sys 0m1.042s
mzhai$ time cp output_file output_file2
real 0m2.904s
user 0m0.000s
sys 0m1.027s
mzhai$ time cp output_file output_file2
real 0m2.087s
user 0m0.000s
sys 0m0.898s
mzhai$ time cp output_file output_file2
real 0m2.135s
user 0m0.004s
sys 0m0.919s
mzhai$ time cp output_file output_file2
real 0m1.669s
user 0m0.000s
sys 0m0.881s
mzhai$ time cp output_file output_file2
real 0m1.618s
user 0m0.004s
sys 0m0.851s
mzhai$ time cp output_file output_file2
real 0m1.615s
user 0m0.000s
sys 0m0.861s
mzhai$ time cp output_file output_file2
real 0m1.643s
user 0m0.008s
sys 0m0.850s
mzhai$ time cp output_file output_file2
real 0m1.567s
user 0m0.004s
sys 0m0.856s
mzhai$ time cp output_file output_file2
real 0m1.623s
user 0m0.009s
sys 0m0.837s
mzhai$ time cp output_file output_file2
real 0m1.622s
user 0m0.000s
sys 0m0.850s
mzhai$ buffer -s 128k -m 1M < input_file > output_file
^C
mzhai$ time cp output_file outpu^C
mzhai$ buffer -s 128k -m 1M < output_file > output_file2
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2
real 0m1.678s
user 0m0.012s
sys 0m0.893s
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2
real 0m1.690s
user 0m0.000s
sys 0m0.908s
mzhai$ time buffer -s 128k -m 1M < output_file > output_file2
real 0m1.749s
user 0m0.012s
sys 0m1.011s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2
real 0m1.538s
user 0m0.008s
sys 0m0.802s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2
real 0m1.722s
user 0m0.000s
sys 0m0.972s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2
real 0m1.570s
user 0m0.000s
sys 0m0.826s
mzhai$ time buffer -s 512k -m 1M < output_file > output_file2
real 0m1.613s
user 0m0.000s
sys 0m0.914s
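Maybe cp with its 128K buffer is already fast enough (other people's blog posts have data on this), or maybe the amount of data copied here is too small to show the benefit of buffer. Something to look into later.
Building cp from source
It took me quite a bit of effort to build coreutils on Ubuntu. Running strace on the result shows that the latest cp calls copy_file_range directly: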
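To put numbers on "no difference", here is a small sketch that averages the "real" times from the runs above. One assumption is mine: I treat the first four cp runs (3.217s, 2.904s, 2.087s, 2.135s) as warm-up and leave them out, since they were clearly slower than the later ones.

```python
from statistics import mean

# "real" times in seconds, copied from the transcript above
timings = {
    "cp (last seven runs)": [1.669, 1.618, 1.615, 1.643, 1.567, 1.623, 1.622],
    "buffer -s 128k": [1.678, 1.690, 1.749],
    "buffer -s 512k": [1.538, 1.722, 1.570, 1.613],
}

# Average each group of runs
averages = {name: mean(samples) for name, samples in timings.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}s average over {len(timings[name])} runs")
```

All three averages land within roughly a tenth of a second of each other, and with this few samples that is well inside the run-to-run noise.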
mzhai:/dev/coreutils$ strace ./src/cp ~/output_file ~/output_file2
...
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
uname({sysname="Linux", nodename="qasdevmvasmin03", ...}) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 1073741824
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = 0
close(4) = 0
close(3) = 0
...
mzhai:/dev/coreutils$ man copy_file_range
COPY_FILE_RANGE(2) Linux Programmer's Manual COPY_FILE_RANGE(2)
NAME
copy_file_range - Copy a range of data from one file to another
SYNOPSIS
#define _GNU_SOURCE
#include <unistd.h>
ssize_t copy_file_range(int fd_in, loff_t *off_in,
int fd_out, loff_t *off_out,
size_t len, unsigned int flags);
DESCRIPTION
The copy_file_range() system call performs an in-kernel copy between two file descriptors without the additional
cost of transferring data from the kernel to user space and then back into the kernel. It copies up to len bytes
of data from the source file descriptor fd_in to the target file descriptor fd_out, overwriting any data that
exists within the requested range of the target file.
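Python exposes the same system call as os.copy_file_range (Python 3.8 or later on Linux, so not the 3.6.8 interpreter used earlier). Below is a minimal sketch of an in-kernel copy with a read/write fallback. The function name copy_file and the fallback behaviour are my own choices for illustration and are not taken from the cp source.

```python
import os

BLOCK_SIZE = 128 * 1024  # fallback block size, same value as IO_BUFSIZE


def copy_file(src_path, dst_path):
    """Copy src_path to dst_path, preferring copy_file_range.

    Returns the number of bytes copied (hypothetical helper for illustration).
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            copied = 0
            size = os.fstat(src_fd).st_size
            if hasattr(os, "copy_file_range"):
                try:
                    while copied < size:
                        # Offsets are left as None, so both file positions advance
                        n = os.copy_file_range(src_fd, dst_fd, size - copied)
                        if n == 0:  # nothing more to copy
                            break
                        copied += n
                except OSError:
                    pass  # kernel or filesystem refused; finish with read/write
            # Fallback loop; it does nothing if the data was already copied,
            # because the source position is then at end of file.
            while True:
                chunk = os.read(src_fd, BLOCK_SIZE)
                if not chunk:
                    break
                view = memoryview(chunk)
                while view:  # handle short writes
                    view = view[os.write(dst_fd, view):]
                copied += len(chunk)
            return copied
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Note that this sketch passes the remaining file size as the length, whereas the cp trace above passes a very large length and simply loops until the call returns 0.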
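Comparing performance, the new cp is only slightly faster (about 1.55s versus 1.58s of real time on average over the runs below):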
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2
real 0m1.620s
user 0m0.000s
sys 0m0.864s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2
real 0m1.560s
user 0m0.000s
sys 0m0.784s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2
real 0m1.492s
user 0m0.000s
sys 0m0.744s
mzhai:/dev/coreutils$ time ./src/cp ~/output_file ~/output_file2
real 0m1.526s
user 0m0.000s
sys 0m0.742s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2
real 0m1.643s
user 0m0.000s
sys 0m0.906s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2
real 0m1.600s
user 0m0.000s
sys 0m0.851s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2
real 0m1.564s
user 0m0.000s
sys 0m0.853s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2
real 0m1.580s
user 0m0.000s
sys 0m0.848s
mzhai:/dev/coreutils$ time cp ~/output_file ~/output_file2
real 0m1.533s
user 0m0.000s
sys 0m0.835s