Linux Pipe Performance

Latest recommended article published on 2024-07-17 20:41:49

杨宗喜


Tags: linux, operations, server, notes, learning

Article link: https://blog.csdn.net/qq_33992705/article/details/136484430


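1. How big is a pipe?

Test system: Ubuntu 22.04

1.1 Writes of at most PIPE_BUF bytes are guaranteed to be atomic

If several processes write to the same pipe and each single write is no larger than PIPE_BUF
bytes, the data from different writes is guaranteed not to be interleaved. On Linux, PIPE_BUF is 4096.¹

The value can be checked with ulimit -a: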

> ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 127365
max locked memory           (kbytes, -l) 4085264
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8       # 512 * 8 = 4096
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 127365
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

1.2 Checking the default pipe capacity

We can look it up with man 7 pipe (note: not plain man pipe, which shows less information):

In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with a page size of 4096 bytes). Since Linux 2.6.35, the default pipe capacity is 16 pages, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations.

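In short: the capacity of a pipe used to equal the system page size (the concept of a "page" is covered in operating systems courses), but it is now 16 pages. With 4096-byte pages, the default pipe capacity on Linux is 4096 * 16 = 65536 bytes. The corresponding setting in the source is in
/usr/src/linux-headers-6.5.0-21-generic/include/linux/pipe_fs_i.h (the exact kernel minor version may differ); it can also be found in the kernel source tree under /include/linux/pipe_fs_i.h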
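We can write a short program to check whether this description is correct. The program creates a pipe and only writes to it. If the write completes, write returns and printf runs; otherwise the program blocks.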

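#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("<usage>: %s size\n", argv[0]);
        return 1;
    }
    long len = atol(argv[1]);
    int pipefd[2];
    if (pipe(pipefd) == -1)
    {
        perror("pipe");
        return 1;
    }
    // calloc zero-fills the buffer, so no uninitialized bytes are written
    char *buf = (char*)calloc(len, sizeof(char));
    if (buf == NULL)
    {
        perror("calloc");
        return 1;
    }
    // Nobody reads from the pipe, so this call blocks forever
    // once len exceeds the pipe capacity
    if (write(pipefd[1], buf, len) == -1)
        perror("write");
    printf("finished\n");
    free(buf);
    return 0;
}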

When run, writing 65536 bytes completes normally, while writing 65537 bytes blocks.
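
The capacity can also be measured without hanging the program. The sketch below is an alternative to the blocking test, not the method used in this article: it puts the write end into non-blocking mode and counts how many bytes the pipe accepts before write reports EAGAIN. The helper name measure_pipe_capacity and the 512-byte chunk size are choices made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fills a new pipe in non-blocking mode and returns the
 * number of bytes accepted before write() reports EAGAIN, or -1 on error. */
long measure_pipe_capacity(void)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return -1;

    /* Make the write end non-blocking so a full pipe yields EAGAIN
     * instead of putting the process to sleep. */
    int flags = fcntl(pipefd[1], F_GETFL);
    if (flags == -1 || fcntl(pipefd[1], F_SETFL, flags | O_NONBLOCK) == -1)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* 512 bytes is no larger than PIPE_BUF, so each write is all-or-nothing */
    char chunk[512] = {0};
    long total = 0;
    for (;;)
    {
        ssize_t n = write(pipefd[1], chunk, sizeof(chunk));
        if (n == -1)
        {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            if (errno != EAGAIN)
                total = -1;        /* a real failure, not "pipe full" */
            break;
        }
        total += n;
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```

On a system with 4096-byte pages and the default 16-page pipe, this is expected to return 65536, matching the blocking test.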

1.3 Checking the maximum pipe capacity

zongxiyang@ubuntu22-04 ~ $ cat /proc/sys/fs/pipe-max-size 
1048576

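As the pipe(7) man page quoted above mentions, the capacity of a pipe can be changed at run time with fcntl(2). For an unprivileged process the upper limit is the value in /proc/sys/fs/pipe-max-size, which is 1 MiB here.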
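
A minimal sketch of resizing a pipe with the F_SETPIPE_SZ and F_GETPIPE_SZ operations named in the man page. The helper name resized_pipe_capacity is made up for this example, and the fallback #define values are the Linux constants, included only in case _GNU_SOURCE was not in effect when <fcntl.h> was first processed.

```c
#define _GNU_SOURCE   /* F_SETPIPE_SZ and F_GETPIPE_SZ are Linux-specific */
#include <fcntl.h>
#include <unistd.h>

/* Fallback for builds where the GNU extensions were not enabled in time */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/* Hypothetical helper: creates a pipe, asks the kernel to resize it to at
 * least `requested` bytes, and returns the resulting capacity (-1 on error). */
int resized_pipe_capacity(int requested)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return -1;

    int capacity = -1;
    /* The kernel may round the request up; an unprivileged request above
     * /proc/sys/fs/pipe-max-size fails. */
    if (fcntl(pipefd[1], F_SETPIPE_SZ, requested) != -1)
        capacity = fcntl(pipefd[1], F_GETPIPE_SZ);

    close(pipefd[0]);
    close(pipefd[1]);
    return capacity;
}
```

For example, resized_pipe_capacity(1 << 17) asks for 128 KiB and should report a capacity of at least that much.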

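2. Measuring pipe performance

2.1 Ping-Pong Game

Now let's write the following program. The parent creates two pipes and then forks a child. The parent closes the read end of pipe1 and the write end of pipe2; the child closes the write end of pipe1 and the read end of pipe2. The parent writes to pipe1 and reads from pipe2; the child reads from pipe1 and writes to pipe2. One such round trip counts as one ping-pong. Counting how many ping-pongs complete in one second gives a rough throughput, equal to count * unit_data. (Is this approach correct?)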

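#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
static atomic_ulong cnt = 0; // ping-pong rounds completed in the current second
long len = 0;                // bytes per read/write
int total = 0;               // number of one-second samples to take
void* func_callback(void* arg)
{
    (void)arg;
    int idx = 0;
    double sum = 0;
    while (idx ++ < total)
    {
        sleep(1);
        // read the counter and reset it in one atomic step
        unsigned long rounds = atomic_exchange(&cnt, 0);
        double unit_data = (double)rounds * len / 1024 / 1024;
        sum += unit_data;
        printf("%2d.bytes per second: %.3lf MiB\n", idx, unit_data);
    }
    printf("average throughput: %.3lf MiB\n", sum / total);
    return NULL;
}

int main(int argc, char *argv[])
{
    if (argc != 3)
    {
        printf("<usage>: %s block size, total count\n", argv[0]);
        return 1;
    }
    len = atol(argv[1]);
    total = atoi(argv[2]);
    pid_t pid;
    int pipe1[2], pipe2[2];
    if (pipe(pipe1) == -1 || pipe(pipe2) == -1)
    {
        perror("pipe");
        return 1;
    }
    if ((pid = fork()) == 0) // child
    {
        close(pipe1[1]);
        close(pipe2[0]);
        char buf[len];
        while (1)
        {
            // return values are deliberately ignored: even after a short
            // read the child still writes a full len bytes
            read(pipe1[0], buf, len);
            write(pipe2[1], buf, len);
        }
    }
    else // parent
    {
        close(pipe1[0]);
        close(pipe2[1]);
        // zero-filled so no uninitialized bytes are written
        char *buf = (char*)calloc(len, sizeof(char));
        if (buf == NULL)
        {
            perror("calloc");
            return 1;
        }
        pthread_t tid;
        pthread_create(&tid, NULL, func_callback, NULL); // statistics thread
        pthread_detach(tid);
        while (1) // runs until the program is interrupted
        {
            write(pipe1[1], buf, len);
            read(pipe2[0], buf, len);
            atomic_fetch_add(&cnt, 1);
        }
        free(buf);
    }
    return 0;
}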

The data below are the averages of 10 measurements for each case, in MiB/s.

Data size per operation (bytes)	Average throughput (MiB/s)
1024	329.584
2048	588.649
4096	1032.530
8192	1436.310
16384	1992.053
32768	2466.316
65536	3203.719
65537	0.2
131072	0.013

[Figure: line chart of average throughput (MiB/s) against data size per operation (bytes)]
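Once the data size per operation goes beyond 65536 bytes, throughput drops sharply. In fact only the first read returns some data, and then the two processes deadlock. While the program is running, the process state can be checked with ps aux | grep procname; the result is as follows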

> ps aux | grep app
zongxiy+ 1764641  0.0  0.0  76512  1792 pts/2    S+   09:51   0:00 ./app 65537 10
zongxiy+ 1764642  0.0  0.0   2648   128 pts/2    S+   09:51   0:00 ./app 65537 10

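Column 8 shows the process state: S means sleeping and + means the process is in the foreground process group. In the /proc/pid directory, the stack file shows that the process is blocked in the pipe_write call (read the call chain from bottom to top); the child process is blocked in the same way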

> sudo cat stack
[<0>] pipe_write+0x50f/0x710                                                                          
[<0>] vfs_write+0x394/0x440                                                                           
[<0>] ksys_write+0xc9/0x100                                                                           
[<0>] __x64_sys_write+0x19/0x30                                                                       
[<0>] do_syscall_64+0x58/0x90                                                                         
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

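With blocking I/O, a single write larger than the buffer size blocks until the other side reads. Here both processes end up blocked in write at the same time, each waiting for the other to read, so neither makes progress. Sockets behave the same way: without an agreed message format (framing), a blocking-mode socket blocks when a write exceeds the size of the TCP buffer.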

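2.1.1 Analysis of the results

This method actually underestimates pipe performance. After process A writes to pipe1, it has to wait for process B to read the data into a user-space buffer and copy it from that buffer into pipe2 before A can read it back. What we measure is the amount of data one process moves through read and write per second, so the result comes out low. We improve the measurement as follows

write.c writes a fixed amount of data to STDOUT_FILENO
read.c reads a fixed amount of data from STDIN_FILENO
The two are connected with a shell pipe (|)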

2.2 Improved code

The code is as follows
write.c

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
const size_t buffer_size = 1 << 16; // 64KiB
#define ITERATION 40960
// Write exactly buffer_size bytes, looping over partial writes
void write_n(int fd, char* buf, size_t buffer_size)
{
    size_t remaining = buffer_size;
    while (remaining > 0)
    {
        ssize_t written = write(fd, buf + (buffer_size - remaining), remaining);
        if (written == -1)
        {
            if (errno == EINTR)
                continue; // interrupted by a signal: retry
            perror("write");
            exit(1);
        }
        remaining -= written;
    }
}

int main(int argc, char *argv[])
{
    char *buf = (char*)malloc(sizeof(char) * buffer_size);
    if (buf == NULL)
    {
        perror("malloc");
        return 1;
    }
    memset(buf, 'Z', buffer_size);
    int cnt = 0;
    while (cnt ++ < ITERATION)
    {
        write_n(STDOUT_FILENO, buf, buffer_size);
    }
    free(buf);
    return 0;
}

read.c

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/time.h>
const size_t buffer_size = 1 << 16; // 64KiB
#define ITERATION 40960
// Read exactly buffer_size bytes, looping over partial reads
void read_n(int fd, char* buf, size_t buffer_size)
{
    size_t remaining = buffer_size;
    while (remaining > 0)
    {
        ssize_t readn = read(fd, buf + (buffer_size - remaining), remaining);
        if (readn == -1)
        {
            if (errno == EINTR)
                continue; // interrupted by a signal: retry
            perror("read");
            exit(1);
        }
        if (readn == 0) // the writer closed the pipe early
        {
            fprintf(stderr, "unexpected EOF\n");
            exit(1);
        }
        remaining -= readn;
    }
}

int main(int argc, char *argv[])
{
    char *buf = (char*)malloc(sizeof(char) * buffer_size);
    if (buf == NULL)
    {
        perror("malloc");
        return 1;
    }
    // Total data in GiB; computed as a double because integer division
    // would truncate 2.5 GiB to 2
    double sum = (double)buffer_size * ITERATION / 1024 / 1024 / 1024;
    struct timeval start, end;
    gettimeofday(&start, NULL);
    int cnt = 0;
    while (cnt ++ < ITERATION)
    {
        read_n(STDIN_FILENO, buf, buffer_size);
    }
    gettimeofday(&end, NULL);
    double start_time = start.tv_sec + start.tv_usec / 1e6;
    double end_time = end.tv_sec + end.tv_usec / 1e6;
    printf("%.3lf GiB/s\n", sum / (end_time - start_time));
    free(buf);
    return 0;
}

After compiling, run it 5 times; the results are as follows

> for i in {1..5}; do ./write | ./read; done
4.405 GiB/s
4.255 GiB/s
4.148 GiB/s
4.241 GiB/s
4.235 GiB/s

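Is this result the answer? Here each write is a fixed 64 KiB and the loop runs 40960 times (2.5 GiB in total). Let's also try 4096 (0.25 GiB), 409600 (25 GiB), and 4096000 (250 GiB) iterations. The results are as follows

Data volume (GiB)	Throughput (GiB/s)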
0.25	4.826
2.5	4.257
25	5.448
250	5.385

[Figure: line chart of throughput (GiB/s) against data volume (GiB)]

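Detailed figures would need a lot more testing. The differences above may be measurement error, may depend on the amount of data, or may be caused by other processes on the system competing for resources. The data are for reference only; discussion in the comments is welcome.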
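
The data volumes quoted above follow from buffer size × iteration count. The sketch below works through that arithmetic and shows why it should be done in floating point: integer division truncates 2.5 GiB to 2 and 0.25 GiB to 0, which would distort any throughput derived from it. Both function names are made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: total data volume in GiB, computed in floating point */
double total_gib(size_t buffer_size, size_t iterations)
{
    return (double)buffer_size * (double)iterations / (1024.0 * 1024.0 * 1024.0);
}

/* The same computation in integer arithmetic, which truncates toward zero */
size_t total_gib_truncated(size_t buffer_size, size_t iterations)
{
    return buffer_size * iterations / 1024 / 1024 / 1024;
}
```

With a 64 KiB buffer, total_gib(65536, 40960) is 2.5 while total_gib_truncated(65536, 40960) is 2.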

2.3 The pv tool

pv (pipe viewer) monitors the data flowing through a pipe. We can apply it directly to the ./write program; the result is as follows

> ./write | pv > /dev/null
250GiB 0:00:31 [7.99GiB/s] [

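This measurement is higher. My guess is that pv measures the amount of data flowing into or out of the pipe. Our first method involves two reads and two writes per unit of data, the second involves one read and one write, and the third involves one read or one write.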

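3. Close the unused end of a pipe promptly

3.1 Why close the write end?

When using a pipe, the unused end should be closed promptly. Suppose process A reads from the pipe: it should close the pipe's write end. Then, when the other processes finish writing and close their write ends, A sees EOF on the read end. Otherwise A eventually blocks in read forever (because A itself still holds the write end open, even though it never uses it).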
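
A minimal single-process sketch of this EOF behaviour. The helper name read_after_writers_closed is made up for this example: it writes two bytes, closes the only write end, drains the data, and returns what the next read reports.

```c
#include <unistd.h>

/* Hypothetical helper: returns the result of read() on a drained pipe whose
 * write ends are all closed (expected 0, meaning EOF); -1 if setup fails. */
ssize_t read_after_writers_closed(void)
{
    int pipefd[2];
    char buf[8];
    if (pipe(pipefd) == -1)
        return -1;
    if (write(pipefd[1], "hi", 2) != 2)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]);  /* no write end remains open anywhere */

    /* The first read returns the 2 buffered bytes */
    ssize_t first = read(pipefd[0], buf, sizeof(buf));
    /* The second read finds an empty pipe with no writers and returns 0 */
    ssize_t second = (first == 2) ? read(pipefd[0], buf, sizeof(buf)) : -1;
    close(pipefd[0]);
    return second;
}
```

If the close(pipefd[1]) line were removed, the second read would block instead of returning 0, which is the situation described above.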

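3.2 Why close the read end?

When a process tries to write to a pipe for which no process holds an open read descriptor, the kernel sends the writing process a SIGPIPE signal (the reverse does not happen: the read side simply receives EOF). By default this signal kills the process. A process can catch or ignore the signal, in which case the write on the pipe fails with the error EPIPE (broken pipe). Receiving SIGPIPE or getting EPIPE is a useful indication of the pipe's state, which is why unused read descriptors of a pipe should be closed.¹
The test code for the SIGPIPE signal is as follows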

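#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

void sigpipe_handler(int sig)
{
    (void)sig;
    // printf is not async-signal-safe, so use write(2) inside the handler
    const char msg[] = "SIGPIPE\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("<usage>: %s size\n", argv[0]);
        return 1;
    }
    signal(SIGPIPE, sigpipe_handler); // register the signal handler
    long len = atol(argv[1]);
    int pipefd[2];
    if (pipe(pipefd) == -1)
    {
        perror("pipe");
        return 1;
    }
    // zero-filled so no uninitialized bytes are written
    char *buf = (char*)calloc(len, sizeof(char));
    if (buf == NULL)
    {
        perror("calloc");
        return 1;
    }
    close(pipefd[0]); // close the read end of the pipe
    // No reader is left: SIGPIPE is delivered and, because it is caught,
    // write then fails with EPIPE
    if (write(pipefd[1], buf, len) == -1)
        perror("write");
    printf("finished\n");
    free(buf);
    return 0;
}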
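
The other option mentioned above is to ignore the signal and rely on the error code. The following is a minimal sketch, with a helper name made up for this example: it sets SIGPIPE to SIG_IGN, writes to a pipe whose read end is closed, and returns the errno reported by write.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno from writing to a pipe that has no
 * reader (expected EPIPE), 0 if the write unexpectedly succeeds, and -1 if
 * setup fails. Note that SIGPIPE stays ignored for the rest of the process. */
int errno_after_write_without_reader(void)
{
    int pipefd[2];
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(pipefd) == -1)
        return -1;
    close(pipefd[0]);  /* close the only read end */

    int err = 0;
    /* With SIGPIPE ignored, the failure surfaces as a return value of -1 */
    if (write(pipefd[1], "x", 1) == -1)
        err = errno;
    close(pipefd[1]);
    return err;
}
```

This style lets the writer handle the broken pipe at the call site instead of in a signal handler.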