(Long read) Linux I/O: redirection, buffers, how buffering is implemented, a simple shell with redirection, and standard output vs. standard error

每天少点debug

Last modified 2023-05-26 18:51:37

Tags: linux, ops, servers, c++, programming languages

First published 2023-05-26 18:13:09

Article link: https://blog.csdn.net/cxy_zjt/article/details/130855852

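Summary: This article looks in detail at how Linux allocates file descriptors and at how buffers improve efficiency. Example code shows how redirection affects output, and in particular the part the buffer plays in it. Buffering is either unbuffered, line buffered or fully buffered, and a buffer is also flushed when the process exits or when certain functions are called. The article then presents code that simulates the buffer, and discusses redirecting standard output and standard error, including how to redirect both to the same file correctly.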

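File descriptor allocation rules

The rule for allocating a file descriptor:
scan the array fd_array[] from the beginning, find the smallest index that is not in use, and assign it to the new file.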

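#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
  close(0); // free descriptor 0 (stdin)
  int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if(fd < 0) {
    perror("open");
    return 1;
  }

  // format the string and write it to the given stream
  fprintf(stdout, "open succeeded, fd: %d\n", fd);
  close(fd);
  return 0;
}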

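As shown above, if we close file descriptor 0 first, then opening "log.txt" gets descriptor 0 from the system.
Now a second example.
If we close fd 1 instead, descriptor 1 is handed to the new file log.txt.
We would then expect cat log.txt to show the message: the stdout that pointed at the display was closed at the start, so log.txt takes over descriptor 1 and everything printed to stdout should land in the file.
But after running the program, cat log.txt shows nothing. If we add fflush(stdout) to the source before close(fd), cat log.txt does print the message.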

Why is that?
To answer it we first need to understand redirection and buffers, which come next.

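Redirection

Output redirection makes descriptor 1 refer to an opened file instead of the display, for example with dup2(fd, 1).
With that in mind, append redirection is easy to understand: it only changes the way the file is opened (O_APPEND instead of O_TRUNC).

Input redirection is simply dup2(fd, 0).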

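Buffers

1. What is a buffer

A buffer is, in essence, just a block of memory.

It frees up the time of the process that uses it: the process hands its data to the buffer and carries on.
Flushing is then handled in batches, which reduces the number of I/O operations and so improves the efficiency of the whole machine.

2. Where is the buffer

First, some code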

#include<stdio.h>
#include<sys/types.h>
#include<sys/stat.h>
#include<fcntl.h>
#include<unistd.h>
#include<string.h>
int main()
{
  printf("hello printf"); // printf writes to stdout, i.e. file descriptor 1
  const char *msg = "hello write";
  write(1, msg, strlen(msg)); // write msg straight to standard output (fd 1)

  sleep(5);
  return 0;
}

Running it, hello write appears first, and hello printf only appears five seconds later.

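// same headers as in the previous example
int main()
{
  printf("hello printf"); // printf writes to stdout, i.e. file descriptor 1
  fprintf(stdout, "hello fprintf");
  fputs("hello fputs", stdout);
  const char *msg = "hello write";

  write(1, msg, strlen(msg)); // write msg straight to standard output (fd 1)

  sleep(5);
  return 0;
}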

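With the code changed as above, the behaviour is the same as before: hello write is printed first, and the output of printf, fprintf and fputs follows later.
Why?
This shows that a buffer exists. printf, fprintf and fputs all end up calling the write system call, and the direct write appeared immediately, so the buffer cannot be inside write.
The three C library functions have one thing in common: they all write to stdout.
stdout is a FILE *. FILE is a struct which, besides the file descriptor, also holds a language-level buffer.

So the buffer is a language-level buffer, maintained by the C library.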
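When is it flushed?
Normal cases

Unbuffered (flushed immediately)
Line buffered (flushed line by line; used for the display)
Fully buffered (flushed when the buffer is full); used for disk files

Special cases

Process exit: the C library flushes its buffers on exit
Explicit flush by the user with fflush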

Redirection

Question: what happens if the fd is closed before the flush?
First, look at redirection without closing the fd

// same headers as in the previous examples
int main()
{
  int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
  if(fd < 0) {
    perror("open");
    return 1;
  }
  dup2(fd, 1); // descriptor 1 now refers to log.txt

  printf("hello printf"); // printf writes to stdout, i.e. file descriptor 1
  fprintf(stdout, "hello fprintf");
  fputs("hello fputs", stdout);
  const char *msg = "hello write";
  write(1, msg, strlen(msg));
  return 0;
}

If we close the descriptor before the flush:

  printf("hello printf"); // printf writes to stdout, i.e. file descriptor 1
  fprintf(stdout, "hello fprintf");
  fputs("hello fputs", stdout);
  const char *msg = "hello write";
  write(1, msg, strlen(msg));
  close(1); // closed before the C library has flushed stdout

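Now log.txt contains only what write wrote. This confirms the earlier conclusion once more: the C file functions keep their data in a buffer and only later call write on the stream's fd. If that fd has been closed by then, the buffered data can no longer reach the file.
The question from the beginning now makes sense: with standard output redirected to log.txt, closing the fd first means the flush at exit has nowhere to go.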

Here is another piece of code

// same headers as in the previous examples
int main()
{
  const char *str1 = "hello printf\n";
  const char *str2 = "hello fprintf\n";
  const char *str3 = "hello fputs\n";
  const char *str4 = "hello write\n";
  // C library functions
  printf("%s", str1);
  fprintf(stdout, "%s", str2);
  fputs(str3, stdout);
  // system call
  write(1, str4, strlen(str4));
  // everything above has run; now fork
  fork();
  return 0;
}
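Run directly in the terminal, the code above prints each message once.
Once redirection is added, the output has 7 lines: each C library call is printed twice, while write is printed only once in every case. Why?
The fork at the end of the code creates a child process. Parent and child share the code, and at first they share the data too. Because the output is redirected to log.txt, which is a disk file, stdout is fully buffered: it is flushed only when the buffer fills up or when the process exits. At the moment of the fork the three C library messages are therefore still sitting in the buffer. Whichever process flushes first modifies that buffer and triggers copy-on-write, so the parent flushes one copy of the data and the child flushes another. That is where the two copies come from. write goes straight to the kernel without passing through this buffer, so its message appears only once.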

Simulating the buffer in source code

#include<stdio.h>
#include<unistd.h>
#include<stdlib.h>
#include<string.h>
#include<sys/types.h>
#include<sys/stat.h>
#include<fcntl.h>
#include<assert.h>

#define NUM 1024

#define NONE_FLUSH 0x0 // no flushing
#define LINE_FLUSH 0x1 // flush on every line
#define FULL_FLUSH  0x2 // flush only when the buffer is full



typedef struct _MyFILE {
  int _fileno;
  char _buffer[NUM];
  int _end;
  int _flags; // flush mode
}MyFILE;

MyFILE *my_open(const char *filename, const char *method)
{
  assert(filename);
  assert(method);
  int flag = O_RDONLY;
  if(strcmp(method, "r") == 0) {
    flag = O_RDONLY;
  } else if(strcmp(method, "r+") == 0) {
    flag = O_RDWR;
  } else if(strcmp(method, "w") == 0) {
    flag = O_WRONLY | O_CREAT | O_TRUNC;
  } else if(strcmp(method, "w+") == 0) {
    flag = O_RDWR | O_CREAT | O_TRUNC;
  } else if(strcmp(method, "a") == 0) {
    flag = O_WRONLY | O_CREAT | O_APPEND;
  } else if(strcmp(method, "a+") == 0) {
    flag = O_RDWR | O_CREAT | O_APPEND;
  } else {
    fprintf(stderr, "my_open: unknown mode \"%s\"\n", method);
    return NULL;
  }
  int fd = open(filename, flag, 0666); // named fd so it does not shadow fileno()
  if(fd < 0) {
    return NULL;
  }
  MyFILE *fp = (MyFILE*)malloc(sizeof(MyFILE));
  if(fp == NULL) {
    close(fd);
    return fp;
  }
  memset(fp, 0, sizeof(MyFILE));
  fp->_fileno = fd;
  fp->_flags |= LINE_FLUSH; // line buffering by default
  fp->_end = 0;
  return fp;

}
void my_fflush(MyFILE *fp)
{
  assert(fp);
  if(fp->_end > 0) {
    write(fp->_fileno, fp->_buffer, fp->_end); // user-level buffer -> kernel
    fp->_end = 0;
    fsync(fp->_fileno); // kernel -> disk, for this file only

  }
}
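void my_fwrite(MyFILE *fp, const char *start, int len)
{
  assert(fp);
  assert(start);
  assert(len > 0);

  // not enough room left: flush what is already buffered
  if(fp->_end + len > NUM) {
    my_fflush(fp);
  }
  // larger than the whole buffer: hand it straight to the kernel
  if(len > NUM) {
    write(fp->_fileno, start, len);
    return;
  }
  // copy into the buffer (memcpy, because the data need not be a C string)
  memcpy(fp->_buffer + fp->_end, start, len);
  fp->_end += len;

  if(fp->_flags & LINE_FLUSH) {
    // line buffering: flush once the buffered data ends with '\n'
    if(fp->_buffer[fp->_end-1] == '\n') {
      my_fflush(fp);
    }
  } else if(fp->_flags & FULL_FLUSH) {
    // full buffering: flush only when the buffer is full
    if(fp->_end == NUM) {
      my_fflush(fp);
    }
  }
  // NONE_FLUSH (0x0): nothing is flushed here

}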


void my_fclose(MyFILE *fp)
{
  assert(fp);
  my_fflush(fp); // flush whatever is still buffered
  close(fp->_fileno);
  free(fp);
}

int main()
{
  MyFILE *fp = my_open("log.txt", "w");
  if(fp == NULL)
  {
    perror("my_open");
    return 1;
  }

 // const char *s = "hello zjt\n";

 // my_fwrite(fp, s, strlen(s));
 // printf("message flushed immediately\n");
 // sleep(3);
 //
 // const char *ss = "hello zhang";
 // my_fwrite(fp, ss, strlen(ss));
 // printf("wrote a string that does not meet the flush condition\n");
 // sleep(3);

 // const char *sss = "hello jun";
 // my_fwrite(fp, sss, strlen(sss));
 // printf("wrote a string that does not meet the flush condition\n");
 // my_fflush(fp);
 const char *s = "bbbbb-";
 my_fwrite(fp, s, strlen(s));
 printf("wrote a string that does not meet the flush condition\n");
 //fork();

 my_fclose(fp);

  return 0;
}

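With the fork() in the last few lines left commented out, log.txt ends up with the string once.
With the fork() uncommented, it ends up there twice.

Our string contains no \n, so my_fwrite does not flush it. After fork() creates the child, parent and child share code and data. Whichever of them calls my_fclose() first flushes the buffer, and that write to the shared data triggers copy-on-write for the other process. So the parent flushes one copy of the data and the child flushes another.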

A simple shell with redirection

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
#include <ctype.h>

// #define BUG 1
#define SEP " "
#define NUM 1024
#define SIZE 128

#define DROP_SPACE(s)   \
  do                    \
  {                     \
    while (isspace(*s)) \
      s++;              \
  } while (0)

char command_line[NUM];
char *command_args[SIZE];

char env_buffer[NUM]; // for testing

#define NONE_REDIR -1
#define INPUT_REDIR 0
#define OUTPUT_REDIR 1
#define APPEND_REDIR 2

int g_redir_flag = NONE_REDIR;
char *g_redir_filename = NULL;

extern char **environ;
void CheckDir(char *commands)
{
  assert(commands);
  char *start = commands;
  char *end = commands + strlen(commands);

  while (start < end)
  {
    if (*start == '>')
    {
      // either output redirection or append redirection
      if (*(start + 1) == '>')
      {
        // append redirection
        // ls -a -l >> log.txt
        *start = '\0';
        start += 2;
        g_redir_flag = APPEND_REDIR;
        DROP_SPACE(start); // skip any spaces the user typed
        g_redir_filename = start;
        break;
      }
      else
      {
        // ls -a -l > log.txt: output redirection
        *start = '\0';
        start++;
        DROP_SPACE(start);
        g_redir_flag = OUTPUT_REDIR;
        g_redir_filename = start;
        break;
      }
    }
    else if (*start == '<')
    {
      // input redirection
      *start = '\0';
      start++;
      DROP_SPACE(start);
      g_redir_filename = start;
      g_redir_flag = INPUT_REDIR;
      break;
    }
    else
    {
      start++;
    }
  }
}
int ChangDir(char *newdir)
{
  return chdir(newdir);
}
int PutEnvMyshell(char *newenv)
{
  return putenv(newenv); // export the environment variable
}
int main()
{
  // a shell is essentially an endless loop
  while (1)
  {
    g_redir_flag = NONE_REDIR; // reset on every iteration
    g_redir_filename = NULL;

    // 1. print the prompt
    printf("[zjt@127.0.0.1 current dir]# ");
    fflush(stdout);
    // 2. read the user's input
    memset(command_line, '\0', sizeof(command_line));
    // keyboard, i.e. stdin; the result is a C string ending in '\0'
    if (fgets(command_line, NUM, stdin) == NULL)
      break; // EOF or read error: leave the shell
    command_line[strcspn(command_line, "\n")] = '\0'; // strip the trailing \n

    CheckDir(command_line); // look for a redirection

    // 3. split the string
    command_args[0] = strtok(command_line, SEP);
    if (command_args[0] == NULL) // empty line
      continue;
    int index = 1;
    // add colour to the ls command
    if (strcmp(command_args[0], "ls") == 0)
    {
      command_args[index++] = (char *)"--color=auto";
    }
    // strtok returns the start of the next token on success
    // and NULL when there are no more tokens
    while ((command_args[index++] = strtok(NULL, SEP)))
      ;

#ifdef BUG
    for (int i = 0; command_args[i] != NULL; i++)
    {
      printf("%d : %s\n", i, command_args[i]);
    }
#endif

    // built-in commands
    if (strcmp(command_args[0], "cd") == 0 && command_args[1] != NULL)
    {
      ChangDir(command_args[1]); // the caller, i.e. the parent process, changes directory
      continue;
    }
    if (strcmp(command_args[0], "export") == 0 && command_args[1] != NULL)
    {
      // the variable currently lives in command_line, which gets cleared,
      // so we keep our own copy of it
      strcpy(env_buffer, command_args[1]);
      PutEnvMyshell(env_buffer);
      continue;
    }
    // create a process to run the command
    pid_t id = fork();
    if (id < 0)
    {
      perror("fork");
      continue;
    }
    if (id == 0)
    {
      int fd = -1;
      switch (g_redir_flag)
      {
      case NONE_REDIR:
        break;
      case INPUT_REDIR:
        fd = open(g_redir_filename, O_RDONLY);
        break;
      case OUTPUT_REDIR:
        fd = open(g_redir_filename, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        break;
      case APPEND_REDIR:
        fd = open(g_redir_filename, O_WRONLY | O_CREAT | O_APPEND, 0666);
        break;
      default:
        fprintf(stderr, "Bug?\n");
        exit(1);
      }
      if (g_redir_flag != NONE_REDIR)
      {
        if (fd < 0)
        {
          perror("open");
          exit(1);
        }
        // input redirection replaces fd 0, the other two replace fd 1
        dup2(fd, g_redir_flag == INPUT_REDIR ? 0 : 1);
        close(fd);
      }
      // child
      // replace the program image
      execvp(command_args[0], command_args);
      exit(1); // reaching this line means the exec failed
    }
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if (ret > 0)
    {
      printf("command finished! sig: %d, code: %d\n", status & 0x7F, (status >> 8) & 0xFF);
    }
  }
  return 0;
}

Standard output and standard error

#include <cstdio>
#include <iostream>
int main()
{
    // stdout
    printf("hello printf 1\n");
    fprintf(stdout, "hello fprintf 1\n");
    fputs("hello fputs 1\n", stdout);

    // stderr
    fprintf(stderr, "hello fprintf 2\n");
    fputs("hello fputs 2\n", stderr);
    perror("hello perror 2");

    // cout
    std::cout << "hello cout 1" << std::endl;

    // cerr
    std::cerr << "hello cerr 2" << std::endl;
    return 0;
}

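Start with the code above. Run directly, everything is printed to the screen.
That is as expected. But when we redirect the output to a file with >,

we find that not all of it goes into the file: the lines written to stderr still show up on the screen.

If we instead give each stream its own target, the output is split across two separate files.
Why?
Because a plain redirection only redirects stdout, fd = 1, to the file. To redirect standard error as well, it has to be written out explicitly.
So the full form of the redirection above is
./a.out 1>stdout.txt 2>stderr.txt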

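What is the point of doing this?
It lets us tell a program's ordinary output apart from its errors.
Can standard output and standard error be redirected into the same file, then? Yes. Redirect standard output to the file first, then make descriptor 2 a copy of descriptor 1. In the shell this is usually written as, for example, ./a.out > log.txt 2>&1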
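Notice that the perror above printed Success after our message. Why?
perror is also a library function. Internally it reads the value of errno and prints the matching error description; the string we pass in is printed in front of that description. Here no call had failed, so errno was still 0, and the description for 0 is Success.
What is errno?
errno is a global variable that records why the most recent failed C library call failed.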

We can implement perror ourselves:

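#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

void my_perror(const char *info)
{
  fprintf(stderr, "%s: %s\n", info, strerror(errno));
}
int main()
{

  int fd = open("log.txt", O_RDONLY); // fails as long as log.txt does not exist
  if(fd < 0)
  {
   // perror("open");
    my_perror("open");
    return 1;
  }
  return 0;
}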
That is the end of this long post. I work mainly with C++ and Linux, and I am happy to exchange ideas.
