Standard I/O - CSDN Blog

Article link: https://blog.csdn.net/u014787464/article/details/42087475

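Standard I/O is built around streams.

A stream's orientation determines whether characters are read and written as single bytes or as multibyte (wide) characters. A newly created stream has no orientation.

freopen(3) clears a stream's orientation; fwide(3) sets it, but only if the stream is not already oriented.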
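
To make the orientation rules concrete, here is a minimal sketch that queries a stream with fwide(3). The helper name stream_orientation is made up for this illustration; passing 0 as the mode only queries the orientation and never changes it.

```c
#include <stdio.h>
#include <wchar.h>

/* Report a stream's orientation without changing it:
 * -1 = byte oriented, 0 = no orientation yet, 1 = wide oriented.
 * (stream_orientation is a hypothetical helper name.) */
int stream_orientation(FILE *fp)
{
    int w = fwide(fp, 0);       /* mode 0 only queries the orientation */
    return (w > 0) - (w < 0);   /* normalize to -1, 0 or 1 */
}
```

A freshly opened stream should report 0. After the first byte-oriented call such as fputs it should report -1, and after fwide(fp, 1) on an unoriented stream it should report 1.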

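The purpose of standard I/O is to reduce the number of read and write calls by buffering each I/O stream automatically.

To change the buffering type, use setbuf(3) or setvbuf(3).

Both must be called after the stream has been opened and before any other operation is performed on it.

setbuf turns buffering on or off. To turn it on, buf must point to a buffer of BUFSIZ bytes (defined in stdio.h); to turn it off, pass NULL.

The mode argument of setvbuf selects the buffering type:

_IOFBF fully buffered

_IOLBF line buffered

_IONBF unbuffered

setbuf(3) is therefore equivalent to this setvbuf(3) call: setvbuf(fp, buf, buf ? _IOFBF : _IONBF, BUFSIZ);

setlinebuf(3) is equivalent to: setvbuf(fp, NULL, _IOLBF, 0);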

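Standard I/O provides three types of buffering:

I: Fully buffered

Actual I/O takes place only when the buffer fills up or when the stream is flushed with fflush(3).

The term flush describes the writing of a standard I/O buffer. It has two meanings:

1: In the standard I/O library, flush means writing out the contents of a buffer, for example to a file on disk.

2: In the terminal driver, flush means discarding the data already stored in a buffer.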
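
The effect of full buffering and fflush(3) can be observed by comparing what the program has written with what the kernel has actually received. The sketch below assumes a POSIX system (it relies on fileno and fstat); the helper names kernel_file_size and use_full_buffering are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Size of the underlying file as the kernel sees it.
 * This bypasses the stdio buffer. Returns -1 on error. */
long kernel_file_size(FILE *fp)
{
    struct stat st;

    if (fstat(fileno(fp), &st) < 0)
        return -1;
    return (long)st.st_size;
}

/* Switch a freshly opened stream to full buffering, using a
 * buffer supplied by the caller. Returns 0 on success. */
int use_full_buffering(FILE *fp, char *buf, size_t size)
{
    return setvbuf(fp, buf, _IOFBF, size);
}
```

With a BUFSIZ-byte buffer, writing a short string such as "hello" should leave the file size at 0 until fflush(fp) is called, after which the size should be 5.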

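II: Line buffered

Line buffering comes with two caveats:

1: The buffer that standard I/O uses to collect each line has a fixed size, so I/O takes place as soon as the buffer fills, even if no newline has been written yet.

2: Whenever input is requested through the standard I/O library from either (a) an unbuffered stream or (b) a line-buffered stream that requires data to be requested from the kernel, all line-buffered output streams are flushed.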
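
As a rough illustration of line buffering, the sketch below puts a stream into _IOLBF mode and checks when data reaches the kernel. It assumes a POSIX system, and the exact flush timing is implementation behavior (observed on glibc); the helper names bytes_written_out and use_line_buffering are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Bytes that have reached the kernel for this stream's file, or -1 on error. */
long bytes_written_out(FILE *fp)
{
    struct stat st;

    if (fstat(fileno(fp), &st) < 0)
        return -1;
    return (long)st.st_size;
}

/* Put a freshly opened stream into line-buffered mode and let the
 * library allocate the buffer. Returns 0 on success. */
int use_line_buffering(FILE *fp)
{
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ);
}
```

After use_line_buffering(fp), writing "abc" should leave the data in the buffer, while a following "def\n" should push all seven bytes out because it ends with a newline.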

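III: Unbuffered

The standard error stream stderr is normally unbuffered so that error messages are displayed as quickly as possible.

ISO C requires the following buffering characteristics:

I: Standard input and standard output are fully buffered if and only if they do not refer to an interactive device.

II: Standard error is never fully buffered.

Most implementations default to the following:

I: stderr is always unbuffered.

II: All other streams are line buffered if they refer to a terminal device; otherwise they are fully buffered.

After a stream has been opened, setbuf or setvbuf can change its buffering type.

fflush flushes a stream; a NULL argument flushes all output streams.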

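Opening a stream:

fopen(3) freopen(3) fdopen(3)

freopen clears the stream's orientation. It is typically used to open a specified file as one of the predefined streams: stdin, stdout, or stderr.

fdopen takes an existing file descriptor (obtained from open, dup, dup2, fcntl, pipe, socket, socketpair, or accept) and associates a standard I/O stream with it. It is often used with descriptors returned by the functions that create pipes and network communication channels. Because these special types of files cannot be opened with fopen, we must first call the device-specific function to obtain a file descriptor and then use fdopen to associate a standard I/O stream with that fd.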
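
As a small sketch of the fdopen case described above, the function below creates a pipe, wraps both ends in standard I/O streams, and sends one line through it. The name pipe_roundtrip is made up for this illustration, and the line is assumed to be short enough to fit in the pipe without blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Send one short line through a pipe using stdio on both ends.
 * Returns 0 on success, -1 on failure.
 * (pipe_roundtrip is a hypothetical helper name.) */
int pipe_roundtrip(const char *line, char *out, int outsize)
{
    int fds[2];
    FILE *rd, *wr;

    if (pipe(fds) < 0)
        return -1;

    /* fopen cannot open a pipe, so wrap the descriptors with fdopen */
    if ((rd = fdopen(fds[0], "r")) == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if ((wr = fdopen(fds[1], "w")) == NULL) {
        fclose(rd);             /* also closes fds[0] */
        close(fds[1]);
        return -1;
    }

    fputs(line, wr);
    fclose(wr);                 /* flushes the buffer and closes the write end */

    if (fgets(out, outsize, rd) == NULL) {
        fclose(rd);
        return -1;
    }
    fclose(rd);                 /* also closes the read end of the pipe */
    return 0;
}
```

Note that fclose on a stream created by fdopen also closes the underlying descriptor, so the descriptors are not closed a second time.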

The following program takes two file names as arguments and uses freopen to bind stdin and stdout to them:

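#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BUFSIZE 4096

int main(int argc, char *argv[])
{
    char buf[BUFSIZE];
    ssize_t n;          /* read returns ssize_t, not int */
    FILE *fp1, *fp2;

    if (argc != 3) {
        /* errno is not set here, so print a usage message instead of calling perror */
        fprintf(stderr, "usage: %s <infile> <outfile>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* rebind stdin to the input file; "r" is enough because we only read from it */
    if ((fp1 = freopen(argv[1], "r", stdin)) == NULL) {
        perror("freopen stdin error");
        exit(EXIT_FAILURE);
    }

    /* rebind stdout to the output file, creating or truncating it */
    if ((fp2 = freopen(argv[2], "w", stdout)) == NULL) {
        perror("freopen stdout error");
        exit(EXIT_FAILURE);
    }

    /* copy through the descriptors that now sit underneath the two streams */
    while ((n = read(fileno(fp1), buf, BUFSIZE)) > 0) {
        if (write(fileno(fp2), buf, n) != n) {
            perror("write error");
            exit(EXIT_FAILURE);
        }
    }
    if (n < 0) {
        perror("read error");
        exit(EXIT_FAILURE);
    }

    fclose(fp1);
    fclose(fp2);
    return EXIT_SUCCESS;
}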

Reading and writing a stream:

Reading one character at a time: getc fgetc getchar

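#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int c;  /* int, not char, so that EOF can be told apart from valid bytes */

    /* copy stdin to stdout one character at a time with getc and putc */
    while ((c = getc(stdin)) != EOF) {
        if (putc(c, stdout) == EOF) {
            perror("putc error");
            return EXIT_FAILURE;
        }
    }

    /* getc returns EOF on both end of file and error; ferror tells them apart */
    if (ferror(stdin)) {
        perror("stream error");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int c;

    /* same copy loop, using fgetc and fputc, which are guaranteed to be functions */
    while ((c = fgetc(stdin)) != EOF) {
        if (fputc(c, stdout) == EOF) {
            perror("fputc error");
            return EXIT_FAILURE;
        }
    }

    if (ferror(stdin)) {
        perror("stdin error");
        return EXIT_FAILURE;
    }

    return EXIT_SUCCESS;
}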

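To distinguish an error from end of file: ferror(3) feof(3)

In most implementations, two flags are maintained for each stream in the FILE object: an error flag and an end-of-file flag.

clearerr(3) clears both flags.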
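
The following sketch shows how the two flags are typically consulted after a read loop. The helper name drain_stream is made up for this illustration.

```c
#include <stdio.h>

/* Read a stream to the end and report why reading stopped:
 * 0 = end of file, -1 = read error.
 * (drain_stream is a hypothetical helper name.) */
int drain_stream(FILE *fp)
{
    while (getc(fp) != EOF)
        ;                       /* discard the data */

    /* EOF alone is ambiguous, so check the error flag */
    return ferror(fp) ? -1 : 0;
}
```

After drain_stream returns 0, feof(fp) is nonzero; calling clearerr(fp) resets both flags so that feof(fp) and ferror(fp) return 0 again.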

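Line-at-a-time I/O:

fgets(3) fputs(3) (fgets keeps the trailing newline in the buffer and fputs does not append one, so we must deal with the newline at the end of each line ourselves!)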

#include <stdio.h>
#include <stdlib.h>

#ifndef MAXLINE
#define MAXLINE 4096
#endif

int main(void)
{
    char buf[MAXLINE];

    while (fgets(buf, MAXLINE, stdin) != NULL)
        if (fputs(buf, stdout) == EOF)
            perror("fputs error");

    if (ferror(stdin))
        perror("input error");

    return EXIT_SUCCESS;
}
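
Because fgets keeps the newline, a common follow-up step is to strip it before using the line. Here is one possible sketch; the helper name read_line_chomped is made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Read one line with fgets and remove the trailing newline, if any.
 * Returns buf, or NULL at end of file or on error.
 * (read_line_chomped is a hypothetical helper name.) */
char *read_line_chomped(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';     /* cut the string at the first newline */
    return buf;
}
```

strcspn returns the length of the string when no newline is present, so the assignment is also safe for a last line that lacks a newline or for a line longer than the buffer.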

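To summarize:

One advantage of standard I/O is that we need not worry about buffering or about choosing an optimal I/O size, as we must when calling read and write directly.

With fgets we do have to consider the maximum line length, but that is much easier than choosing an optimal I/O size.

A system call costs more time than an ordinary function call.

Even so, the standard I/O library is not much slower than calling read and write directly.

In most nontrivial programs, the bulk of the user CPU time is consumed by the application's own processing, not by the standard I/O routines.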

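Binary I/O:

fread(3) fwrite(3)

These read and write binary data and return the number of objects read or written.

For reading: the return value can be less than nobj on error or at end of file, so call ferror(3) or feof(3) to find out which.

For writing: a return value less than nobj indicates an error.

Limitations:

The fundamental problem with binary I/O is that it can be used only to read data that was written on the same system.

Today, heterogeneous systems are commonly connected by networks, and data written on one system is often read on another. In that case fread(3) and fwrite(3) cannot be used as is, for two reasons:

no.1: The offset of a member within a structure can differ between compilers and systems.

no.2: The binary formats used to store multibyte integers and floating-point values can differ between machine architectures.

This issue also has to be considered in socket programming.

The practical solution for exchanging binary data between different systems is to use an agreed-upon canonical format.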
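
One simple form of canonical format is to fix the byte order and write integers byte by byte. The sketch below stores a 32-bit value in big-endian order regardless of the host architecture. The choice of big-endian and the helper names write_u32_be and read_u32_be are assumptions made for this illustration, not something the original text prescribes.

```c
#include <stdint.h>
#include <stdio.h>

/* Write a 32-bit value as four bytes, most significant byte first.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int write_u32_be(FILE *fp, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };

    return fwrite(b, 1, sizeof b, fp) == sizeof b ? 0 : -1;
}

/* Read four bytes written by write_u32_be and rebuild the value.
 * Returns 0 on success, -1 on error or short read. (Hypothetical helper.) */
int read_u32_be(FILE *fp, uint32_t *v)
{
    unsigned char b[4];

    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return -1;
    *v = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    return 0;
}
```

Because the bytes are produced with shifts rather than by copying the in-memory representation, the file contents do not depend on the byte order or structure padding of the machine that wrote them.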

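Two common uses:

1: Reading or writing a binary array.

For example, to write elements 2 through 5 of a float array to a file:

float data[10];

if (fwrite(&data[1], sizeof(float), 4, fp) != 4)
    perror("fwrite error");

2: Reading or writing a structure.

For example:

struct {
    short count;
    long total;
    char name[NAMESIZE];
} item;

if (fwrite(&item, sizeof(item), 1, fp) != 1)
    perror("fwrite error");

Combining the two, we can read or write an array of structures. Read the man page carefully for the meaning of each argument, the NOTES, the RETURN VALUE, and so on.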

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAMESIZE 100

typedef struct ITEM {
    short count;
    long total;
    char name[NAMESIZE];
    char *pointer;
} item;

int main(void)
{
    /* write example 1: elements 2 through 5 of a float array */
    float data[10] = {10, 22, 43, 45, 1, 345, 777, 23, 98, 100};

    if (fwrite(&data[1], sizeof(float), 4, stdout) != 4) {
        perror("fwrite 1 error");
        exit(EXIT_FAILURE);
    }

    /* write example 2: one whole structure */
    item item_test;
    memset(&item_test, 0, sizeof(item_test));   /* avoid writing uninitialized bytes */
    item_test.count = 123;
    item_test.total = 65535;
    item_test.pointer = "pointer";  /* only the address is written, not the string */
    if (fwrite(&item_test, sizeof(item), 1, stdout) != 1) {
        perror("fwrite 2 error");
        exit(EXIT_FAILURE);
    }

    return EXIT_SUCCESS;
}

#include <stdio.h>
#include <stdlib.h>

typedef struct ITEM {
    short arg1;
    int arg2;
    float arg3;
    long arg4;
    char *arg5;
} item;

int main(void)
{
    /* test 1: write two structures to stdout (run first, redirected to a file) */
    /*
    item test1, test2;
    test1.arg1 = test2.arg1 = 1;
    test1.arg2 = test2.arg2 = 123;
    test1.arg3 = test2.arg3 = 123.123;
    test1.arg4 = test2.arg4 = 1234;
    test1.arg5 = "abcd";
    test2.arg5 = "efgh";

    item array[2] = {test1, test2};
    if (fwrite(array, sizeof(item), 2, stdout) != 2) {
        perror("fwrite error");
        exit(EXIT_FAILURE);
    }
    */

    /* test 2: read the two structures back from stdin */
    item array_2[2];
    if (fread(array_2, sizeof(item), 2, stdin) != 2) {
        if (ferror(stdin))
            perror("fread error");
        else
            fprintf(stderr, "fread: unexpected end of file\n");
        exit(EXIT_FAILURE);
    }

    /* arg5 holds an address from the writing process, so print it with %p;
       dereferencing it with %s would be undefined behavior */
    fprintf(stdout, "no 1: %d, %d, %f, %ld, %p\n", array_2[0].arg1, array_2[0].arg2, array_2[0].arg3, array_2[0].arg4, (void *)array_2[0].arg5);
    fprintf(stdout, "no 2: %d, %d, %f, %ld, %p\n", array_2[1].arg1, array_2[1].arg2, array_2[1].arg3, array_2[1].arg4, (void *)array_2[1].arg5);

    return EXIT_SUCCESS;
}

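On my CentOS 7 system (UTF-8 locale), printing arg5 as a string after reading the structures back produced garbage. I first guessed it was an alignment problem, but the real cause is that fwrite stores only the pointer value of a char * member, not the characters it points to, and that address means nothing in the process that reads the file. Strings should be stored in fixed-length character arrays inside the structure.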
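
A minimal sketch of the fixed-length approach follows. The structure layout, NAMESIZE value, and the helper names save_record and load_record are all made up for this illustration, and the file is still only readable on a system with the same structure layout.

```c
#include <stdio.h>
#include <string.h>

#define NAMESIZE 16     /* hypothetical fixed field width */

struct record {
    int  id;
    char name[NAMESIZE];    /* the characters live inside the structure */
};

/* Write one record. Returns 0 on success, -1 on error. */
int save_record(FILE *fp, int id, const char *name)
{
    struct record r;

    memset(&r, 0, sizeof r);                /* no uninitialized bytes in the file */
    r.id = id;
    strncpy(r.name, name, NAMESIZE - 1);    /* last byte stays '\0' from memset */
    return fwrite(&r, sizeof r, 1, fp) == 1 ? 0 : -1;
}

/* Read one record. Returns 0 on success, -1 on error or end of file. */
int load_record(FILE *fp, struct record *r)
{
    return fread(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}
```

Since the name is copied into the structure itself, reading the record back yields a valid string, and names longer than NAMESIZE - 1 characters are truncated rather than overflowing the field.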

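Positioning a stream:

There are three ways to position a standard I/O stream:

1: ftell and fseek (assume that the file position fits in a long)

2: ftello and fseeko (use off_t instead of long)

3: fgetpos and fsetpos (use the abstract data type fpos_t to record the file position; programs that must be portable to non-UNIX systems should use these two)

rewind(3) sets a stream to the beginning of the file.

ISO C does not require an implementation to support SEEK_END for binary files, but UNIX systems do support it.

To position a text file portably, the whence argument of fseek(3) must be SEEK_SET, and offset can take only two values:

No.one: 0 (rewind to the beginning of the file)

No.two: a value returned by ftell(3) for that file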
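
As an example of the second method, the sketch below uses ftello and fseeko to find the size of a file and then restores the original position. It assumes a UNIX system, where SEEK_END is supported; the helper name stream_size is made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>

/* Return the size of the file behind fp and leave the current
 * position unchanged. Returns -1 on error.
 * (stream_size is a hypothetical helper name.) */
off_t stream_size(FILE *fp)
{
    off_t here, end;

    if ((here = ftello(fp)) < 0)            /* remember where we are */
        return -1;
    if (fseeko(fp, 0, SEEK_END) < 0)        /* jump to the end of the file */
        return -1;
    end = ftello(fp);                       /* offset at the end == file size */
    if (fseeko(fp, here, SEEK_SET) < 0)     /* go back to the saved position */
        return -1;
    return end;
}
```

The same pattern works with ftell and fseek when the file position is known to fit in a long.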

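Formatted I/O:

Formatted output

printf(3)

fprintf(3)

dprintf(3) (writes to a file descriptor directly, so there is no need to call fdopen(3) to convert the fd to a FILE * as fprintf(3) would require)

sprintf(3) (does not check the buffer length, so it can overflow the buffer)

snprintf(3) (adds a size_t n parameter that makes the buffer size explicit; characters that would go past the end of the buffer are discarded)

Formatted input

scanf(3)

fscanf(3)

sscanf(3)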
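
The return values of these functions are what make them safe to use. The sketch below checks snprintf for truncation and sscanf for the number of converted fields. The helper names format_pair and parse_size and the "WIDTHxHEIGHT" input format are made up for this illustration.

```c
#include <stdio.h>

/* Format "name=value" into buf.
 * Returns 0 if the whole string fit, -1 if it was truncated or on error. */
int format_pair(char *buf, size_t size, const char *name, int value)
{
    /* snprintf returns the length the full string would have had */
    int n = snprintf(buf, size, "%s=%d", name, value);

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parse two ints from a string such as "640x480".
 * Returns 0 if both fields were converted, -1 otherwise. */
int parse_size(const char *s, int *w, int *h)
{
    /* sscanf returns the number of successfully converted fields */
    return sscanf(s, "%dx%d", w, h) == 2 ? 0 : -1;
}
```

Even when format_pair reports truncation, snprintf has still written a null-terminated prefix into buf, so the buffer is never overrun.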

Temporary files:

Example using tmpnam(3) and tmpfile(3):

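#include <stdio.h>
#include <stdlib.h>

#define MAXLINE 4096

int main(void)
{
    char name[L_tmpnam], line[MAXLINE];
    char *p;
    FILE *fp;

    /* first temporary name, returned in a static buffer inside the library */
    if ((p = tmpnam(NULL)) == NULL) {
        fprintf(stderr, "tmpnam error\n");
        exit(EXIT_FAILURE);
    }
    fprintf(stdout, "%s\n", p);

    /* second temporary name, stored in our own array */
    if (tmpnam(name) == NULL) {
        fprintf(stderr, "tmpnam error\n");
        exit(EXIT_FAILURE);
    }
    fprintf(stdout, "%s\n", name);

    /* create a temporary file; it is removed automatically when closed */
    if ((fp = tmpfile()) == NULL) {
        perror("tmpfile error");
        exit(EXIT_FAILURE);
    }
    fprintf(fp, "one line of output\n");    /* write to the temporary file */
    rewind(fp);                             /* then read it back */

    if (fgets(line, sizeof(line), fp) == NULL) {
        perror("fgets error");
        exit(EXIT_FAILURE);
    }
    fputs(line, stdout);    /* line already ends with a newline */

    return EXIT_SUCCESS;
}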

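tmpnam(3) and tempnam(3) share at least one drawback: there is a window of time between returning the unique pathname and creating a file with that name, during which another process can create a file with the same name. Use tmpfile(3) or mkstemp(3) instead, which do not have this problem.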

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

void make_temp(char *template);

int main(void)
{
    char tmpfile1[] = "/tmp/dirXXXXXX";     /* writable array: works */
    char *tmpfile2 = "/tmp/dirXXXXXX";      /* string literal: mkstemp cannot modify it */

    printf("trying first time to create a file...\n");
    make_temp(tmpfile1);
    printf("trying second time to create a file...\n");
    make_temp(tmpfile2);    /* deliberately wrong: expected to crash */
    return EXIT_SUCCESS;
}

void make_temp(char *template)
{
    int fd;
    struct stat statbuf;

    /* mkstemp replaces the XXXXXX suffix in place and creates the file */
    if ((fd = mkstemp(template)) < 0) {
        perror("mkstemp error");
        return;
    }
    fprintf(stdout, "temp name = %s\n", template);
    close(fd);

    /* unlike tmpfile, the file is not removed automatically */
    if (stat(template, &statbuf) < 0) {
        if (errno == ENOENT)
            fprintf(stdout, "file doesn't exist\n");
        else
            perror("stat failed");
    } else {
        fprintf(stdout, "file exists\n");
        unlink(template);
    }
}

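As the man page says, the argument to mkstemp must not be a string constant; it must be a character array. tmpfile1 is an array allocated on the stack, so mkstemp can replace the XXXXXX placeholder with a random string. tmpfile2 points to a string literal, which usually lives in a read-only segment, so the attempt to modify it caused a segmentation fault.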

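Memory streams:

fmemopen(3) // opens a stream on a memory buffer with the access specified by mode

open_memstream(3) // opens a byte-oriented stream for writing to a dynamically allocated buffer; free(3) the buffer after closing the stream

open_wmemstream(3) // the same, for wide characters

Some rules apply to the use of the buffer address and size:

no 1: The buffer address and length are valid only after a call to fclose(3) or fflush(3).

no 2: These values are valid only until the next write to the stream or call to fclose, because the buffer can grow and may need to be reallocated.

Because they avoid buffer overflows, memory streams are well suited to building strings.

Memory streams access only main memory, never files on disk, so they give a significant performance boost to functions that take a standard I/O stream as an argument and would otherwise use a temporary file.

eg1: fmemopen(3) opens an input stream and open_memstream(3) opens an output stream (which allocates its memory automatically; we must free(3) it afterwards). The program scans the input string for ints and writes the square of each one to the output stream's buffer.

$ ./a.out '1 23 34 56'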

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while(0)

int main(int argc, char *argv[])
{
    FILE *in, *out;
    int v, s;
    size_t size;
    char *ptr;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <numbers>\n", argv[0]);
        return EXIT_FAILURE;
    }

    in = fmemopen(argv[1], strlen(argv[1]), "r");   //open a mem stream
    if (in == NULL)
        handle_error("fmemopen");

    out = open_memstream(&ptr, &size);      //open a stream for writing to a buffer
    if (out == NULL)
        handle_error("open_memstream");
   
    while (1) {
        s = fscanf(in, "%d", &v); 
        if (s <= 0)
            break;

        s = fprintf(out, "%d ", v * v);
        if (s == -1)
            handle_error("fprintf");
    }
    fclose(in);
    fclose(out);
    printf("size = %ld, ptr = %s\n", (long) size, ptr);
    free(ptr);
    return EXIT_SUCCESS;
}

eg2: operating on a memory stream over a buffer that we supply ourselves

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while(0)

#define BSZ 48

int main(void)
{
    FILE *fp;
    char buf[BSZ];

    memset(buf, 'a', BSZ-2);
    buf[BSZ-2] = '\0';
    buf[BSZ-1] = 'X';

    fp = fmemopen(buf, BSZ, "w+");
    if (fp == NULL)
        handle_error("fmemopen");
    printf("initial buffer contents: %s\n", buf);
    fprintf(fp, "hello, world");
    printf("before flush: %s\n", buf);
    fflush(fp);
    printf("after flush: %s\n", buf);
    printf("len of string in buf = %ld\n", (long) strlen(buf));

    memset(buf, 'b', BSZ-2);
    buf[BSZ-2] = '\0';
    buf[BSZ-1] = 'X';
    fprintf(fp, "hello, world");
    fseek(fp, 5, SEEK_SET); 
    printf("after seek: %s\n", buf);
    printf("len of string in buf = %ld\n", (long) strlen(buf));

    memset(buf, 'c', BSZ-2);
    buf[BSZ-2] = '\0';
    buf[BSZ-1] = 'X';
    fprintf(fp, "hello, world");
    fclose(fp);
    printf("after fclose: %s\n", buf);
    printf("len of string in buf = %ld\n", (long) strlen(buf));

    return EXIT_SUCCESS;
}

Output:

initial buffer contents: 
before flush: 
after flush: hello, world
len of string in buf = 12
after seek: bbbbbbbbbbbbhello, world
len of string in buf = 24
after fclose: ccccchello, worldccccccccccccccccccccccccccccc
len of string in buf = 46

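fmemopen places a null byte at the beginning of the buffer, and the buffer does not change until the stream is flushed with fflush.

fseek also causes the buffer to be flushed, and the FILE stream keeps track of the offset we set.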