The Linux Programming Interface
File I/O: The Universal I/O Model
(01) Chapter overview
We introduce the concept of a file descriptor, and then look at the system calls that constitute the so-called universal I/O model. These are the system calls that open and close a file, and read and write data.
(02) File descriptors
All system calls for performing I/O refer to open files using a file descriptor, a (usually small) nonnegative integer. File descriptors are used to refer to all types of open files, including pipes, FIFOs, sockets, terminals, devices, and regular files.
(03) man 3 open
NAME
open - open a file
SYNOPSIS
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *path, int oflag, ... );
DESCRIPTION
The open() function shall establish the connection between a file and a file descriptor. It shall create an open file description that refers
to a file and a file descriptor that refers to that open file description. The file descriptor is used by other I/O functions to refer to
that file. The path argument points to a pathname naming the file.
Example:
#include <sys/stat.h>   /* mode_t and the S_I* permission bits */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd;
    mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;   /* rw-r--r-- */
    const char *filename = "/tmp/file";

    /* Create the file if needed, truncate it, and open it write-only */
    fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, mode);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    printf("open successfully\n");
    close(fd);
    return 0;
}
Output:
wang@wang:~/test$ ./open
open successfully
Note that the mode argument specifies the permissions to be placed on the file if it is created by this call. If the open() call is not being used to create a file, this argument is ignored and can be omitted.
(04) Using read()
numread = read(fd, buffer, count) reads at most count bytes from the open file referred to by fd and stores them in buffer.
Example:
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main(void) {
    char buf[20] = {0};     /* zero-filled, so the data read stays NUL-terminated */
    ssize_t bytes_read;
    const char *filename = "/home/wang/hello.txt";

    int fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    /* A newly opened file already starts at offset 0; lseek() is shown for illustration */
    lseek(fd, 0, SEEK_SET);
    /* read() does not add a terminator, so leave room for the NUL byte */
    bytes_read = read(fd, buf, sizeof(buf) - 1);
    if (bytes_read == -1) {
        perror("read");
        close(fd);
        return 1;
    }
    printf("%ld\n%s", (long) bytes_read, buf);
    close(fd);
    return 0;
}
Output:
wang@wang:~/test$ ./read
13
hello world!
(05) write() and close()
numwritten = write(fd, buffer, count) writes up to count bytes from buffer to the open file referred to by fd. The write() call returns the number of bytes actually written, which may be less than count.
status = close(fd) is called after all I/O has been completed, in order to release the descriptor fd and its associated kernel resources.
(06) Combined example of open(), read(), write(), and close(): a simplified version of the cp command
/* sys/stat.h declares file attributes (struct stat) and the permission bits */
#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

#ifndef BUF_SIZE
#define BUF_SIZE 1024
#endif

int main(int argc, char *argv[])
{
    int inputFd, outputFd, openFlags;
    mode_t filePerms;
    ssize_t numRead;
    char buf[BUF_SIZE];

    if (argc != 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s old-file new-file\n", argv[0]);

    /* Open input and output files */
    inputFd = open(argv[1], O_RDONLY);
    if (inputFd == -1)
        errExit("opening file %s", argv[1]);

    openFlags = O_CREAT | O_WRONLY | O_TRUNC;
    filePerms = S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
                S_IROTH | S_IWOTH;      /* rw-rw-rw- */

    outputFd = open(argv[2], openFlags, filePerms);
    if (outputFd == -1)
        errExit("opening file %s", argv[2]);

    /* Transfer data until we encounter end of input or an error */
    while ((numRead = read(inputFd, buf, BUF_SIZE)) > 0)
        if (write(outputFd, buf, numRead) != numRead)
            fatal("couldn't write whole buffer");
    if (numRead == -1)
        errExit("read");

    if (close(inputFd) == -1)
        errExit("close input");
    if (close(outputFd) == -1)
        errExit("close output");

    exit(EXIT_SUCCESS);
}
To compile this, the tlpi_hdr.h header provided by the book's author must be included.
Create a new .txt file, then run:
./copy text.txt /dev/tty
wang@wang:~/test/tlpi-dist/lib$ ./copy text.txt /dev/tty
The Linux Programming Interface
As shown, the contents of text.txt are written to the terminal.
(07) The flag values are listed in the man page.
If an error occurs while trying to open the file, open() returns -1, and errno identifies the cause of the error.
A successful call to read() returns the number of bytes actually read, or 0 if end-of-file is encountered. On error, the usual -1 is returned. The ssize_t data type is a signed integer type used to hold a byte count or a -1 error indication.
(08) It is usually good practice to close unneeded file descriptors explicitly, since this makes our code more readable and reliable in the face of subsequent modification. Furthermore, file descriptors are a consumable resource, so failure to close a file descriptor could result in a process running out of descriptors.
(09) lseek() changes the file offset, that is, the position at which the next read or write takes place
The lseek() system call adjusts the file offset of the open file referred to by the file descriptor fd, according to the values specified in offset and whence.
lseek(fd, 0, SEEK_SET); /* start of file */
(10) File holes: seeking past the end of the file and then writing
The space in between the previous end of the file and the newly written bytes is referred to as a file hole.
From a programming point of view, the bytes in a hole exist, and reading from the hole returns a buffer of bytes containing 0.
File holes don't, however, take up any disk space. The file system doesn't allocate any disk blocks for a hole until, at some later point, data is written into it.
The existence of holes means that a file's nominal size may be larger than the amount of disk storage it utilizes.
#include <sys/stat.h>
#include <fcntl.h>
#include <ctype.h>
#include "tlpi_hdr.h"

int main(int argc, char *argv[]) {
    int fd;
    char *buf;
    size_t len;
    off_t offset;
    int ap, j;
    ssize_t numRead, numWritten;

    if (argc < 3 || strcmp(argv[1], "--help") == 0)
        usageErr("%s file {r<length>|R<length>|w<string>|s<offset>}...\n", argv[0]);

    fd = open(argv[1], O_RDWR | O_CREAT,
              S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
              S_IROTH | S_IWOTH);       /* rw-rw-rw- */
    if (fd == -1)
        errExit("open");

    for (ap = 2; ap < argc; ap++) {
        switch (argv[ap][0]) {
        case 'r':   /* display bytes at current offset, as text */
        case 'R':   /* display bytes at current offset, in hex */
            len = getLong(&argv[ap][1], GN_ANY_BASE, argv[ap]);
            buf = malloc(len);
            if (buf == NULL)
                errExit("malloc");
            numRead = read(fd, buf, len);
            if (numRead == -1)
                errExit("read");
            if (numRead == 0)
                printf("%s: end-of-file\n", argv[ap]);
            else {
                printf("%s: ", argv[ap]);
                for (j = 0; j < numRead; j++)
                    if (argv[ap][0] == 'r')
                        printf("%c", isprint((unsigned char) buf[j]) ? buf[j] : '?');
                    else    /* go through unsigned char so bytes >= 0x80 are not sign-extended */
                        printf("%02x ", (unsigned int) (unsigned char) buf[j]);
                printf("\n");
            }
            free(buf);
            break;
        case 'w':   /* write string at current offset */
            numWritten = write(fd, &argv[ap][1], strlen(&argv[ap][1]));
            if (numWritten == -1)
                errExit("write");
            printf("%s: wrote %ld bytes\n", argv[ap], (long) numWritten);
            break;
        case 's':   /* change file offset */
            offset = getLong(&argv[ap][1], GN_ANY_BASE, argv[ap]);
            if (lseek(fd, offset, SEEK_SET) == -1)
                errExit("lseek");
            printf("%s: seek succeeded\n", argv[ap]);
            break;
        default:
            cmdLineErr("Argument must start with [rRws]: %s\n", argv[ap]);
        }
    }
    exit(EXIT_SUCCESS);
}
Output:
wang@wang:~/test/tlpi-dist/lib$ touch tfile
wang@wang:~/test/tlpi-dist/lib$ ./seek_io tfile s100000 wabc
s100000: seek succeeded
wabc: wrote 3 bytes
wang@wang:~/test/tlpi-dist/lib$ ls -l tfile
-rw-rw-r-- 1 wang wang 100003 Mar  6 09:52 tfile
wang@wang:~/test/tlpi-dist/lib$ ./seek_io tfile s10000 R5
s10000: seek succeeded
R5: 00 00 00 00 00
(11) A look at the custom function errExit()
/* Display error message including 'errno' diagnostic, and
   terminate the process */

void
errExit(const char *format, ...)
{
    va_list argList;

    va_start(argList, format);
    outputError(TRUE, errno, TRUE, format, argList);
    va_end(argList);

    terminate(TRUE);
}
There is no need to dig into the implementation; it is enough to know how to use the function.
Example:
#include <fcntl.h>
#include "tlpi_hdr.h"

int main(void) {
    int fd;

    /* No mode argument is needed because O_CREAT is not specified */
    fd = open("/noexit/file", O_RDWR);
    if (fd == -1)
        errExit("open");
    else
        printf("open successfully\n");

    exit(EXIT_SUCCESS);
}
Output:
wang@wang:~/test/tlpi-dist/lib$ gcc example.c error_functions.c -o example
wang@wang:~/test/tlpi-dist/lib$ ./example
ERROR [ENOENT No such file or directory] open
As the output shows, the string passed to errExit() is printed after the errno description.
(12) The ioctl() function
Purpose: The ioctl() system call is a general-purpose mechanism for performing file and device operations that fall outside the universal I/O model described before.
DESCRIPTION
The ioctl() function manipulates the underlying device parameters of special files. In particular, many operating characteristics of
character special files (e.g., terminals) may be controlled with ioctl() requests. The argument d must be an open file descriptor.
The second argument is a device-dependent request code. The third argument is an untyped pointer to memory. It's traditionally char *argp
(from the days before void * was valid C), and will be so named for this discussion.
An ioctl() request has encoded in it whether the argument is an in parameter or out parameter, and the size of the argument argp in bytes.
Macros and defines used in specifying an ioctl() request are located in the file <sys/ioctl.h>.
Later chapters describe this in detail.
(13) Summary
This chapter mainly covers the use of open(), read(), write(), close(), lseek(), and ioctl().
In order to perform I/O on a regular file, we must first obtain a file descriptor using open(). I/O is then performed using read() and write(). After performing all I/O, we should free the file descriptor and its associated resources using close(). These system calls can be used to perform I/O on all types of files.
The fact that all file types and device drivers implement the same I/O interface allows for universality of I/O, meaning that a program can typically be used with any type of file without requiring code that is specific to the file type.
For each open file, the kernel maintains a file offset, which determines the location at which the next read or write will occur. The file offset is implicitly updated by reads and writes. Using lseek(), we can explicitly reposition the file offset to any location within the file or past the end of the file. Writing data at a position beyond the previous end of the file creates a hole in the file. Reads from a file hole return bytes containing zeros.
The ioctl() system call is a catchall for device and file operations that don't fit into the standard file I/O model.
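(14) Exercise: write a tee program.
The program reads from standard input and writes what it reads to standard output and, at the same time, to a file.
With the -a option, the newly written data is appended to the end of the existing file.
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char *argv[]) {
    int fd;
    ssize_t numRead;
    char buf[50];

    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    /* Create the file if it does not exist; truncate it otherwise */
    fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd == -1) {
        perror("open");     /* errExit() would require tlpi_hdr.h */
        return 1;
    }
    /* Copy stdin to both the file and stdout until end-of-file */
    while ((numRead = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
        if (write(fd, buf, numRead) != numRead ||
            write(STDOUT_FILENO, buf, numRead) != numRead) {
            perror("write");
            return 1;
        }
    }
    if (numRead == -1)
        perror("read");
    close(fd);
    return 0;
}
This is my own attempt. Handling the -a option requires further study of parsing command-line options.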