1.5.6. File-Handling System Calls
When a user accesses the contents of either a regular file or a directory, he actually accesses some data stored in a hardware block device. In this sense, a filesystem is a user-level view of the physical organization of a hard disk partition. Because a process in User Mode cannot directly interact with the low-level hardware components, each actual file operation must be performed in Kernel Mode. Therefore, the Unix operating system defines several system calls related to file handling.
All Unix kernels devote great attention to the efficient handling of hardware block devices to achieve good overall system performance. In the chapters that follow, we will describe topics related to file handling in Linux and specifically how the kernel reacts to file-related system calls. To understand those descriptions, you will need to know how the main file-handling system calls are used; these are described in the next section.
1.5.6.1. Opening a file
Processes can access only "opened" files. To open a file, the process invokes the system call:
fd = open(path, flag, mode)
The three parameters have the following meanings:
-
path: Denotes the pathname (relative or absolute) of the file to be opened.
-
flag: Specifies how the file must be opened (e.g., read, write, read/write, append). It also can specify whether a nonexisting file should be created.
-
mode: Specifies the access rights of a newly created file.
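The three parameters of open( ) can be illustrated with a minimal sketch. The helper names open_for_writing and open_for_reading are hypothetical, and the flag and mode constants (O_WRONLY, O_CREAT, O_TRUNC, O_RDONLY, S_IRUSR, and so on) are the standard POSIX names rather than anything defined in the text above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>   /* close(), which callers need once they are done */

/* Hypothetical helper: create the file if it is missing, or truncate it
 * if it exists, and open it for writing only.
 * Returns a file descriptor, or -1 on failure (errno is set by open()). */
int open_for_writing(const char *path)
{
    assert(path != NULL);
    /* flag: write-only access, create if nonexistent, truncate if present.
     * mode: rw-r--r--; used only when the file is actually created, and
     * further restricted by the process umask. */
    return open(path, O_WRONLY | O_CREAT | O_TRUNC,
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
}

/* Hypothetical helper: open an existing file for reading only.
 * No file is created, so the mode argument can be omitted. */
int open_for_reading(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY);
}
```

In both helpers a return value of -1 signals failure, for instance when open_for_reading( ) is given the pathname of a file that does not exist.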
This system call creates an "open file" object and returns an identifier called a file descriptor. An open file object contains:
-
Some file-handling data structures, such as a set of flags specifying how the file has been opened, an offset field that denotes the current position in the file from which the next operation will take place (the so-called file pointer), and so on.
-
Some pointers to kernel functions that the process can invoke. The set of permitted functions depends on the value of the flag parameter.
We discuss open file objects in detail in Chapter 12. Let's limit ourselves here to describing some general properties specified by the POSIX semantics.
-
A file descriptor represents an interaction between a process and an opened file, while an open file object contains data related to that interaction. The same open file object may be identified by several file descriptors in the same process.
-
Several processes may concurrently open the same file. In this case, each process receives its own file descriptor, along with a separate open file object. When this occurs, the Unix filesystem does not provide any kind of synchronization among the I/O operations issued by the processes on the same file. However, several system calls such as flock( ) are available to allow processes to synchronize themselves on the entire file or on portions of it (see Chapter 12).
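The difference between sharing an open file object and having separate ones can be observed through the file pointer. The sketch below is an illustration under assumptions: compare_offsets is a hypothetical name, dup( ) is not discussed in the text above but is one standard POSIX way to obtain a second descriptor for the same open file object, and a second open( ) in the same process stands in for an open issued by another process.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write 10 bytes to path, read 4 of them through one
 * descriptor, then report the file pointer seen through
 *   - a dup()'d descriptor (same open file object), and
 *   - a descriptor from a second open() (separate open file object).
 * Returns 0 on success, -1 on any failure. */
int compare_offsets(const char *path, off_t *dup_off, off_t *sep_off)
{
    char buf[4];
    int fd1, fd2 = -1, fd3 = -1, rc = -1;

    assert(path != NULL && dup_off != NULL && sep_off != NULL);

    fd1 = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd1 < 0)
        return -1;
    if (write(fd1, "0123456789", 10) != 10 || lseek(fd1, 0, SEEK_SET) != 0)
        goto out;

    fd2 = dup(fd1);              /* second descriptor, same open file object */
    fd3 = open(path, O_RDONLY);  /* independent open file object */
    if (fd2 < 0 || fd3 < 0)
        goto out;

    /* Advance the file pointer of the first open file object by 4 bytes. */
    if (read(fd1, buf, sizeof buf) != (ssize_t)sizeof buf)
        goto out;

    *dup_off = lseek(fd2, 0, SEEK_CUR);  /* shares the pointer with fd1 */
    *sep_off = lseek(fd3, 0, SEEK_CUR);  /* has its own pointer */
    rc = 0;
out:
    if (fd3 >= 0)
        close(fd3);
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    return rc;
}
```

On success, *dup_off reflects the 4 bytes read through the first descriptor, while *sep_off is still at the start of the file, since nothing was read through the separately opened descriptor.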
To create a new file, the process also may invoke the creat( ) system call, which is handled by the kernel exactly like open( ).
1.5.6.2. Accessing an opened file
Regular Unix files can be addressed either sequentially or randomly, while device files and named pipes are usually accessed sequentially. In both kinds of access, the kernel stores the file pointer in the open file object, that is, the current position at which the next read or write operation will take place.
Sequential access is implicitly assumed: the read( ) and write( ) system calls always refer to the position of the current file pointer. To modify the value, a program must explicitly invoke the lseek( ) system call. When a file is opened, the kernel sets the file pointer to the position of the first byte in the file (offset 0).
The lseek( ) system call requires the following parameters:
newoffset = lseek(fd, offset, whence);
which have the following meanings:
-
fd: Indicates the file descriptor of the opened file
-
offset: Specifies a signed integer value that will be used for computing the new position of the file pointer
-
whence: Specifies whether the new position should be computed by adding the offset value to the number 0 (offset from the beginning of the file), the current file pointer, or the position of the last byte (offset from the end of the file)
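The three ways of computing the new position can be sketched as follows. The function name lseek_demo and the 10-byte test content are assumptions made for illustration; SEEK_SET, SEEK_CUR, and SEEK_END are the standard POSIX values of whence, which the text above describes without naming. Note that POSIX defines SEEK_END relative to the file size, so a negative offset is needed to land on an existing byte.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: create a 10-byte file at path and move the file
 * pointer three times, once per whence value. The resulting positions
 * are stored in pos[0..2]. Returns 0 on success, -1 on failure. */
int lseek_demo(const char *path, off_t pos[3])
{
    int fd;

    assert(path != NULL && pos != NULL);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        return -1;
    }

    pos[0] = lseek(fd, 2, SEEK_SET);   /* beginning of file + 2 */
    pos[1] = lseek(fd, 3, SEEK_CUR);   /* current position  + 3 */
    pos[2] = lseek(fd, -1, SEEK_END);  /* file size         - 1 */

    close(fd);
    return (pos[0] < 0 || pos[1] < 0 || pos[2] < 0) ? -1 : 0;
}
```

With the 10-byte file used here, the three calls yield the positions 2, 5, and 9 in that order.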
The read( ) system call requires the following parameters:
nread = read(fd, buf, count);
which have the following meanings:
-
fd: Indicates the file descriptor of the opened file
-
buf: Specifies the address of the buffer in the process's address space to which the data will be transferred
-
count: Denotes the number of bytes to read
When handling such a system call, the kernel attempts to read count bytes from the file having the file descriptor fd, starting from the current value of the opened file's offset field. In some cases (end-of-file, empty pipe, and so on) the kernel does not succeed in reading all count bytes. The returned nread value specifies the number of bytes effectively read. The file pointer also is updated by adding nread to its previous value. The write( ) parameters are similar.
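Because read( ) may return fewer than count bytes, callers that need a full buffer usually loop until the requested amount has arrived or end-of-file is reached. The following is a minimal sketch of that pattern; read_full is a hypothetical helper name, and retrying on EINTR is a common convention assumed here, not something prescribed by the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until count bytes have been
 * transferred, end-of-file is reached, or an error occurs.
 * Returns the total number of bytes read (possibly less than count at
 * end-of-file), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p;
    size_t total = 0;

    assert(buf != NULL);
    p = buf;

    while (total < count) {
        ssize_t nread = read(fd, p + total, count - total);
        if (nread < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        if (nread == 0)
            break;               /* end-of-file: nothing more to read */
        total += (size_t)nread;  /* the kernel advanced the file pointer too */
    }
    return (ssize_t)total;
}
```

For example, if only 5 bytes are available before end-of-file and the caller asks for 16, read_full( ) returns 5, and a further call returns 0.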
1.5.6.3. Closing a file
When a process does not need to access the contents of a file anymore, it can invoke the system call:
res = close(fd);
which releases the open file object corresponding to the file descriptor fd. When a process terminates, the kernel closes all its remaining opened files.
1.5.6.4. Renaming and deleting a file
To rename or delete a file, a process does not need to open it. Indeed, such operations do not act on the contents of the affected file, but rather on the contents of one or more directories. For example, the system call:
res = rename(oldpath, newpath);
changes the name of a file link, while the system call:
res = unlink(pathname);
decreases the file link count and removes the corresponding directory entry. The file is deleted only when the link count assumes the value 0.
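The effect of these calls on the link count can be observed with stat( ). The sketch below is illustrative only: link_count_demo and link_count are hypothetical names, and link( ) and stat( ), which are not covered in the text above, are the standard POSIX calls used here to add a second name and to read the count.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>      /* rename() */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the link count of path, or -1 if stat() fails. */
static long link_count(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_nlink : -1;
}

/* Hypothetical demo: create file a, rename it to b, add a as a second
 * hard link, then remove b. The link count after each step is stored in
 * counts[0..2]. Both names are removed before returning.
 * Returns 0 on success, -1 on failure. */
int link_count_demo(const char *a, const char *b, long counts[3])
{
    int fd, rc = -1;

    assert(a != NULL && b != NULL && counts != NULL);

    fd = open(a, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);                   /* the file need not be open from here on */

    if (rename(a, b) != 0)       /* same file, new name: count unchanged */
        goto out;
    counts[0] = link_count(b);

    if (link(b, a) != 0)         /* second directory entry for the file */
        goto out;
    counts[1] = link_count(a);

    if (unlink(b) != 0)          /* one entry removed, file still exists */
        goto out;
    counts[2] = link_count(a);
    rc = 0;
out:
    unlink(a);                   /* last link gone: the file is deleted */
    unlink(b);
    return rc;
}
```

On a filesystem that supports hard links, the recorded counts are 1, 2, and 1, and neither pathname exists once the function returns.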