xv6: a simple, Unix-like teaching operating system|Chapter 1 Operating system interfaces

wichai515

Tags: risc-v linux

Article link: https://blog.csdn.net/wichai515/article/details/130160451

Chapter 1 Operating system interfaces

1.0 Overview

The job of an operating system

How does OS interact with user programs?

XV6

Process

System calls

Syscalls Collections

The job of an operating system is to share a computer among multiple programs and to provide a more useful set of services than the hardware alone supports.
An operating system manages and abstracts the low-level hardware, so that, for example, a word processor need not concern itself with which type of disk hardware is being used.
An operating system shares the hardware among multiple programs so that they run (or appear to run) at the same time.
Finally, operating systems provide controlled ways for programs to interact, so that they can share data or work together.

An operating system provides services to user programs through an interface. Designing a good interface turns out to be difficult.

On the one hand, we would like the interface to be simple and narrow because that makes it easier to get the implementation right.
On the other hand, we may be tempted to offer many sophisticated features to applications. The trick in resolving this tension is to design interfaces that rely on a few mechanisms that can be combined to provide much generality.

This book uses a single operating system as a concrete example to illustrate operating system concepts. That operating system, xv6, provides the basic interfaces introduced by Ken Thompson and Dennis Ritchie’s Unix operating system, as well as mimicking Unix’s internal design.

Unix provides a narrow interface whose mechanisms combine well, offering a surprising degree of generality.

This interface has been so successful that modern operating systems—BSD, Linux, macOS, Solaris, and even, to a lesser extent, Microsoft Windows—have Unix-like interfaces. Understanding xv6 is a good start toward understanding any of these systems and many others.

[Figure 1.1: image unavailable in the source]

As Figure 1.1 shows, xv6 takes the traditional form of a kernel, a special program that provides services to running programs.

Process:

Each running program, called a process, has memory containing instructions, data, and a stack.

The instructions implement the program’s computation.
The data are the variables on which the computation acts.
The stack organizes the program’s procedure calls.

A given computer typically has many processes but only a single kernel.

System Call:

When a process needs to invoke a kernel service, it invokes a system call, one of the calls in the operating system’s interface. A system call proceeds in two steps:

The system call enters the kernel.
The kernel performs the service and returns.

Thus a process alternates between executing in user space and kernel space.

Syscalls Collections:

The collection of system calls that a kernel provides is the interface that user programs see. The xv6 kernel provides a subset of the services and system calls that Unix kernels traditionally offer. Figure 1.2 lists all of xv6’s system calls.

[Figure 1.2: image unavailable in the source; the table below lists the system calls]

System call	Description
int fork()	Create a process, return child’s PID.
int exit(int status)	Terminate the current process; status reported to wait(). No return.
int wait(int *status)	Wait for a child to exit; exit status in *status; returns child PID.
int kill(int pid)	Terminate process PID. Returns 0, or -1 for error.
int getpid()	Return the current process’s PID.
int sleep(int n)	Pause for n clock ticks.
int exec(char *file, char *argv[])	Load a file and execute it with arguments; only returns if error.
char *sbrk(int n)	Grow process’s memory by n bytes. Returns start of new memory.
int open(char *file, int flags)	Open a file; flags indicate read/write; returns an fd (file descriptor).
int write(int fd, char *buf, int n)	Write n bytes from buf to file descriptor fd; returns n.
int read(int fd, char *buf, int n)	Read n bytes into buf; returns number read; or 0 if end of file.
int close(int fd)	Release open file fd.
int dup(int fd)	Return a new file descriptor referring to the same file as fd.
int pipe(int p[])	Create a pipe, put read/write file descriptors in p[0] and p[1].
int chdir(char *dir)	Change the current directory.
int mkdir(char *dir)	Create a new directory.
int mknod(char *file, int, int)	Create a device file.
int fstat(int fd, struct stat *st)	Place info about an open file into *st.
int stat(char *file, struct stat *st)	Place info about a named file into *st.
int link(char *file1, char *file2)	Create another name (file2) for the file file1.
int unlink(char *file)	Remove a file.

The rest of this chapter outlines xv6’s services (processes, memory, file descriptors, pipes, and a file system) and illustrates them with code snippets and discussions of how the shell, Unix’s command-line user interface, uses them. The shell’s use of system calls illustrates how carefully they have been designed.

Shell:

The shell is an ordinary program that reads commands from the user and executes them. The fact that the shell is a user program, and not part of the kernel, illustrates the power of the system call interface: there is nothing special about the shell. It also means that the shell is easy to replace; as a result, modern Unix systems have a variety of shells to choose from, each with its own user interface and scripting features. The xv6 shell is a simple implementation of the essence of the Unix Bourne shell. Its implementation can be found at [(user/sh.c:1)](https://www.notion.so/user-sh-c-3af9ee81a4524539bbf8b30137bfafee).

1.1 Process and memory

Process

Syscall fork

Syscall exit

Syscall wait

Code Example of fork, wait and exit

Syscall exec

Shell

User & Kernel space: An xv6 process consists of user-space memory (instructions, data, and stack) and per-process state private to the kernel.
Time-sharing: Xv6 time-shares processes: it transparently switches the available CPUs among the set of processes waiting to execute.
Saving data: When a process is not executing, xv6 saves its CPU registers, restoring them when it next runs the process.
PID: The kernel associates a process identifier, or PID, with each process.

A process may create a new process using the fork system call. **Fork gives the new process exactly the same memory contents (both instructions and data) as the calling process.**

Fork returns in both the original and new processes.

In the original process, fork returns the new process’s PID.
In the new process, fork returns zero.

The original and new processes are often called the parent and child.

The exit system call causes the calling process to stop executing and to release resources such as memory and open files.

Exit takes an integer status argument, conventionally 0 to indicate success and 1 to indicate failure.

The wait system call returns the PID of an exited (or killed) child of the current process and copies the exit status of the child to the address passed to wait.

If none of the caller’s children has exited, wait waits for one to do so.
If the caller has no children, wait immediately returns -1.
If the parent doesn’t care about the exit status of a child, it can pass a 0 address to wait.

For example, consider the following program fragment written in the C programming language:

int pid = fork();
if(pid > 0){
	printf("parent: child=%d\n", pid);
	pid = wait((int *) 0);
	printf("parent: child %d is done\n", pid);
} else if(pid == 0){
	printf("child: exiting\n");
	exit(0);
} else {
	printf("fork error\n");
}

// output:
// parent: child=1234
// child: exiting
// parent: child 1234 is done

Code explanation:

In the example, the output lines might come out in either order (or even intermixed), depending on whether the parent or child gets to its print call first.

After the child exits, the parent’s wait returns, causing the parent to print parent: child 1234 is done.
Although the child has the same memory contents as the parent initially, the parent and child are executing with different memory and different registers: changing a variable in one does not affect the other.
For example, when the return value of wait is stored into pid in the parent process, it doesn’t change the variable pid in the child. The value of pid in the child will still be zero.
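The separation of parent and child memory can be checked directly. The following is a minimal sketch for a POSIX system such as Linux, not xv6 itself: the function name fork_keeps_memory_separate is hypothetical, and POSIX encodes the exit status, so the WIFEXITED and WEXITSTATUS macros are needed where xv6 would store the plain value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if an assignment made in the child leaves
 * the parent's copy of the variable untouched, 0 if not, -1 on error. */
int fork_keeps_memory_separate(void)
{
	int value = 1; /* fork gives the child its own copy of this */
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1; /* fork failed */
	if (pid == 0) {
		value = 42; /* changes only the child's copy */
		_exit(7); /* status is reported to the parent's wait */
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	/* POSIX packs the status into an int; xv6's wait stores it directly. */
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 7)
		return -1;
	return value == 1; /* the parent still sees its own value */
}
```

Calling fork_keeps_memory_separate() should return 1: the child set its copy to 42 and exited with status 7, while the parent's value stayed 1.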

The exec system call replaces the calling process’s memory with a new memory image loaded from a file stored in the file system.

ELF format:

The file must have a particular format, which specifies which part of the file holds instructions, which part is data, at which instruction to start, etc.

Xv6 uses the ELF format, which Chapter 3 discusses in more detail.

When exec succeeds, it does not return to the calling program; instead, the instructions loaded from the file start executing at the entry point declared in the ELF header.

Code explanation:

Exec takes two arguments: the name of the file containing the executable and an array of string arguments. For example:

char *argv[3];
argv[0] = "echo";
argv[1] = "hello";
argv[2] = 0;
exec("/bin/echo", argv); 
printf("exec error\n");

This fragment replaces the calling program with an instance of the program /bin/echo running with the argument list echo hello.
Most programs ignore the first element of the argument array, which is conventionally the name of the program.
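The rule that exec returns only on failure is easiest to see when exec runs in a child and the parent collects the result. This is a sketch for a POSIX system, where execv plays the role of xv6's exec; the name run_and_wait and the exit code 127 for a failed exec are assumptions (127 is a shell convention, not something xv6 defines).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child and return its exit status,
 * 127 if exec failed in the child, or -1 if fork or wait failed. */
int run_and_wait(char *const argv[])
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execv(argv[0], argv); /* on success this call never returns */
		_exit(127); /* reached only if exec failed */
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With an argument list such as { "/bin/sh", "-c", "exit 3", 0 } the helper should return 3; with a path that does not exist, the line after execv runs and the helper returns 127.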

How does the shell work?

The xv6 shell uses the above calls to run programs on behalf of users.

The main structure of the shell is simple; see main (user/sh.c:145).
The main loop reads a line of input from the user with getcmd.
Then it calls fork, which creates a copy of the shell process.
The parent calls wait, while the child runs the command.

If the user had typed “echo hello” to the shell, runcmd would have been called with “echo hello” as the argument.

runcmd (user/sh.c:58) runs the actual command. For “echo hello”, it would call exec (user/sh.c:78).

If exec succeeds then the child will execute instructions from echo instead of runcmd.

At some point echo will call exit, which will cause the parent to return from wait in main (user/sh.c:145).

We will see later that the shell exploits the separation of fork and exec in its implementation of I/O redirection.

To avoid the wastefulness of creating a duplicate process and then immediately replacing it (with exec), operating system kernels optimize the implementation of fork for this use case by using virtual memory techniques such as copy-on-write (see Section 4.6).

Xv6 allocates most user-space memory implicitly:

fork allocates the memory required for the child’s copy of the parent’s memory.
exec allocates enough memory to hold the executable file.

A process that needs more memory at run-time (perhaps for malloc) can call sbrk(n) to grow its data memory by n bytes.

sbrk returns the location of the new memory.
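As a rough illustration of sbrk's return value, the sketch below uses the sbrk that glibc still provides on Linux (it is a legacy interface there, and the _DEFAULT_SOURCE macro is only needed to get its declaration). The name grow_heap is hypothetical and n is assumed to be positive.

```c
#define _DEFAULT_SOURCE /* asks glibc to declare sbrk */
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by n bytes (n > 0), touch the
 * new memory, and return 1 if the break moved by at least n, -1 on failure. */
int grow_heap(int n)
{
	char *start = sbrk(n); /* old break, i.e. the start of the new memory */

	if (start == (char *)-1)
		return -1;
	memset(start, 0xAB, (size_t)n); /* the new bytes are usable right away */
	char *end = sbrk(0); /* sbrk(0) reports the current break */
	return (uintptr_t)end >= (uintptr_t)start + (uintptr_t)n;
}
```

Calling grow_heap(4096) should return 1 on a system where the request succeeds.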

1.2 I/O and File descriptors

File descriptors

How to obtain File Descriptors

Features

Syscall read & write

program cat

Syscall close

Syscall open

Syscall dup

A file descriptor is a small integer representing a kernel-managed object that a process may read from or write to.

A process may obtain a file descriptor by opening a file, directory, or device, or by creating a pipe, or by duplicating an existing descriptor. For simplicity we’ll often refer to the object a file descriptor refers to as a “file”.

Abstract: The file descriptor interface abstracts away the differences between files, pipes, and devices, making them all look like streams of bytes. We’ll refer to input and output as I/O.
Private For Each Process: Internally, the xv6 kernel uses the file descriptor as an index into a per-process table, so that every process has a private space of file descriptors starting at zero.

By convention, a process reads from file descriptor 0 (standard input), writes output to file descriptor 1 (standard output), and writes error messages to file descriptor 2 (standard error).

As we will see, the shell exploits the convention to implement I/O redirection and pipelines.

The shell ensures that it always has three file descriptors open (user/sh.c:151), which are by default file descriptors for the console.

The read and write system calls read bytes from and write bytes to open files named by file descriptors.

Each file descriptor that refers to a file has an offset associated with it.

read: The call read(fd, buf, n) reads at most n bytes from the file descriptor fd, copies them into buf, and returns the number of bytes read.

Read reads data from the current file offset and then advances that offset by the number of bytes read:
a subsequent read will return the bytes following the ones returned by the first read.
When there are no more bytes to read, read returns zero to indicate the end of the file.

write: The call write(fd, buf, n) writes n bytes from buf to the file descriptor fd and returns the number of bytes written.

Fewer than n bytes are written only when an error occurs.

Like read, write writes data at the current file offset and then advances that offset by the number of bytes written: each write picks up where the previous one left off.
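The way the offset advances can be observed with two writes followed by two reads. This is a POSIX sketch rather than xv6 code: the name offsets_advance and the /tmp path template are assumptions, mkstemp stands in for open, and lseek (which xv6 lacks) is used only to rewind between the writes and the reads.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if successive writes and reads each pick
 * up where the previous call left off, 0 if not, -1 if setup failed. */
int offsets_advance(void)
{
	char path[] = "/tmp/offset-demo-XXXXXX"; /* assumed scratch location */
	char first[3], second[3], extra;
	int fd = mkstemp(path); /* creates and opens the file read/write */

	if (fd < 0)
		return -1;
	unlink(path); /* the file stays usable until fd is closed */
	int ok = write(fd, "abc", 3) == 3 /* offset 0 -> 3 */
	      && write(fd, "def", 3) == 3 /* continues at 3 -> 6 */
	      && lseek(fd, 0, SEEK_SET) == 0 /* rewind to the start */
	      && read(fd, first, 3) == 3 /* bytes 0..2 */
	      && read(fd, second, 3) == 3 /* bytes 3..5 */
	      && read(fd, &extra, 1) == 0 /* 0 signals end of file */
	      && memcmp(first, "abc", 3) == 0
	      && memcmp(second, "def", 3) == 0;
	close(fd);
	return ok;
}
```

The second read returns "def" rather than "abc" because the first read already moved the offset to 3, and the final read returns 0 because the offset has reached the end of the file.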

The use of file descriptors and the convention that file descriptor 0 is input and file descriptor 1 is output allows a simple implementation of cat.

Code explanation: cat

The following program fragment (which forms the essence of the program cat) copies data from its standard input to its standard output. If an error occurs, it writes a message to the standard error.

Linux cat: View content

char buf[512]; 
int n;

for(;;){
	n = read(0, buf, sizeof(buf)); 
	if(n == 0)
		break; 
	if(n < 0){
		fprintf(2, "read error\n");
		exit(1);
	} 
	if(write(1, buf, n) != n){
		fprintf(2, "write error\n");
		exit(1); 
	}
}

The important thing to note in the code fragment is that cat doesn’t know whether it is reading from a file, console, or a pipe.

Similarly cat doesn’t know whether it is printing to a console, a file, or whatever.

The close system call releases a file descriptor, making it free for reuse by a future open, pipe, or dup system call (see below).

A newly allocated file descriptor is always the lowest-numbered unused descriptor of the current process.
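The lowest-numbered rule can be checked without touching descriptors 0 to 2. The following sketch assumes a POSIX system with /dev/null available; the name lowest_fd_is_reused is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor freed by close is the one
 * handed out by the next open, 0 if not, -1 if setup failed. */
int lowest_fd_is_reused(void)
{
	int a = open("/dev/null", O_RDONLY); /* lowest free descriptor */
	int b = open("/dev/null", O_RDONLY); /* next lowest, so b > a */

	if (a < 0 || b < 0) {
		if (a >= 0)
			close(a);
		return -1;
	}
	close(a); /* a is now the lowest unused descriptor */
	int c = open("/dev/null", O_RDONLY); /* so open returns a again */
	int reused = (c == a);
	close(b);
	if (c >= 0)
		close(c);
	return reused;
}
```

Every descriptor below a was in use when a was allocated, so after close(a) the next open has to return a.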

The system call open is used to open or create a file for reading or writing.

The second argument to open consists of a set of flags, expressed as bits, that control what open does.

The possible values are defined in the file control (fcntl) header (kernel/fcntl.h:1-5): O_RDONLY, O_WRONLY, O_RDWR, O_CREATE, and O_TRUNC, which instruct open to open the file for reading, or for writing, or for both reading and writing, to create the file if it doesn’t exist, and to truncate the file to zero length.

The dup system call duplicates an existing file descriptor, returning a new one that refers to the same underlying I/O object.

Both file descriptors share an offset, just as the file descriptors duplicated by fork do. This is another way to write hello world into a file:

fd = dup(1); 
write(1, "hello ", 6); 
write(fd, "world\n", 6);

Otherwise file descriptors do not share offsets, even if they resulted from open calls for the same file.

Dup allows shells to implement commands like this: ls existing-file non-existing-file > tmp1 2>&1.
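The difference between a duplicated descriptor and a second open of the same file shows up in where the second write lands. This is a POSIX sketch; the function names, the helper contents_are, and the /tmp path template are all assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: rewind fd and compare the file's full contents with
 * `want`; returns 1 on a match. */
static int contents_are(int fd, const char *want)
{
	char buf[64];

	if (lseek(fd, 0, SEEK_SET) != 0)
		return 0;
	ssize_t n = read(fd, buf, sizeof(buf));
	return n == (ssize_t)strlen(want) && memcmp(buf, want, (size_t)n) == 0;
}

/* Descriptors related by dup share one offset: the writes are appended. */
int dup_shares_offset(void)
{
	char path[] = "/tmp/dup-demo-XXXXXX"; /* assumed scratch location */
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	int copy = dup(fd);
	int ok = copy >= 0
	      && write(fd, "hello ", 6) == 6
	      && write(copy, "world\n", 6) == 6 /* continues at offset 6 */
	      && contents_are(fd, "hello world\n");
	if (copy >= 0)
		close(copy);
	close(fd);
	unlink(path);
	return ok;
}

/* Separate opens have independent offsets: the second write starts at
 * offset 0 and overwrites the first. */
int separate_opens_overwrite(void)
{
	char path[] = "/tmp/dup-demo-XXXXXX"; /* assumed scratch location */
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	int other = open(path, O_WRONLY);
	int ok = other >= 0
	      && write(fd, "hello ", 6) == 6
	      && write(other, "world\n", 6) == 6 /* starts at offset 0 */
	      && contents_are(fd, "world\n");
	if (other >= 0)
		close(other);
	close(fd);
	unlink(path);
	return ok;
}
```

Both functions should return 1: the first leaves "hello world" in the file, the second leaves only "world" because the two descriptors never shared an offset.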

The 2>&1 tells the shell to give the command a file descriptor 2 that is a duplicate of descriptor 1.

Both the name of the existing file and the error message for the non-existing file will show up in the file tmp1.
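What a shell does for "> tmp1 2>&1" can be sketched with the same close, open, and dup calls. The version below is a POSIX approximation, not the xv6 shell's code: the name stderr_follows_stdout and the /tmp path are hypothetical, the child writes two fixed strings instead of running ls, and descriptors 0, 1, and 2 are assumed to be open on entry (as the shell guarantees for itself).

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if output written to descriptors 1 and 2
 * in the child ends up, in order, in the same file; -1 if setup failed. */
int stderr_follows_stdout(void)
{
	char path[] = "/tmp/redir-demo-XXXXXX"; /* assumed scratch location */
	char buf[64];
	int status = 0, ok = 0;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	pid_t pid = fork();
	if (pid == 0) {
		close(1);
		if (open(path, O_WRONLY) != 1) /* "> tmp1": 1 is the lowest free fd */
			_exit(1);
		close(2);
		if (dup(1) != 2) /* "2>&1": 2 is now the lowest free fd */
			_exit(1);
		if (write(1, "out\n", 4) != 4 || write(2, "err\n", 4) != 4)
			_exit(1);
		_exit(0);
	}
	if (pid > 0 && waitpid(pid, &status, 0) == pid
	    && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
		/* The shared offset put "err" after "out", not on top of it. */
		ssize_t n = read(fd, buf, sizeof(buf));
		ok = (n == 8 && memcmp(buf, "out\nerr\n", 8) == 0);
	}
	close(fd);
	unlink(path);
	return ok;
}
```

If descriptor 2 had been opened separately on the same file instead of duplicated, the second write would have started at offset 0 and overwritten the first.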

I/O Redirection

File descriptors are a powerful abstraction, because they hide the details of what they are connected to: a process writing to file descriptor 1 may be writing to a file, to a device like the console, or to a pipe.

File descriptors and fork interact to make I/O redirection easy to implement. (fd + fork ➡️ I/O redirection)

Fork copies the parent’s file descriptor table along with its memory, so that the child starts with exactly the same open files as the parent.
The system call exec replaces the calling process’s memory but preserves its file table.

This behavior allows the shell to implement I/O redirection by forking, reopening chosen file descriptors in the child, and then calling exec to run the new program.

Example: input redirection

Here is a simplified version of the code a shell runs for the command cat < input.txt:

char *argv[2];

argv[0] = "cat";
argv[1] = 0;
if(fork() == 0) {
	close(0); // the child closes fd 0
	open("input.txt", O_RDONLY); 
	exec("cat", argv); 
}

Code Explanation:

After the child closes file descriptor 0, open is guaranteed to use that file descriptor for the newly opened input.txt: 0 will be the smallest available file descriptor.
Cat then executes with file descriptor 0 (standard input) referring to input.txt. The parent process’s file descriptors are not changed by this sequence, since it modifies only the child’s descriptors.
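A testable variant of this fragment redirects both ends, roughly "cat < in > out", so the parent can inspect what cat produced. This is a POSIX sketch under several assumptions: the name cat_with_redirection and the /tmp paths are hypothetical, a cat program is expected on the PATH, and execvp replaces xv6's exec.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if cat, run with redirected descriptors,
 * copies the input file to the output file; -1 if setup failed. */
int cat_with_redirection(void)
{
	char in[] = "/tmp/redir-in-XXXXXX"; /* assumed scratch locations */
	char out[] = "/tmp/redir-out-XXXXXX";
	char *argv[] = { "cat", 0 };
	char buf[64];
	int status = 0, ok = 0;
	int infd = mkstemp(in);
	int outfd = infd >= 0 ? mkstemp(out) : -1;

	if (infd < 0)
		return -1;
	if (outfd >= 0 && write(infd, "hello\n", 6) == 6) {
		pid_t pid = fork();
		if (pid == 0) {
			close(0);
			if (open(in, O_RDONLY) != 0) /* 0 is the lowest free fd */
				_exit(126);
			close(1);
			if (open(out, O_WRONLY) != 1) /* then 1 is the lowest */
				_exit(126);
			execvp(argv[0], argv); /* cat inherits descriptors 0 and 1 */
			_exit(127);
		}
		if (pid > 0 && waitpid(pid, &status, 0) == pid
		    && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
			ssize_t n = read(outfd, buf, sizeof(buf));
			ok = (n == 6 && memcmp(buf, "hello\n", 6) == 0);
		}
	}
	close(infd);
	unlink(in);
	if (outfd >= 0) {
		close(outfd);
		unlink(out);
	}
	return ok;
}
```

The parent's own descriptors 0 and 1 are untouched throughout; only the child's table is changed before exec.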

File Descriptor between Parent and Child Process

Although fork copies the file descriptor table, each underlying file offset is shared between parent and child.

Consider this example:

if(fork() == 0) { 
	write(1, "hello ", 6); 
	exit(0); 
} else { 
	wait(0); 
	write(1, "world\n", 6); 
}

Code Explanation:

At the end of this fragment, the file attached to file descriptor 1 will contain the data hello world.
The write in the parent (which, thanks to wait, runs only after the child is done) picks up where the child’s write left off.
This behavior helps produce sequential output from sequences of shell commands, like (echo hello; echo world) >output.txt.
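The same experiment can be run against a regular file so that the result is visible afterwards. In this POSIX sketch the name fork_shares_offset and the /tmp path are hypothetical, and lseek is used only to rewind before reading the result back.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write in the parent continues where
 * the child's write stopped, 0 if not, -1 if setup failed. */
int fork_shares_offset(void)
{
	char path[] = "/tmp/fork-offset-XXXXXX"; /* assumed scratch location */
	char buf[64];
	int status = 0;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path); /* the file stays usable until fd is closed */
	pid_t pid = fork();
	if (pid < 0) {
		close(fd);
		return -1;
	}
	if (pid == 0) /* the child writes first, through the inherited fd */
		_exit(write(fd, "hello ", 6) == 6 ? 0 : 1);
	int ok = waitpid(pid, &status, 0) == pid
	      && WIFEXITED(status) && WEXITSTATUS(status) == 0
	      && write(fd, "world\n", 6) == 6 /* lands at offset 6 */
	      && lseek(fd, 0, SEEK_SET) == 0
	      && read(fd, buf, sizeof(buf)) == 12
	      && memcmp(buf, "hello world\n", 12) == 0;
	close(fd);
	return ok;
}
```

The parent never wrote before the fork, yet its write lands at offset 6, which shows that the offset lives in the shared underlying file object rather than in each process's descriptor table.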

I/O Redirection In Shell

The code for I/O redirection in the xv6 shell works in exactly this way (user/sh.c:82). Recall that at this point in the code the shell has already forked the child shell and that runcmd will call exec to load the new program.

Now it should be clear why it is helpful that fork and exec are separate calls: between the two, the shell has a chance to redirect the child’s I/O without disturbing the I/O setup of the main shell.

One could instead imagine a hypothetical combined forkexec system call, but the options for doing I/O redirection with such a call seem awkward:

The shell could modify its own I/O setup before calling forkexec (and then undo those modifications); or forkexec could take instructions for I/O redirection as arguments; or (least attractively) every program like cat could be taught to do its own I/O redirection.

File Descriptor

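Integer identifier: A file descriptor is usually a non-negative integer that the operating system associates internally with an open file or I/O device. The integer is unique within a single process.
File descriptor table: Every process has a file descriptor table, a data structure that records each file descriptor and the file or device it refers to. When a process opens a file or device, the operating system allocates a new entry in that process's table and returns the corresponding file descriptor.
Resource types: A file descriptor can refer to different kinds of resources, such as regular files, directories, pipes, sockets, and devices.
Standard file descriptors: Unix-like systems conventionally predefine three standard file descriptors for every process:
- Standard input (STDIN, file descriptor 0): usually connected to the keyboard.
- Standard output (STDOUT, file descriptor 1): usually connected to the display and used for normal output.
- Standard error (STDERR, file descriptor 2): usually connected to the display and used for error messages.
System calls: The operating system provides a set of system calls that let a process operate on file descriptors, for example to open, close, read, and write files or devices. Common ones are:
- open: opens a file or device and returns a file descriptor.
- read: reads data from the file or device that a file descriptor refers to.
- write: writes data to the file or device that a file descriptor refers to.
- close: closes a file descriptor and releases the associated resources.
- dup: duplicates a file descriptor, creating a new descriptor that refers to the same file or device.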

1.3 Pipes

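**Answer:

First, these Linux shell operators are all related to I/O redirection.
Of these, >> and > are output redirection, < is input redirection, and | is a pipe.** The > operator overwrites the target's existing contents: if the file exists, its old contents are discarded before the new output is written; otherwise the file is created. The >> operator appends to the target's existing contents: if the file exists, the new output is added at the end of the file and nothing is discarded; otherwise the file is created.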

Basic Concepts of pipe

A pipe is a small kernel buffer exposed to processes as a pair of file descriptors (fd), one for reading and one for writing.

Writing data to one end of the pipe makes that data available for reading from the other end of the pipe. Pipes provide a way for processes to communicate.

Example with `wc`

The following example code runs the program wc (word count) with standard input connected to the read end of a pipe.

int p[2];
char *argv[2];

argv[0] = "wc";
argv[1] = 0;

pipe(p);
// p[0] is the read side
// p[1] is the write side
if(fork() == 0) {
	// child process: make fd 0 the read side, then run wc
	close(0);
	dup(p[0]);
	close(p[0]);
	close(p[1]);
	exec("/bin/wc", argv);
} else {
	// parent process: write to the pipe, then close the write side
	close(p[0]);
	write(p[1], "hello world\n", 12);
	close(p[1]);
}

Code Explanation:

The program calls pipe, which creates a new pipe and records the read and write file descriptors in the array p.
After fork, both parent and child have file descriptors referring to the pipe.
The child calls close and dup to make file descriptor zero refer to the read end of the pipe, closes the file descriptors in p, and calls exec to run wc.
When wc reads from its standard input, it reads from the pipe.
The parent closes the read side of the pipe, writes to the pipe, and then closes the write side.
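Closing the unused ends matters because a read on a pipe reports end of file only once every write end has been closed. The sketch below reverses the roles (the child writes, the parent reads) so the result can be checked in the calling process; it targets POSIX, and the name pipe_delivers_then_eof is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent receives the child's bytes
 * through a pipe and then sees end of file; -1 if setup failed. */
int pipe_delivers_then_eof(void)
{
	int p[2];
	char buf[64];
	ssize_t n;
	size_t total = 0;

	if (pipe(p) < 0)
		return -1;
	pid_t pid = fork();
	if (pid < 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	if (pid == 0) {
		close(p[0]); /* the child only writes */
		int sent = write(p[1], "hello world\n", 12) == 12;
		close(p[1]);
		_exit(sent ? 0 : 1);
	}
	close(p[1]); /* without this, read below would never return 0 */
	while (total < sizeof(buf)
	       && (n = read(p[0], buf + total, sizeof(buf) - total)) > 0)
		total += (size_t)n;
	close(p[0]);
	waitpid(pid, 0, 0);
	return total == 12 && memcmp(buf, "hello world\n", 12) == 0;
}
```

The loop ends when read returns 0, which happens only after both the child and the parent have closed their copies of p[1].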

Code of pipe

The xv6 shell implements pipelines such as grep fork sh.c | wc -l in a manner similar to the above code (user/sh.c:100).

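The commands on the left and right of | are actually two child processes, and | serves as the pipe that carries the inter-process communication between them.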

  case PIPE:
      pcmd = (struct pipecmd*)cmd;
      if(pipe(p) < 0)
        panic("pipe");
      if(fork1() == 0){
        close(1);
        dup(p[1]);
        close(p[0]);
        close(p[1]);
        runcmd(pcmd->left);
      }
      if(fork1() == 0){
        close(0);
        dup(p[0]);
        close(p[0]);
        close(p[1]);
        runcmd(pcmd->right);
      }
      close(p[0]);
      close(p[1]);
      wait(0);
      wait(0);
      break;

Code Explanation:

The child process creates a pipe to connect the left end of the pipeline with the right end.
Then it calls fork and runcmd for the left end of the pipeline and fork and runcmd for the right end, and waits for both to finish.
The right end of the pipeline may be a command that itself includes a pipe (e.g., a | b | c), which itself forks two new child processes (one for b and one for c).
Thus, the shell may create a tree of processes. The leaves of this tree are commands and the interior nodes are processes that wait until the left and right children complete.
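The two-child structure can be tried outside xv6 with a short POSIX sketch of "echo hello world | wc -w". Everything named here is an assumption made for illustration: the function names, the extra capture pipe `out` (added only so the caller can read wc's answer), the use of execvp with echo and wc found on the PATH, and the expectation that descriptors 0 and 1 are open on entry.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Close all four pipe descriptors once the needed ones are duplicated. */
static void close_all(int p[2], int out[2])
{
	close(p[0]);
	close(p[1]);
	close(out[0]);
	close(out[1]);
}

/* Hypothetical helper: run "echo hello world | wc -w" and return the number
 * that wc prints, or -1 on error. */
int pipeline_word_count(void)
{
	char *left[] = { "echo", "hello", "world", 0 };
	char *right[] = { "wc", "-w", 0 };
	char buf[32];
	int p[2], out[2];

	if (pipe(p) < 0)
		return -1;
	if (pipe(out) < 0) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	pid_t l = fork();
	if (l == 0) { /* left end: fd 1 becomes the pipe's write side */
		close(1);
		if (dup(p[1]) != 1)
			_exit(126);
		close_all(p, out);
		execvp(left[0], left);
		_exit(127);
	}
	pid_t r = fork();
	if (r == 0) { /* right end: fd 0 becomes the pipe's read side */
		close(0);
		if (dup(p[0]) != 0)
			_exit(126);
		close(1);
		if (dup(out[1]) != 1) /* capture wc's output for the caller */
			_exit(126);
		close_all(p, out);
		execvp(right[0], right);
		_exit(127);
	}
	close(p[0]); /* the interior process keeps no pipe ends of its own */
	close(p[1]);
	close(out[1]);
	ssize_t n = read(out[0], buf, sizeof(buf) - 1);
	close(out[0]);
	if (l > 0)
		waitpid(l, 0, 0);
	if (r > 0)
		waitpid(r, 0, 0);
	if (n <= 0)
		return -1;
	buf[n] = '\0';
	return atoi(buf);
}
```

As in sh.c, the process that created the pipe closes both ends before waiting; if it kept p[1] open, wc would never see end of file.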

Another Implementation of pipe

In principle, one could have the interior nodes run the left end of a pipeline, but doing so correctly would complicate the implementation.

Consider making just the following modification.

Change sh.c to not fork for p->left and run runcmd(p->left) in the interior process.
Then, for example, echo hi | wc won’t produce output, because when echo hi exits in runcmd, the interior process exits and never calls fork to run the right end of the pipe. This incorrect behavior could be fixed by not calling exit in runcmd for interior processes, but this fix complicates the code: now runcmd needs to know if it’s in an interior process or not.
Complications also arise when not forking for runcmd(p->right). For example, with just that modification, sleep 10 | echo hi will immediately print “hi” and a new prompt, instead of after 10 seconds; this happens because echo runs immediately and exits, not waiting for sleep to finish. Since the goal of sh.c is to be as simple as possible, it doesn’t try to avoid creating interior processes.

Pipe VS Temporary File

Pipes may seem no more powerful than temporary files: the pipeline echo hello world | wc could be implemented without pipes as echo hello world >/tmp/xyz; wc </tmp/xyz.

Pipes have at least four advantages over temporary files in this situation:

Self-Cleaning: First, pipes automatically clean themselves up. With the file redirection, a shell would have to be careful to remove /tmp/xyz when done.
Arbitrarily Long Streams: Second, pipes can pass arbitrarily long streams of data, while file redirection requires enough free space on disk to store all the data.
Parallel Execution: Third, pipes allow for parallel execution of pipeline stages, while the file approach requires the first program to finish before the second starts.
Inter-process Communication: Fourth, if you are implementing inter-process communication, pipes’ blocking reads and writes are more efficient than the non-blocking semantics of files.

1.4 File system

1.4.1 Basic Concepts

File: The xv6 file system provides data files, which contain uninterpreted byte arrays.

Directory: Directories contain named references to data files and other directories. The directories form a tree, starting at a special directory called the root.

Inode: A file’s name is distinct from the file itself; the same underlying file, called an inode, can have multiple names, called links.

Each link consists of an entry in a directory; the entry contains a file name and a reference to an inode.
An inode holds metadata about a file, including its type (file or directory or device), its length, the location of the file’s content on disk, and the number of links to the file.

Absolute Path: A path like /a/b/c refers to the file or directory named c inside the directory named b inside the directory named a in the root directory /.

Relative Path: Paths that don’t begin with / are evaluated relative to the calling process’s current directory, which can be changed with the chdir system call.

Both these code fragments open the same file (assuming all the directories involved exist):

chdir("/a"); 
chdir("b"); 
open("c", O_RDONLY);

open("/a/b/c", O_RDONLY);

Code Explanation:

The first fragment changes the process’s current directory to /a/b.
The second neither refers to nor changes the process’s current directory.
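That both paths reach the same file can be confirmed by comparing inode numbers. The following POSIX sketch builds a scratch tree under /tmp (an assumed location) in place of /a/b/c; the name same_file_either_way is hypothetical, O_CREAT is the POSIX spelling of xv6's O_CREATE, and fchdir is used to restore the caller's directory afterwards.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a file opened by absolute path and by
 * relative path (after chdir) is the same inode; -1 if setup failed. */
int same_file_either_way(void)
{
	char root[] = "/tmp/path-demo-XXXXXX"; /* assumed scratch location */
	char dir[64], file[64];
	struct stat abs_st, rel_st;
	int ok = 0;

	if (mkdtemp(root) == NULL)
		return -1;
	snprintf(dir, sizeof(dir), "%s/b", root);
	snprintf(file, sizeof(file), "%s/b/c", root);
	int home = open(".", O_RDONLY); /* remember the starting directory */
	int made = mkdir(dir, 0700) == 0;
	int fd = made ? open(file, O_CREAT | O_WRONLY, 0600) : -1; /* create c */
	if (home >= 0 && fd >= 0) {
		int abs_fd = open(file, O_RDONLY); /* absolute path */
		int rel_fd = -1;
		if (chdir(root) == 0 && chdir("b") == 0) /* cwd is now root/b */
			rel_fd = open("c", O_RDONLY); /* relative path */
		ok = abs_fd >= 0 && rel_fd >= 0
		  && fstat(abs_fd, &abs_st) == 0 && fstat(rel_fd, &rel_st) == 0
		  && abs_st.st_dev == rel_st.st_dev
		  && abs_st.st_ino == rel_st.st_ino; /* same inode, same file */
		if (abs_fd >= 0)
			close(abs_fd);
		if (rel_fd >= 0)
			close(rel_fd);
		if (fchdir(home) != 0) /* go back so the caller is unaffected */
			ok = 0;
	}
	if (fd >= 0)
		close(fd);
	if (home >= 0)
		close(home);
	unlink(file);
	rmdir(dir);
	rmdir(root);
	return ok;
}
```

Note that the relative open only works because the two chdir calls changed the process's current directory first; the absolute open would have worked from anywhere.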

1.4.2 Syscalls for creations

There are system calls to create new files and directories:

mkdir creates a new directory,
open with the O_CREATE flag creates a new data file,
and mknod creates a new device file.

This example illustrates all three:

mkdir("/dir"); 
fd = open("/dir/file", O_CREATE|O_WRONLY); // create the file if it does not exist, open it for writing
close(fd); 
mknod("/console", 1, 1);

syscall `mknod`

Mknod creates a special file that refers to a device. Associated with a device file are the major and minor device numbers (the two arguments to mknod), which uniquely identify a kernel device.

When a process later opens a device file, the kernel diverts read and write system calls to the kernel device implementation instead of passing them to the file system.

syscall `fstat`

The fstat system call retrieves information from the inode that a file descriptor refers to. It fills in a struct stat, defined in stat.h (kernel/stat.h) as:

#define T_DIR 1     // Directory
#define T_FILE 2    // File
#define T_DEVICE 3  // Device

struct stat { 
	int dev;     // File system’s disk device 
	uint ino;    // Inode number 
	short type;  // Type of file 
	short nlink; // Number of links to file 
	uint64 size; // Size of file in bytes 
};

syscall `link`

The link system call creates another file system name referring to the same inode as an existing file.

This fragment creates a new file named both a and b.

open("a", O_CREATE|O_WRONLY); 
link("a", "b");

Reading from or writing to a is the same as reading from or writing to b. Each inode is identified by a unique inode number.

After the code sequence above, it is possible to determine that a and b refer to the same underlying contents by inspecting the result of fstat: both will return the same inode number (ino), and the nlink count will be set to 2.
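The inode check can be written out as a short POSIX sketch. The field names differ from xv6's struct stat (st_ino and st_nlink rather than ino and nlink), the name link_shares_inode and the /tmp directory are hypothetical, and stat is used for the second name so that b never needs to be opened.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if two names created with link report the
 * same inode number and a link count of 2; -1 if setup failed. */
int link_shares_inode(void)
{
	char dir[] = "/tmp/link-demo-XXXXXX"; /* assumed scratch location */
	char a[64], b[64];
	struct stat sa, sb;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(a, sizeof(a), "%s/a", dir);
	snprintf(b, sizeof(b), "%s/b", dir);
	int fd = open(a, O_CREAT | O_WRONLY, 0600); /* create the file as a */
	int ok = fd >= 0
	      && link(a, b) == 0 /* b is a second name for the same inode */
	      && fstat(fd, &sa) == 0 /* looked up through the descriptor */
	      && stat(b, &sb) == 0 /* looked up through the second name */
	      && sa.st_ino == sb.st_ino
	      && sa.st_nlink == 2 && sb.st_nlink == 2;
	if (fd >= 0)
		close(fd);
	unlink(a);
	unlink(b);
	rmdir(dir);
	return ok;
}
```

Both lookups describe one inode, so they agree on the inode number and both report two links.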

syscall `unlink`

The unlink system call removes a name from the file system.

The file’s inode and the disk space holding its content are only freed when the file’s link count is zero and no file descriptors refer to it.

Thus adding unlink("a"); to the last code sequence leaves the inode and file content accessible as b.

Furthermore, the following sequence is an idiomatic way to create a temporary inode with no name that will be cleaned up when the process closes fd or exits:

fd = open("/tmp/xyz", O_CREATE|O_RDWR); 
unlink("/tmp/xyz");
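That the unnamed inode stays usable through fd can be checked with a POSIX sketch. Here mkstemp stands in for open with O_CREATE|O_RDWR, the name unnamed_temp_file and the path template are hypothetical, and lseek is used only to rewind before reading back.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a file remains readable and writable
 * through its descriptor after its only name is removed; -1 on setup error. */
int unnamed_temp_file(void)
{
	char path[] = "/tmp/unlink-demo-XXXXXX"; /* assumed scratch location */
	char buf[8];
	int fd = mkstemp(path); /* create the file and open it read/write */

	if (fd < 0)
		return -1;
	int ok = unlink(path) == 0 /* remove the only name */
	      && access(path, F_OK) != 0 /* the name is really gone */
	      && write(fd, "data", 4) == 4 /* the inode is still usable */
	      && lseek(fd, 0, SEEK_SET) == 0
	      && read(fd, buf, sizeof(buf)) == 4
	      && memcmp(buf, "data", 4) == 0;
	close(fd); /* last reference: the inode can be freed now */
	return ok;
}
```

Nothing has to be cleaned up explicitly: once fd is closed, or the process exits, no name and no descriptor refers to the inode.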

1.4.3 File Utilities In Unix

Unix provides file utilities callable from the shell as user-level programs, for example mkdir, ln, and rm. This design allows anyone to extend the command-line interface by adding new user-level programs. In hindsight this plan seems obvious, but other systems designed at the time of Unix often built such commands into the shell (and built the shell into the kernel).

One exception is cd, which is built into the shell (user/sh.c:160).

cd must change the current working directory of the shell itself. If cd were run as a regular command, then the shell would fork a child process, the child process would run cd, and cd would change the child’s working directory. The parent’s (i.e., the shell’s) working directory would not change.
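The reason cd cannot be an external program can be shown in a few lines. In this POSIX sketch the child plays the role of a forked cd command; the name child_cd_leaves_parent_alone is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a chdir performed in a child leaves the
 * parent's working directory unchanged, 0 if not, -1 on error. */
int child_cd_leaves_parent_alone(void)
{
	char before[4096], after[4096];
	int status = 0;

	if (getcwd(before, sizeof(before)) == NULL)
		return -1;
	pid_t pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) /* what a cd run as a regular command would do */
		_exit(chdir("/") == 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) != pid
	    || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	if (getcwd(after, sizeof(after)) == NULL)
		return -1;
	return strcmp(before, after) == 0; /* the parent did not move */
}
```

The child's chdir succeeds, but the current directory is per-process state, so the change disappears when the child exits.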

1.5 Real world

Shell and Scripting Language

Unix’s combination of “standard” file descriptors, pipes, and convenient shell syntax for operations on them was a major advance in writing general-purpose reusable programs. The idea sparked a culture of “software tools” that was responsible for much of Unix’s power and popularity, and the shell was the first so-called “scripting language”.

Unix System Call Interface in xv6 and Modern Kernels

The Unix system call interface persists today in systems like BSD, Linux, and macOS. The Unix system call interface has been standardized through the Portable Operating System Interface (POSIX) standard.

Xv6 is not POSIX compliant: it is missing many system calls (including basic ones such as lseek), and many of the system calls it does provide differ from the standard.

Our main goals for xv6 are simplicity and clarity while providing a simple UNIX-like system-call interface. Several people have extended xv6 with a few more system calls and a simple C library in order to run basic Unix programs.

Modern kernels, however, provide many more system calls, and many more kinds of kernel services, than xv6. For example, they support networking, windowing systems, user-level threads, drivers for many devices, and so on. Modern kernels evolve continuously and rapidly, and offer many features beyond POSIX.

Everything Is A File

Unix unified access to multiple types of resources (files, directories, and devices) with a single set of file-name and file-descriptor interfaces. This idea can be extended to more kinds of resources; a good example is Plan 9, which applied the “resources are files” concept to networks, graphics, and more. However, most Unix-derived operating systems have not followed this route.

Other Design of File System

The file system and file descriptors have been powerful abstractions. Even so, there are other models for operating system interfaces. Multics, a predecessor of Unix, abstracted file storage in a way that made it look like memory, producing a very different flavor of interface. The complexity of the Multics design had a direct influence on the designers of Unix, who tried to build something simpler.

What xv6 Is About

Xv6 does not provide a notion of users or of protecting one user from another; in Unix terms, all xv6 processes run as root.

This book examines how xv6 implements its Unix-like interface, but the ideas and concepts apply to more than just Unix. Any operating system must multiplex processes onto the underlying hardware, isolate processes from each other, and provide mechanisms for controlled inter-process communication. After studying xv6, you should be able to look at other, more complex operating systems and see the concepts underlying xv6 in those systems as well.
