Chapter 1. Introduction to the Linux Kernel
After three decades of use, the Unix operating system is still regarded as one of the most powerful and elegant systems in existence. Since the creation of Unix in 1969, the brainchild of Dennis Ritchie and Ken Thompson has become a creature of legends, a system whose design has withstood the test of time with few bruises to its name.
Unix grew out of Multics, a failed multiuser operating system project in which Bell Laboratories was involved. With the Multics project terminated, members of Bell Laboratories' Computer Sciences Research Center were left without a capable interactive operating system. In the summer of 1969, Bell Labs programmers sketched out a file system design that ultimately evolved into Unix. To test the design, Thompson implemented the new system on an otherwise idle PDP-7. In 1971, Unix was ported to the PDP-11, and in 1973, the operating system was rewritten in C, an unprecedented step at the time, but one that paved the way for future portability. The first Unix widely used outside of Bell Labs was Unix System, Sixth Edition, more commonly called V6.
Other companies ported Unix to new machines. Accompanying these ports were enhancements that resulted in several variants of the operating system. In 1977, Bell Labs released a combination of these variants into a single system, Unix System III; in 1982, AT&T released System V[1].
[1] What about System IV? The rumor is it was an internal development version.
The simplicity of Unix's design, coupled with the fact that it was distributed with source code, led to further development at outside organizations. The most influential of these contributors was the University of California at Berkeley. Variants of Unix from Berkeley are called Berkeley Software Distributions (BSD). The first Berkeley Unix was 3BSD in 1979. A series of 4BSD releases (4.0BSD, 4.1BSD, 4.2BSD, and 4.3BSD) followed 3BSD. These versions of Unix added virtual memory, demand paging, and TCP/IP. In 1993, the final official Berkeley Unix, featuring a rewritten VM, was released as 4.4BSD. Today, development of BSD continues with the Darwin, DragonFly BSD, FreeBSD, NetBSD, and OpenBSD systems.
In the 1980s and 1990s, multiple workstation and server companies introduced their own commercial versions of Unix. These systems were typically based on either an AT&T or Berkeley release and supported high-end features developed for their particular hardware architecture. Among these systems were Digital's Tru64, Hewlett Packard's HP-UX, IBM's AIX, Sequent's DYNIX/ptx, SGI's IRIX, and Sun's Solaris.
The original elegant design of the Unix system, along with the years of innovation and evolutionary improvement that followed, has made Unix a powerful, robust, and stable operating system. A handful of characteristics of Unix are responsible for its resilience. First, Unix is simple: whereas some operating systems implement thousands of system calls and have unclear design goals, Unix systems typically implement only hundreds of system calls and have a very clear design. Next, in Unix, everything is a file[2]. This simplifies the manipulation of data and devices into a set of simple system calls: open(), read(), write(), ioctl(), and close(). In addition, the Unix kernel and related system utilities are written in C, a property that gives Unix its amazing portability and accessibility to a wide range of developers. Next, Unix has fast process creation time and the unique fork() system call. This encourages strongly partitioned systems without gargantuan multi-threaded monstrosities. Finally, Unix provides simple yet robust interprocess communication (IPC) primitives that, when coupled with the fast process creation time, allow for the creation of simple utilities that do one thing and do it well, and that can be strung together to accomplish more complicated tasks.
[2] Well, okay, not everything, but much is represented as a file. Modern operating systems, such as Unix's successor at Bell Labs, Plan 9, implement nearly everything as a file.
Today, Unix is a modern operating system supporting multitasking, multithreading, virtual memory, demand paging, shared libraries with demand loading, and TCP/IP networking. Many Unix variants scale to hundreds of processors, whereas other Unix systems run on small, embedded devices. Although Unix is no longer a research project, Unix systems continue to benefit from advances in operating system design while they remain practical and general-purpose operating systems.
Unix owes its success to the simplicity and elegance of its design. Its strength today lies in the early decisions that Dennis Ritchie, Ken Thompson, and other early developers made: choices that have endowed Unix with the capability to evolve without compromising itself.
Along Came Linus: Introduction to Linux
[3] I will leave the free versus open debate to you. See http://www.fsf.org and http://www.opensource.org.
[4] You should probably read the GNU GPL version 2.0 if you have not. There is a copy in the file COPYING in your kernel source tree. You can also find it online at http://www.fsf.org.
Overview of Operating Systems and Kernels
Because of the ever-growing feature set and poor design of some modern commercial operating systems, the notion of what precisely defines an operating system is vague. Many users consider whatever they see on the screen to be the operating system. Technically speaking, and in this book, the operating system is considered the parts of the system responsible for basic use and administration. This includes the kernel and device drivers, boot loader, command shell or other user interface, and basic file and system utilities. It is the stuff you need, not a web browser or music player. The term system, in turn, refers to the operating system and all the applications running on top of it.
Of course, the topic of this book is the kernel. Whereas the user interface is the outermost portion of the operating system, the kernel is the innermost. It is the core internals: the software that provides basic services for all other parts of the system, manages hardware, and distributes system resources. The kernel is sometimes referred to as the supervisor, core, or internals of the operating system. Typical components of a kernel are interrupt handlers to service interrupt requests, a scheduler to share processor time among multiple processes, a memory management system to manage process address spaces, and system services such as networking and interprocess communication.
On modern systems with protected memory management units, the kernel typically resides in an elevated system state compared to normal user applications. This includes a protected memory space and full access to the hardware. This system state and memory space are collectively referred to as kernel-space. Conversely, user applications execute in user-space. They see a subset of the machine's available resources and are unable to perform certain system functions, directly access hardware, or otherwise misbehave (without consequences, such as their death, anyhow).
When executing the kernel, the system is in kernel-space executing in kernel mode, as opposed to normal user execution in user-space executing in user mode. Applications running on the system communicate with the kernel via system calls (see Figure 1.1). An application typically calls functions in a library (for example, the C library) that in turn rely on the system call interface to instruct the kernel to carry out tasks on their behalf. Some library calls provide many features not found in the system call, and thus, calling into the kernel is just one step in an otherwise large function. For example, consider the familiar printf() function. It provides formatting and buffering of the data and only eventually calls write() to write the data to the console. Conversely, some library calls have a one-to-one relationship with the kernel. For example, the open() library function does nothing except call the open() system call. Still other C library functions, such as strcpy(), should (you hope) make no use of the kernel at all. When an application executes a system call, it is said that the kernel is executing on behalf of the application. Furthermore, the application is said to be executing a system call in kernel-space, and the kernel is running in process context. This relationship, in which applications call into the kernel via the system call interface, is the fundamental manner in which applications get work done.
Figure 1.1. Relationship between applications, the kernel, and hardware.
The kernel also manages the system's hardware. Nearly all architectures, including all systems that Linux supports, provide the concept of interrupts. When hardware wants to communicate with the system, it issues an interrupt that asynchronously interrupts the kernel. Interrupts are identified by a number. The kernel uses the number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number being issued and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data. To provide synchronization, the kernel can usually disable interrupts, either all interrupts or just one specific interrupt number. In many operating systems, including Linux, the interrupt handlers do not run in a process context. Instead, they run in a special interrupt context that is not associated with any process. This special context exists solely to let an interrupt handler quickly respond to an interrupt, and then exit.
These contexts represent the breadth of the kernel's activities. In fact, in Linux, we can generalize that each processor is doing one of three things at any given moment:
- In kernel-space, in process context, executing on behalf of a specific process
- In kernel-space, in interrupt context, not associated with a process, handling an interrupt
- In user-space, executing user code in a process