Linux kernel architecture

Original source: "Introduction", The Linux Kernel documentation (linux-kernel-labs.github.io)

Contents

  • Linux kernel architecture
    • Typical operating system architecture
      • Monolithic kernel
      • Micro kernel
      • Micro-kernels vs monolithic kernels
    • Architecture
    • Device drivers
    • Process management
    • Memory management
    • Block I/O management
    • Virtual Filesystem Switch
    • Networking stack
    • Linux Security Modules
    • Notes
      • Kernel stack


Typical operating system architecture

In the typical operating system architecture (see the figure below), the operating system kernel is responsible for accessing the hardware and sharing it among multiple applications in a secure and fair manner.

../_images/ditaa-48374873962ca32ada36c14ab9a83b60f112a1e0.png

The kernel offers a set of APIs that applications invoke, generally referred to as "System Calls". These APIs differ from regular library APIs because they are the boundary at which the execution mode switches from user mode to kernel mode.
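The boundary can be seen from user space with a minimal sketch, assuming a Linux system with Python 3. os.pipe(), os.write() and os.read() are thin wrappers that end up issuing system calls, whereas bytes.upper() is an ordinary library routine that runs entirely in user mode. The function name roundtrip_through_kernel is made up for this illustration.

```python
import os


def roundtrip_through_kernel(payload: bytes) -> bytes:
    """Send bytes through a kernel pipe buffer and read them back."""
    read_fd, write_fd = os.pipe()      # the kernel creates the pipe
    try:
        os.write(write_fd, payload)    # write(2): user mode -> kernel mode
        return os.read(read_fd, len(payload))  # read(2): another mode switch
    finally:
        os.close(read_fd)
        os.close(write_fd)


# Pure user-mode work: no system call is needed to upper-case the bytes.
shouted = roundtrip_through_kernel(b"hello").upper()
```

The payload is kept small on purpose: a write larger than the pipe buffer would block because nothing is reading concurrently.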

To preserve application compatibility, system calls are rarely changed. Linux is particularly strict about this (as opposed to in-kernel APIs, which can change as needed).

The kernel code itself can be logically separated into core kernel code and device driver code. Device driver code is responsible for accessing particular devices, while the core kernel code is generic. The core kernel can be further divided into multiple logical subsystems (e.g. file access, networking, process management, etc.).

Monolithic kernel

A monolithic kernel is one where there is no access protection between the various kernel subsystems and where public functions can be directly called between various subsystems.

../_images/ditaa-3dc899167df5e16a230c434cf5d6964cb5868482.png

However, most monolithic kernels do enforce a logical separation between subsystems, especially between the core kernel and device drivers, with relatively strict APIs (though not necessarily set in stone) that must be used to access the services offered by a subsystem or a device driver. This, of course, depends on the particular kernel implementation and the kernel's architecture.

Micro kernel

A micro-kernel is one where large parts of the kernel are protected from each other, usually by running as services in user space. Because significant parts of the kernel now run in user mode, the remaining code that runs in kernel mode is significantly smaller, hence the term micro-kernel.

../_images/ditaa-c8a3d93d0109b7be6f608871d16adff4aaa933da.png

In a micro-kernel architecture the kernel contains just enough code to allow message passing between different running processes. In practice, that means implementing the scheduler and an IPC mechanism in the kernel, as well as basic memory management to set up the protection between applications and services.

One of the advantages of this architecture is that the services are isolated and hence bugs in one service won't impact other services.

As such, if a service crashes we can just restart it without affecting the whole system. In practice, however, this is difficult to achieve, since restarting a service may affect all applications that depend on it (e.g. if the file server crashes, all applications with open file descriptors would encounter errors when accessing them).

This architecture imposes a modular approach on the kernel and offers memory protection between services, but at the cost of performance. What is a simple function call between two services in a monolithic kernel now requires going through IPC and scheduling, which incurs a performance penalty [2].

[2] https://lwn.net/Articles/220255/
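The difference in call pattern can be sketched as follows. This is only an illustration under loud assumptions: a thread and queue.Queue stand in for a user-space service and the kernel's IPC mechanism, so there is no real memory isolation here, and file_size_service with its lookup table is invented for the example.

```python
import queue
import threading


def file_size_service(name: str) -> int:
    """Stand-in for a kernel service; the lookup table is made up."""
    return {"a.txt": 120, "b.txt": 64}.get(name, -1)


def monolithic_call(name: str) -> int:
    # Monolithic style: a plain function call into the other subsystem.
    return file_size_service(name)


def start_service(requests: queue.Queue) -> threading.Thread:
    """Run the service in its own thread, answering one message at a time."""
    def loop() -> None:
        while True:
            msg = requests.get()
            if msg is None:          # shutdown message
                break
            name, reply = msg
            reply.put(file_size_service(name))

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker


def microkernel_call(requests: queue.Queue, name: str) -> int:
    # Micro-kernel style: send a message, then block until the reply arrives.
    reply: queue.Queue = queue.Queue(maxsize=1)
    requests.put((name, reply))
    return reply.get(timeout=5)
```

Both paths return the same answer, but the second one needs two queue operations and a hand-off to another thread per request, which mirrors the IPC and scheduling cost described above.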

Micro-kernels vs monolithic kernels

Advocates of micro-kernels often suggest that micro-kernels are superior because of the modular design a micro-kernel enforces. However, monolithic kernels can also be modular, and there are several approaches that modern monolithic kernels use toward this goal:

  • Components can be enabled or disabled at compile time
  • Support of loadable kernel modules (at runtime)
  • Organize the kernel in logical, independent subsystems
  • Strict interfaces but with low performance overhead: macros, inline functions, function pointers

There is a class of operating systems that claim (or used to claim) to be hybrid kernels, in between monolithic and micro-kernels (e.g. Windows, Mac OS X). However, since all of the typical monolithic services run in kernel mode in these operating systems, there is little merit in classifying them as anything other than monolithic kernels.

Many operating system and kernel experts have dismissed the label as meaningless and as mere marketing. Linus Torvalds said of this issue:

"As to the whole 'hybrid kernel' thing - it's just marketing. It's 'oh, those microkernels had good PR, how can we try to get good PR for our working kernel? Oh, I know, let's use a cool name and try to imply that it has all the PR advantages that that other system has'."


Linux kernel architecture

../_images/ditaa-b9ffae65be16d30be11b5eca188a7a143b1b8227.png


arch

  • Architecture specific code
  • May be further sub-divided into machine specific code
  • Interfacing with the boot loader and architecture specific initialization
  • Access to various hardware bits that are architecture or machine specific such as interrupt controller, SMP controllers, bus controllers, exceptions and interrupt setup, virtual memory handling
  • Architecture optimized functions (e.g. memcpy, string operations, etc.)

This part of the Linux kernel contains architecture specific code and may be further sub-divided into machine specific code for certain architectures (e.g. arm).

"Linux was first developed for 32-bit x86-based PCs (386 or higher). These days it also runs on (at least) the Compaq Alpha AXP, Sun SPARC and UltraSPARC, Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, IBM S/390, MIPS, HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64 and CRIS architectures."

It implements access to various hardware bits that are architecture or machine specific, such as the interrupt controller, SMP controllers, bus controllers, exception and interrupt setup, and virtual memory handling.

It also implements architecture optimized functions (e.g. memcpy, string operations, etc.).


Device drivers

The Linux kernel uses a unified device model whose purpose is to maintain internal data structures that reflect the state and structure of the system. Such information includes which devices are present, what their status is, which bus they are attached to, and which driver they are bound to. This information is essential for implementing system-wide power management, as well as device discovery and dynamic device removal.
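A toy model of this bookkeeping might look like the sketch below. It is not the kernel's implementation: the class names Bus, Device and Driver, the id_table field and the probe callback are chosen for illustration, and matching is reduced to a simple ID lookup.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Device:
    name: str
    device_id: str
    driver: Optional["Driver"] = None   # set once a driver binds


@dataclass
class Driver:
    name: str
    id_table: List[str]                 # device IDs this driver supports
    probe: Callable[[Device], bool]     # returns True if binding succeeded


@dataclass
class Bus:
    name: str
    devices: List[Device] = field(default_factory=list)
    drivers: List[Driver] = field(default_factory=list)

    def _try_bind(self, dev: Device, drv: Driver) -> None:
        if dev.driver is None and dev.device_id in drv.id_table and drv.probe(dev):
            dev.driver = drv

    def add_device(self, dev: Device) -> None:
        # Device discovery: offer the new device to every known driver.
        self.devices.append(dev)
        for drv in self.drivers:
            self._try_bind(dev, drv)

    def add_driver(self, drv: Driver) -> None:
        # Driver registration: offer every unbound device to the new driver.
        self.drivers.append(drv)
        for dev in self.devices:
            self._try_bind(dev, drv)

    def remove_device(self, dev: Device) -> None:
        # Dynamic removal: forget the device and drop its binding.
        self.devices.remove(dev)
        dev.driver = None
```

Because the bus tracks both lists, it does not matter whether the device or its driver shows up first; binding happens as soon as both are known.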

Each subsystem has its own specific driver interface that is tailored to the devices it represents in order to make it easier to write correct drivers and to reduce code duplication.

Linux supports one of the most diverse sets of device driver types. Some examples are: TTY, serial, SCSI, filesystem, Ethernet, USB, framebuffer, input, sound, etc.


Process management

Linux implements the standard Unix process management APIs such as fork(), exec(), wait(), as well as standard POSIX threads.
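As a small illustration of the fork() and wait() pair, here is a sketch that assumes a Unix-like system with Python 3, where os.fork() and os.waitpid() wrap the corresponding kernel services. The helper name run_child is made up, and exec() is left out to keep the example short.

```python
import os


def run_child(exit_code: int) -> int:
    """Fork a child that exits immediately and return its exit status."""
    pid = os.fork()                   # duplicate the calling process
    if pid == 0:
        # Child: leave at once, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    _, status = os.waitpid(pid, 0)    # parent: wait for that specific child
    if not os.WIFEXITED(status):
        raise RuntimeError("child did not exit normally")
    return os.WEXITSTATUS(status)
```

Calling run_child(7) returns 7 in the parent: the child's exit code travels back through the kernel, which keeps it until the parent collects it.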

However, Linux implements processes and threads quite differently from other kernels. There are no separate internal structures for processes and threads; instead there is a single struct task_struct that describes an abstract scheduling unit called a task.

A task has pointers to resources, such as the address space, file descriptors, IPC ids, etc. The resource pointers of tasks that belong to the same process point to the same resources, while those of tasks in different processes point to different resources.

This peculiarity, together with the clone() and unshare() system calls, allows for implementing new features such as namespaces.
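The sharing rule can be modelled in a few lines. This is a user-space sketch, not kernel code: the Task, AddressSpace and FileTable classes and the clone() helper are illustrative stand-ins, and only two of the real clone flags (CLONE_VM and CLONE_FILES, with their values from the Linux headers) are modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List

CLONE_VM = 0x00000100      # share the address space
CLONE_FILES = 0x00000400   # share the file descriptor table


@dataclass
class AddressSpace:
    mappings: List[str] = field(default_factory=list)


@dataclass
class FileTable:
    fds: Dict[int, str] = field(default_factory=dict)


@dataclass
class Task:
    pid: int
    mm: AddressSpace
    files: FileTable


def clone(parent: Task, new_pid: int, flags: int = 0) -> Task:
    """Create a task that shares or copies each resource depending on flags."""
    mm = parent.mm if flags & CLONE_VM else AddressSpace(list(parent.mm.mappings))
    files = parent.files if flags & CLONE_FILES else FileTable(dict(parent.files.fds))
    return Task(new_pid, mm, files)
```

With both flags set, the new task behaves like a thread of the same process: a file opened by one task is visible to the other. With no flags, the new task gets private copies, like a forked process.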

Namespaces are used together with control groups (cgroup) to implement operating system virtualization in Linux.

cgroup is a mechanism to organize processes hierarchically and distribute system resources along the hierarchy in a controlled and configurable manner.
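To make "distribute resources along the hierarchy" concrete, here is a toy model that assumes a purely weight-based policy: each group splits what it receives among its children in proportion to their weights. The Group class, the distribute() function and the group names are invented for this sketch and do not correspond to the kernel's cgroup interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Group:
    name: str
    weight: int = 100
    children: List["Group"] = field(default_factory=list)


def distribute(group: Group, amount: float) -> Dict[str, float]:
    """Split amount among leaf groups, proportionally to sibling weights."""
    if not group.children:
        return {group.name: amount}
    total = sum(child.weight for child in group.children)
    shares: Dict[str, float] = {}
    for child in group.children:
        shares.update(distribute(child, amount * child.weight / total))
    return shares


root = Group("root", children=[
    Group("system", weight=100),
    Group("user", weight=300, children=[Group("alice"), Group("bob")]),
])
shares = distribute(root, 100.0)
```

Here "system" ends up with a quarter of the resource, and the remaining three quarters are split evenly between "alice" and "bob", because limits set on a parent constrain everything below it.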


Memory management

Linux memory management is a complex subsystem that deals with:

  • Management of the physical memory: allocating and freeing memory
  • Management of the virtual memory: paging, swapping, demand paging, copy on write
  • User services: user address space management (e.g. mmap(), brk(), shared memory)
  • Kernel services: SL*B allocators, vmalloc
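The user-facing side of this subsystem can be exercised with a short sketch, assuming Python 3 on a system that supports anonymous mappings. mmap.mmap(-1, ...) asks the kernel for memory that is not backed by a file, and the helper name anonymous_mapping_demo is made up for the example.

```python
import mmap


def anonymous_mapping_demo(message: bytes) -> bytes:
    """Map one anonymous page, write to it, and read the bytes back."""
    if len(message) > mmap.PAGESIZE:
        raise ValueError("message does not fit in one page")
    with mmap.mmap(-1, mmap.PAGESIZE) as region:   # fd -1: anonymous memory
        region[:len(message)] = message            # first access to the page
        return bytes(region[:len(message)])
```

On Linux, the physical page behind such a mapping is typically allocated only when it is first touched, which is the demand paging mentioned in the list above.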


Block I/O management

The Linux Block I/O subsystem deals with reading and writing data from or to block devices: creating block I/O requests, transforming block I/O requests (e.g. for software RAID or LVM), merging and sorting the requests and scheduling them via various I/O schedulers to the block device drivers.

../_images/ditaa-0a96997f269a7a9cd0cdc9c9125f6e62e549be94.png
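The merging and sorting step can be sketched as below. This is a simplified model, not one of the kernel's actual I/O schedulers: a request is reduced to a (start_sector, sector_count) pair, and the function name sort_and_merge is illustrative.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (start_sector, sector_count)


def sort_and_merge(requests: List[Request]) -> List[Request]:
    """Sort requests by start sector and merge those that touch or overlap."""
    merged: List[Request] = []
    for start, count in sorted(requests):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # Contiguous with (or overlapping) the previous request: extend it.
            prev_start, prev_count = merged[-1]
            new_end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, count))
    return merged
```

For example, the four requests (100, 8), (0, 8), (8, 8) and (108, 4) collapse into two larger ones, (0, 16) and (100, 12), so the device driver sees fewer and more sequential operations.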


Virtual Filesystem Switch

The Linux Virtual Filesystem Switch implements common / generic filesystem code to reduce duplication in filesystem drivers. It introduces certain filesystem abstractions such as:

  • inode - describes the file on disk (attributes, location of data blocks on disk)
  • dentry - links an inode to a name
  • file - describes the properties of an opened file (e.g. file pointer)
  • superblock - describes the properties of a formatted filesystem (e.g. number of blocks, block size, location of root directory on disk, encryption, etc.)

../_images/ditaa-afa57a07e21b1b842554278abe30fea575278452.png
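The split between inode and dentry is visible from user space. The sketch below assumes Python 3 and a filesystem that supports hard links: it gives one file two names and checks that both names resolve to the same inode. The function name and file names are made up.

```python
import os
import tempfile


def hard_link_shares_inode() -> tuple:
    """Create a file plus a hard link and return (same_inode, link_count)."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first-name")
        second = os.path.join(tmp, "second-name")
        with open(first, "w") as f:
            f.write("data")
        os.link(first, second)      # a second name for the same inode
        a, b = os.stat(first), os.stat(second)
        return a.st_ino == b.st_ino, a.st_nlink
```

Two names, one inode: the names live in directory entries, while the attributes and data location live in the inode, whose link count is now 2.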

The Linux VFS also implements a complex caching mechanism which includes the following:

  • the inode cache - caches the file attributes and internal file metadata
  • the dentry cache - caches the directory hierarchy of a filesystem
  • the page cache - caches file data blocks in memory


Networking stack

../_images/ditaa-a2ded49c8b739635d6742479583443fb10ad120a.png


Linux Security Modules

  • Hooks to extend the default Linux security model
  • Used by several Linux security extensions:
    • Security-Enhanced Linux (SELinux)
    • AppArmor
    • Tomoyo
    • Smack
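The hook idea can be sketched as a chain of callbacks that are consulted before an operation proceeds. This is a user-space illustration only: HookChain, the hook signature and the deny_shadow_writes policy are invented, and the sketch assumes that any single hook may veto the access.

```python
import errno
from typing import Callable, List

# A hook receives (subject, object, action) and returns 0 to allow the
# access or a negative errno-style value to deny it.
Hook = Callable[[str, str, str], int]


class HookChain:
    def __init__(self) -> None:
        self._hooks: List[Hook] = []

    def register(self, hook: Hook) -> None:
        self._hooks.append(hook)

    def check(self, subject: str, obj: str, action: str) -> int:
        """Return 0 only if every registered hook allows the access."""
        for hook in self._hooks:
            result = hook(subject, obj, action)
            if result != 0:
                return result       # first denial wins
        return 0


def deny_shadow_writes(subject: str, obj: str, action: str) -> int:
    # Made-up policy used only for this illustration.
    return -errno.EACCES if obj == "/etc/shadow" and action == "write" else 0
```

With no hooks registered every access is allowed, so the chain only ever adds restrictions on top of the default model.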


Notes (from "Introduction", The Linux Kernel documentation, linux-kernel-labs.github.io)

Kernel stack

Each process has a kernel stack that is used to maintain the function call chain and the state of local variables while it executes in kernel mode as a result of a system call.

The kernel stack is small (4 KB to 12 KB), so kernel developers have to avoid allocating large structures on the stack and avoid recursive calls that are not properly bounded.
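One common way to respect such a limit is to replace recursion with a loop and an explicit worklist that lives on the heap. The sketch below shows the pattern in Python rather than kernel C. The max_depth function and the dictionary-based tree are illustrative, and the input is assumed to be acyclic.

```python
from typing import Dict, List


def max_depth(tree: Dict[str, List[str]], root: str) -> int:
    """Compute the depth of a tree iteratively, without recursive calls."""
    deepest = 0
    worklist = [(root, 1)]          # heap-allocated, grows as needed
    while worklist:
        node, depth = worklist.pop()
        deepest = max(deepest, depth)
        for child in tree.get(node, []):
            worklist.append((child, depth + 1))
    return deepest
```

The call stack stays one frame deep no matter how deep the tree is, so even a chain of tens of thousands of nodes is handled without exhausting the stack.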
