Linux namespace

killmice

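This article introduces namespaces, a Linux kernel feature that isolates and virtualizes system resources such as process IDs and network resources. Namespaces are the foundation of container technology and come in several kinds, including Mount, PID, and Network; the article explains how each kind works.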

Linux namespaces

From Wikipedia, the free encyclopedia

namespaces
Original author(s)	Al Viro
Developer(s)	Eric W. Biederman, Pavel Emelyanov, Al Viro, Cyrill Gorcunov et al.
Initial release	2002
Written in	C
Operating system	Linux
Type	System software
License	GPL and LGPL

Namespaces are a feature of the Linux kernel that isolates and virtualizes system resources of a collection of processes. Examples of resources that can be virtualized include process IDs, hostnames, user IDs, network access, interprocess communication, and filesystems. Namespaces are a fundamental aspect of containers on Linux.

Linux developers use the term namespace to refer both to namespace kinds and to specific instances of those kinds.

A Linux system is initialized with a single instance of each namespace type. After initialization, additional namespaces can be created or joined.

History

Linux namespaces were inspired by the more general namespace functionality used heavily throughout Plan 9 from Bell Labs.^[1]

Linux namespaces originated in 2002 in the 2.4.19 kernel with work on the mount namespace kind. Additional namespace kinds were added beginning in 2006^[2], and development has continued since.

Adequate support for containers was completed in kernel version 3.8 with the introduction of user namespaces.

Namespace kinds

As of kernel version 3.8, there are 6 kinds of namespaces. Namespace functionality is the same across all kinds: each process is associated with a namespace and can only see or use the resources associated with that namespace and, where applicable, its descendant namespaces. This way each process (or group of processes) can have a unique view of the resource. Which resource is isolated depends on the kind of namespace that has been created for a given process group.

Mount (mnt)

Mount namespaces control mount points. Upon creation the mounts from the current mount namespace are copied to the new namespace, but mount points created afterwards do not propagate between namespaces (using shared subtrees, it is possible to propagate mount points between namespaces^[3]).

The clone flag CLONE_NEWNS - short for "NEW NameSpace" - was used because the mount namespace kind was the first to be introduced. At the time nobody thought of other namespaces but the name has stuck for backwards compatibility.

Process ID (pid)

The PID namespace provides processes with a set of process IDs (PIDs) that is independent of other namespaces. PID namespaces are nested: a newly created process has a PID in each namespace from its current namespace up to the initial PID namespace. Hence the initial PID namespace can see all processes, albeit with different PIDs than those seen from within other namespaces.
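The nesting rule can be illustrated with a small toy model. This is only a sketch, not how the kernel stores PIDs: the PidNamespace class and the spawn function are hypothetical names invented for the illustration, and PIDs are simply handed out sequentially.

```python
import itertools


class PidNamespace:
    """Toy PID namespace: hands out PIDs 1, 2, 3, ... and remembers them."""

    def __init__(self, parent=None):
        self.parent = parent              # None marks the initial namespace
        self._counter = itertools.count(1)
        self.visible = {}                 # PID in this namespace -> process name

    def allocate(self, name):
        pid = next(self._counter)
        self.visible[pid] = name
        return pid


def spawn(name, namespace):
    """Register a process in `namespace` and in every ancestor namespace.

    Returns the PIDs innermost first, ending with the PID seen from the
    initial namespace.
    """
    pids = []
    ns = namespace
    while ns is not None:
        pids.append(ns.allocate(name))
        ns = ns.parent
    return pids


initial = PidNamespace()
spawn("init", initial)
spawn("sshd", initial)
container = PidNamespace(parent=initial)
print(spawn("container-init", container))  # PID inside the container, then outside
```

In this model the first process spawned in the child namespace gets PID 1 there while also receiving an ordinary PID in the initial namespace, which is why the initial namespace can list every process.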

The first process created in a PID namespace is assigned the process id number 1 and receives most of the same special treatment as the normal init process, most notably that orphaned processes within the namespace are attached to it. This also means that the termination of this PID 1 process will immediately terminate all processes in its PID namespace and any descendants.^[4]

Network (net)

Network namespaces virtualize the network stack. On creation a network namespace contains only a loopback interface.

Each network interface (physical or virtual) is present in exactly 1 namespace and can be moved between namespaces.

Each namespace will have a private set of IP addresses, its own routing table, socket listing, connection tracking table, firewall, and other network-related resources.

On its destruction, a network namespace will destroy any virtual interfaces within it and move any physical interfaces back to the initial network namespace.

Interprocess Communication (ipc)

IPC namespaces isolate processes from SysV style inter-process communication. This prevents processes in different IPC namespaces from using, for example, the SHM family of functions to establish a range of shared memory between the two processes. Instead each process will be able to use the same identifiers for a shared memory region and produce two such distinct regions.

UTS

UTS namespaces allow a single system to appear to have different host and domain names to different processes.

User ID (user)

User namespaces are a feature to provide both privilege isolation and user identification segregation across multiple sets of processes. With administrative assistance it is possible to build a container with seeming administrative rights without actually giving elevated privileges to user processes. Like the PID namespace, user namespaces are nested and each new user namespace is considered to be a child of the user namespace that created it.

A user namespace contains a mapping table converting user IDs from the container's point of view to the system's point of view. This allows, for example, the root user to have user ID 0 in the container while actually being treated as user ID 1,400,000 by the system for ownership checks. A similar table is used for group ID mappings and ownership checks.
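The mapping table can be sketched as follows. The sketch assumes the three-column layout used by /proc/<pid>/uid_map (first ID inside the namespace, first ID outside, length of the range); the function names parse_id_map and map_id are hypothetical, and returning None for an unmapped ID is a simplification chosen for this example.

```python
def parse_id_map(text):
    """Parse uid_map-style text into (inside_start, outside_start, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected three fields, got: {line!r}")
        entries.append(tuple(int(field) for field in fields))
    return entries


def map_id(inside_id, id_map):
    """Translate an ID seen inside the namespace to the ID the system uses.

    Returns None when no range covers `inside_id` (a simplification).
    """
    for inside_start, outside_start, count in id_map:
        if inside_start <= inside_id < inside_start + count:
            return outside_start + (inside_id - inside_start)
    return None


# Example table: container IDs 0..65535 map to system IDs starting at 1400000.
table = parse_id_map("0 1400000 65536\n")
print(map_id(0, table))     # container root as seen by the system
print(map_id(1000, table))  # an ordinary container user
```

With this example table, user ID 0 in the container corresponds to 1400000 on the system, matching the figure used in the text above.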

To facilitate privilege isolation of administrative actions, each namespace type is considered owned by a user namespace based on the active user namespace at the moment of creation. A user with administrative privileges in the appropriate user namespace will be allowed to perform administrative actions within that other namespace type. For example, if a process has administrative permission to change the IP address of a network interface, it may do so as long as the applicable network namespace is owned by a user namespace that either matches or is a child (direct or indirect) of the process' user namespace. Hence the initial user namespace has administrative control over all namespace types in the system.^[5]
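The ownership rule described above amounts to walking up the user namespace tree. The following is a toy model under stated assumptions: UserNamespace and may_administer are hypothetical names, and the model assumes the process already holds the relevant administrative capability in its own user namespace, so it checks only the ancestry condition.

```python
class UserNamespace:
    """Toy user namespace that only knows its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None marks the initial user namespace

    def is_same_or_descendant_of(self, other):
        ns = self
        while ns is not None:
            if ns is other:
                return True
            ns = ns.parent
        return False


def may_administer(process_userns, owner_userns):
    """True if a resource namespace owned by `owner_userns` can be administered
    by a privileged process living in `process_userns`."""
    return owner_userns.is_same_or_descendant_of(process_userns)


initial = UserNamespace("initial")
container = UserNamespace("container", parent=initial)
nested = UserNamespace("nested", parent=container)

print(may_administer(initial, nested))    # initial namespace controls everything
print(may_administer(nested, container))  # a child cannot administer its parent
```

The model reflects the last sentence of the paragraph: every user namespace descends from the initial one, so the initial user namespace passes the check for any owner.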

cgroup namespace

To prevent leaking the control group to which a process belongs, a new namespace type has been suggested^[6] and created to hide the actual control group a process is a member of. A process in such a namespace checking which control group any process is part of would see a path that is actually relative to the control group set at creation time, hiding its true control group position and identity.
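The relative view can be approximated with ordinary path arithmetic. This is a simplified sketch rather than the kernel's implementation: cgroup_view is a hypothetical helper, and the example paths such as /docker/abc are made up for illustration.

```python
import posixpath


def cgroup_view(actual_path, ns_root):
    """Return the cgroup path as it would appear from inside a cgroup
    namespace whose root was `ns_root` at creation time."""
    relative = posixpath.relpath(actual_path, ns_root)
    if relative == ".":
        return "/"            # the namespace root itself
    return "/" + relative     # paths outside the root start with "/.."


print(cgroup_view("/docker/abc", "/docker/abc"))      # the root of the view
print(cgroup_view("/docker/abc/app", "/docker/abc"))  # a child cgroup
print(cgroup_view("/docker/other", "/docker/abc"))    # a sibling outside the root
```

A process inside the namespace therefore never sees the /docker/abc prefix in this model; cgroups outside the root show up only as paths climbing through "..".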

Proposed namespaces

time namespace

syslog namespace

Implementation Details

For each process, the kernel exposes one symbolic link per namespace kind in /proc/<pid>/ns/. The inode number that a symlink points to is the same for every process in that namespace, so the inode number uniquely identifies the namespace.

Reading the symlink via readlink returns a string containing the namespace kind name and the inode number of the namespace.
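A minimal sketch of reading these links follows. It assumes the link target has the form kind:[inode], for example pid:[4026531836] (the number here is only an illustrative value), and the helper names parse_ns_link, namespaces_of and same_namespace are hypothetical. The functions that touch /proc only work on Linux.

```python
import os
import re

# Assumed target format: "<kind>:[<inode>]", e.g. "net:[4026531992]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid="self"):
    """Return {entry name: inode number} for each link in /proc/<pid>/ns/."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        result[entry] = inode
    return result


def same_namespace(pid_a, pid_b, entry):
    """True if both processes share the namespace behind `entry` (e.g. "net")."""
    return namespaces_of(pid_a)[entry] == namespaces_of(pid_b)[entry]
```

For example, same_namespace("self", 1, "net") would compare the calling process with PID 1, although reading another process's links may require sufficient privileges.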

Syscalls

Three syscalls can directly manipulate namespaces:

clone, with flags specifying which new namespaces the new process should be placed in.
unshare, with flags specifying which new namespaces the current process should be moved into.
setns, which enters the namespace specified by a file descriptor.
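The flags accepted by clone and unshare are bit values that can be combined. The sketch below lists the values defined in the kernel's linux/sched.h header for the namespace kinds covered in this article; the helper flags_for and the short kind names used as dictionary keys are assumptions made for the example.

```python
# CLONE_NEW* values as defined in the kernel header linux/sched.h.
CLONE_FLAGS = {
    "mnt":    0x00020000,  # CLONE_NEWNS
    "cgroup": 0x02000000,  # CLONE_NEWCGROUP
    "uts":    0x04000000,  # CLONE_NEWUTS
    "ipc":    0x08000000,  # CLONE_NEWIPC
    "user":   0x10000000,  # CLONE_NEWUSER
    "pid":    0x20000000,  # CLONE_NEWPID
    "net":    0x40000000,  # CLONE_NEWNET
}


def flags_for(kinds):
    """Combine the flags for the given namespace kinds into one bit mask."""
    mask = 0
    for kind in kinds:
        if kind not in CLONE_FLAGS:
            raise ValueError(f"unknown namespace kind: {kind!r}")
        mask |= CLONE_FLAGS[kind]
    return mask


# Mask requesting a new PID namespace and a new network namespace.
print(hex(flags_for(["pid", "net"])))
```

Such a mask is what a caller would pass to unshare or clone. On Linux with Python 3.12 or later, os.unshare takes a mask of this kind, though creating most namespace kinds requires appropriate privileges.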

Destruction

If a namespace is no longer referenced, it is deleted; how the contained resources are handled depends on the namespace kind. A namespace can be referenced in three ways:

a process belonging to the namespace
an open file descriptor to the namespace's file (/proc/<pid>/ns/<ns-kind>)
a bind mount of the namespace's file (/proc/<pid>/ns/<ns-kind>)

Adoption

Various container software packages use Linux namespaces in combination with cgroups to isolate their processes, including Docker^[7] and LXC.

Other applications, such as Google Chrome, use namespaces to isolate their own processes that are at risk of attack from the internet.

There is also an unshare wrapper in util-linux. An example of its use is:

SHELL=/bin/sh unshare --fork --pid chroot "${chrootdir}" "$@"
