How Unix Works: Everything You Were Too Afraid to Ask

A tale of files and processes. Become a better software engineer.

Unix is beautiful. Allow me to paint some happy little trees for you. I’m not going to explain a bunch of commands — that’s boring, and there are a million tutorials on the web doing that already.

I’m going to leave you with the ability to reason about the system.

Every fancy thing you want done is one Google search away. But understanding why the solution does what you want is not the same. That’s what gives you real power, the power to not be afraid.[1]

And, since it rhymes, it must be true.

I’ll put just enough commands for us to play along, assuming you’re starting from scratch. We’ll explore concepts, see them in practice in a shell, and then scream: “I get this!”.

Along the way, we’ll also figure out what a shell really is.

But we can’t begin without getting into the minds of the creators: exploring Unix’s philosophy.

For now, we can assume Linux is Unix. If you want to know why that’s not really the case, you can skip to the bottom and come back. We’ll end the Unix vs. Linux confusion once and for all.

Philosophy

Let’s start at the core — the philosophy behind Unix.

  • Write programs that do one thing and do it well.
  • Write programs to work together. (No extra output; don’t insist on interactive input.)
  • Write programs to handle text streams, because that is a universal interface.

Unix also embraced the “worse is better” philosophy.

This thinking is powerful. On a higher level, we see it a lot in functional programming: Build atomic functions that focus on one thing, no extra output, and then compose them together to do complicated things.

All functions in the composition are pure. No global variables to keep track of.

Perhaps as a direct result, the design of Unix focuses on two major components: processes and files. Everything in Unix is either a process or a file. Nothing else.

There’s a cyclical dependency between processes and files — if we start to go in-depth into the explanation of either, we’ll need the other to support it. To break this loop, we’ll start with a brief overview of each and then dive in. And we’ll repeat this a few times to get to the bottom.

Processes

The browser you’re running is a process. So is the terminal, if you have it open.[2] If not, now is a good time to open it. If you’re on Windows, Docker works really well too. If you’re on a Mac, use the Terminal — it’s a Unix environment.

In more abstract terms, a process is a running instance of code. The operating system gives resources to the process (like memory), then attaches some metadata to it (like who’s the owner), and then runs the code.

The operating system, as a part of the resources, also provides three open files to every process: stdin, stdout, and stderr.

Files

Everything that is not a process is a file.

Yes, this means your printer, scanner, the terminal screen, the code for any process! They’re all files. If this sounds confusing, read on. We’ll clear this up.

Your files in the filesystem are files — a string of bytes strung together to create something meaningful. Your photos are files. Your directories are files, too! They just contain a list of files/directories present in the current directory, exactly like a tree.

The beauty of this is that I can “open” a directory file to see the contents as well!

For example:

$ vim .
" ====================================================================
" Netrw Directory Listing (netrw v162)
" /Users/Neil/examples
" Sorted by size
" Quick Help: <F1>:help -:go up dir D:delete R:rename s:sort-by x:special
" ===================================================================
../
./
git-example/
unix-git/
unix-file-system-example/

I used Vim to open a file called “.” (a single dot). Does this sound familiar? It’s how Unix refers to the current directory. As you can see, it contains a list of files/directories in the current directory.

A file is just a stream of data.

Files and the File System

With the ideas that everything is a file and that a file is a stream of data in place, we can explore further how things work.

On a Unix system, the streams for getting input and writing output are predefined. This is precisely what standard input (stdin), standard output (stdout), and standard error (stderr) are for.

  • stdin is the input data source.

  • stdout is the output data destination.

  • stderr is the destination for error output.

In a shell[3], the stdin is input data from your keyboard, and both stdout and stderr are the screen.[4]

Now, we can redirect these streams to somewhere else, and our program doesn’t have to know! Irrespective of where the input comes from (keyboard or a text file), for any running process, it’s coming from stdin.
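
A minimal sketch of that idea in Python, assuming a hypothetical helper called count_lines (not from the original post). The function reads from whatever stream it is handed and never asks where the data comes from.

```python
import io


def count_lines(stream):
    """Count lines from any readable text stream (keyboard, file, or pipe)."""
    return sum(1 for _ in stream)


# A StringIO stands in for a redirected file here. Passing sys.stdin instead
# would read from the keyboard, or from whatever the shell redirected.
fake_input = io.StringIO("first\nsecond\nthird\n")
print(count_lines(fake_input))
```

The function body stays the same whichever stream you pass in, which is the point of having a predefined stdin.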

Likewise, for stdout and stderr. We’ll talk more about this when we get to processes that work with these streams.

Inodes

To have a file system in place, you need a structure to manage the filesystem. There isn’t just data in the file to take care of, but information about the data itself, called metadata. This includes where the data is stored, who owns it, and who can see it.

This is what inodes are — a data structure for your file metadata. Every file has a unique inode number. This becomes the unique identifier for the file while it exists.

$ ls -li
total 0
2015005 drwxr-xr-x 6 neil X 192 23 Oct 07:36 git-example
2514988 drwxr-xr-x 4 neil X 128 9 Oct 11:37 unix-git/
2020303 drwxr-xr-x 4 neil X 128 23 Sep 11:46 unix-file-system-example/

See those numbers in the first column? Those are the inodes![5]

An inode stores all the metadata. stat is useful for looking at this metadata too.

$ stat -LF .
drwxrwxrwx 7 A B 224 Oct 28 07:15:48 2018 ./

The A and B are the user and group names. Unix is a multi-user system. This is how Unix does it — users and groups are attributes of a file.

A file with the user attribute set to X means that X owns the file. That’s all a user is to Unix.

224 is the file size, or the number of bytes in the file.

Oct 28 07:15:48 2018 is the date last modified.[6]
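
The same fields can be read programmatically. Here is a rough sketch using Python’s os.stat, with a hypothetical helper called describe; it reports numeric user and group IDs rather than names, to keep it short.

```python
import os
import time


def describe(path):
    """Return a few inode fields for path, roughly what ls -li and stat show."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,          # the number in the first column of ls -li
        "hard_links": st.st_nlink,   # the number after the permission string
        "uid": st.st_uid,            # numeric owner
        "gid": st.st_gid,            # numeric group
        "size_bytes": st.st_size,
        "modified": time.ctime(st.st_mtime),
    }


for field, value in describe(".").items():
    print(f"{field}: {value}")
```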

Where did I get all this information from? man ls.

Now come the interesting numbers and characters I’ve left out: drwxrwxrwx and 7.

File permissions

Each file has permissions associated with it.

Remember the user and group associated with the file? Every file stores who owns the file and which group the file belongs to. Likewise, every user also has a username and group.

Coming to the -rwxrwxrwx string: these are the permissions for the owner, the group, and others.

  • r is for reading.

  • w is for writing.

  • x is for executing. For directories, this means being searchable.

You only need three bits to represent permissions for each of user, group, and others.

You’ll notice that the string has 10 characters. The first one is a special entry type character to distinguish between directories, symlinks, character streams (stdin), and a few others. See man ls to know more.

$ stat -LF
crw--w---- 1 neil tty 16,1 Dec 12 07:45:00 2019 (stdin)

What if you wanted to change permissions? Say, I don’t want anyone to search my personal folder (ahem).

The creators of Unix thought about that. There’s a utility called chmod, which can modify permissions on files. In the back end, you know now that chmod is interacting with the file’s inode.

Since each of the owner, the group, and others needs three bits, we can write each triple as one octal digit (r = 4, w = 2, x = 1) and pass the three digits to chmod.

For example: chmod 721 . would mean rwx-w---x which means all permissions for the owner, write permissions to the group, and execute permissions to others.
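
To make the arithmetic concrete, here is a small sketch that turns a numeric mode into the nine-character string. The helper name octal_to_rwx is an assumption for illustration; it only decodes the three permission triples, not the entry type character or the special bits.

```python
def octal_to_rwx(mode):
    """Convert a numeric mode such as 0o721 into a string like 'rwx-w---x'."""
    parts = []
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0o7
        parts.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return "".join(parts)


print(octal_to_rwx(0o721))
```

Each octal digit is just the sum of the bits that are switched on: 7 = 4 + 2 + 1, 2 = 2, and 1 = 1.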

I like the verbose form better:

$ chmod u+rwx . # enable user for rwx
$ chmod g+w . # enable group for w
$ chmod o+x . # enable others for x

This sets the same bits as the numeric form. One difference: + only adds bits and leaves the rest alone, while chmod 721 overwrites all of them. To add a permission for everyone, chmod a+x <file> is so much easier! You could remove permissions as well, using - instead of +.

To restrict access to my personal folder, I’ll do: chmod og-x nothing-interesting-here/.

You can also restrict access to yourself, removing all read, write, and execution permissions for yourself. If the file metadata were stored in the file itself, you wouldn’t be able to change permissions again (since you can’t write to the file).

That’s another reason why inodes are cool: they can always be modified by the file owner and root, so you can restore your permissions. Try doing this.
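
If you would rather try this from a script, here is a sketch under the assumption that you own the file (it works in a throwaway temporary directory). The helper name lock_and_restore is made up for this example.

```python
import os
import stat
import tempfile


def lock_and_restore():
    """Strip every permission from a file we own, then give them back."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "secret.txt")
        with open(path, "w") as handle:
            handle.write("hello\n")

        os.chmod(path, 0o000)  # nobody may read, write, or execute
        locked = stat.S_IMODE(os.stat(path).st_mode)

        # The permissions live in the inode, not in the file's data,
        # so the owner can still change them.
        os.chmod(path, 0o600)
        restored = stat.S_IMODE(os.stat(path).st_mode)

        with open(path) as handle:
            content = handle.read()
        return locked, restored, content


print(lock_and_restore())
```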

File linking

Ever wondered why moving a Gigabyte file from one directory to another is blazing fast, while copying the same might take ages? Can you guess why now?

It’s because when we mv within the same filesystem, we’re only changing the directory structure, not moving the actual file data. The inodes are a very useful abstraction over the filesystem.

There are other kinds of moving we can do. We can link files from one place to another, or make two filenames point to the same file.

Two filenames that point to the same file are hard links. Think of them as aliases for a file. You’ve already seen two hard links: . and .. are hard links to the current and parent directory in the system.

Links from one place to another are symbolic links. A symbolic link is a new file, separate from the original, that links to the original file.

These are useful when you want to fix scripts that need to run in a new environment, or to make a copy of a file in order to satisfy installation requirements of a new program that expects the file to be in another location.

$ ls -li
total 0
25280489 -rw-r--r-- 1 neil X 0 8 Dec 08:48 a
$ man ln # to check syntax for hard links
$ ln a x # create x as a hard link to a
$ ls -li
total 0
25280489 -rw-r--r-- 2 neil X 0 8 Dec 08:48 a
25280489 -rw-r--r-- 2 neil X 0 8 Dec 08:48 x
# Notice both files have the same inode number.
# Modifying x or a is the same thing - both files get modified together.
$ ln -s a y # create symbolic link to a
$ ls -li
total 0
25280489 -rw-r--r-- 2 neil X 0 8 Dec 08:48 a
25280489 -rw-r--r-- 2 neil X 0 8 Dec 08:48 x
25280699 lrwxr-xr-x 1 neil X 1 8 Dec 08:54 y -> a
# y is a symbolic link, a new small file - see size is 1.
$ cat y # shows that y (a) is empty
$ echo lsd >> y
$ cat y
lsd
$ cat a # modifying y modifies a
lsd
$ cat x # a is x
lsd

I’ve explained what’s happening above in the comments.

Now, what happens if you remove a, the file that y points to?

$ rm a
$ cat y
cat: y: No such file or directory
# y becomes useless
$ ls -li
25280489 -rw-r--r-- 1 neil X 12 8 Dec 08:56 x
25280699 lrwxr-xr-x 1 neil X 1 8 Dec 08:54 y -> a

This is a dangling symlink. It’s useless.

The number after the permissions string (the 7 from when we did stat -LF .) is the count of hard links to a file.

When I created x, the number went up to two. When I removed a, the number went back down to one.
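
The whole experiment can be replayed in a scratch directory. This is a sketch using Python’s os.link and os.symlink, with a hypothetical helper called link_demo; the file names a, x, and y mirror the session above.

```python
import os
import tempfile


def link_demo():
    """Recreate a, x (hard link), and y (symlink), then remove a."""
    with tempfile.TemporaryDirectory() as workdir:
        a = os.path.join(workdir, "a")
        x = os.path.join(workdir, "x")
        y = os.path.join(workdir, "y")

        open(a, "w").close()
        os.link(a, x)        # like: ln a x
        os.symlink("a", y)   # like: ln -s a y (relative target)

        same_inode = os.stat(a).st_ino == os.stat(x).st_ino
        links_before = os.stat(x).st_nlink

        os.remove(a)         # like: rm a
        links_after = os.stat(x).st_nlink
        # y still exists as a link, but its target is gone.
        dangling = os.path.islink(y) and not os.path.exists(y)
        return same_inode, links_before, links_after, dangling


print(link_demo())
```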

We can also confirm that . and .. are indeed hard links. Can you think of how?

$ ls -ail
25280488 drwxr-xr-x 7 neil X 224 9 Dec 20:19 .
1289985 drwxr-xr-x+ 83 neil X 2656 10 Dec 08:13 ..
25390377 drwxr-xr-x 5 neil X 160 9 Dec 19:13 sample_dir
$ cd sample_dir
$ ls -ail
25390377 drwxr-xr-x 5 neil X 160 9 Dec 19:13 .
25280488 drwxr-xr-x 7 neil X 224 9 Dec 20:19 ..
25390378 -rw-r--r-- 1 neil X 0 9 Dec 19:13 a

Check the inode numbers. .. in sample_dir is 25280488, which is the same as . in the parent directory. Also, sample_dir in the parent directory is 25390377, which is the same as . inside sample_dir.
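
The same comparison can be automated. A small sketch, assuming a scratch directory and a hypothetical helper called dot_entries_match:

```python
import os
import tempfile


def dot_entries_match():
    """Check that .. in a child and the child's own entry share inodes as expected."""
    with tempfile.TemporaryDirectory() as parent:
        child = os.path.join(parent, "sample_dir")
        os.mkdir(child)

        parent_ino = os.stat(os.path.join(parent, ".")).st_ino
        dotdot_ino = os.stat(os.path.join(child, "..")).st_ino
        child_ino = os.stat(os.path.join(child, ".")).st_ino
        entry_ino = os.stat(child).st_ino

        # ".." inside the child is the parent; "." inside the child is the child.
        return dotdot_ino == parent_ino and entry_ino == child_ino


print(dot_entries_match())
```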

File structure

It helps me to imagine the file system like a tree data structure (indeed, that’s what it is). Every node (inode) has a pointer to its parent, itself, and all its children. This forms the directory structure.

What’s the parent of /, the root directory?

You have enough knowledge to answer this question now. The first thing I did was vim / to see if / has a parent pointer. It does. Then, I did ls -ail to see the inode of the parent. It points to ., which is /.
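
A quick way to check the same thing without opening an editor, sketched with Python’s os.path.samefile (the helper name root_is_its_own_parent is made up here):

```python
import os


def root_is_its_own_parent():
    """True when / and /.. resolve to the same directory."""
    return os.path.samefile("/", "/..")


print(root_is_its_own_parent())
```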

In summary,

  • The file system is built using inodes and directory files.
  • Users are attributes of files and processes. This information is stored in the inodes.
  • Inodes are unique within a file system.
  • Multiple file systems can be mounted and abstracted into one logical tree.

Processes

First, let’s get the definitions out of the way. There are three components to remember about a process:

  1. Program file: The code and data.
  2. Process image: This stores the stack, variables currently defined, data, address space, and more. When it’s time to run, the OS knows exactly how to recreate the process using this image.
  3. Process: The running program in memory.

When a process starts running, it inherits the user ID and group ID from the parent process. This information controls the level of access the process has.

Note: Access control is crucial for a secure system. This is one of the reasons why running bare Docker containers in production can be such a problem: it needs to run as root, which means bad things can happen.

We can use setuid or setgid to enable a process to inherit the file owner permissions. setuid allows a process to inherit the userID of the file in question.

For example, to change passwords on Linux (see this link for Mac) — we need to modify the file /etc/passwd. However, on checking permissions, we see that only root has access to write to this file.[7]

$ ls -ail /etc/passwd
3541354 -rw-r--r-- 1 root root 681 Nov 28 08:47 /etc/passwd

Thus, when we call /usr/bin/passwd, the utility to help change passwords, it will inherit our user ID, which will get access denied to /etc/passwd. This is where setuid comes in useful: it allows us to start /usr/bin/passwd as root.

$ ls -al /usr/bin/passwd 
-rwsr-xr-x 1 root root 27936 Mar 22 2019 /usr/bin/passwd

The s instead of x in execution permissions shows that this process will run as root.
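
That s corresponds to one extra bit in the file mode. A sketch of how to test for it with Python’s stat module follows; the helpers has_setuid and setuid_roundtrip are hypothetical, and the demo flips the bit on a scratch file we own rather than on a real system binary.

```python
import os
import stat
import tempfile


def has_setuid(path):
    """True when the setuid bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def setuid_roundtrip():
    """Set and clear the setuid bit on a temporary file."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "tool")
        open(path, "w").close()

        os.chmod(path, 0o4755)   # like: chmod u+s on a 755 file
        was_set = has_setuid(path)
        os.chmod(path, 0o0755)   # like: chmod u-s
        still_set = has_setuid(path)
        return was_set, still_set


print(setuid_roundtrip())
```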

To set and remove this bit, we can use chmod again.

$ chmod u-s /usr/bin/passwd 
$ ls -al /usr/bin/passwd
-rwxr-xr-x 1 root root 27936 Mar 22 2019 /usr/bin/passwd

I did this in Docker, so my true filesystem is safe.

Attributes

Like how all files on the filesystem have a unique inode, processes also have their unique identifiers called process IDs, or pid.

Like how all files have a link to their parent directory, every process has a link to the parent process that spawned it.

Like how the root of the filesystem exists (/), there’s a special root parent process called init. It usually has pid 1.

Unlike the root of the filesystem, whose parent directory is itself (/), the ppid of init is 0, which conventionally means it has no parent. The pid 0 corresponds to the kernel scheduler, which isn’t a user process.
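
Any process can ask for its own pid and its parent’s pid. A tiny sketch (the helper name who_am_i is an assumption):

```python
import os


def who_am_i():
    """Return (pid, ppid) for the current process."""
    return os.getpid(), os.getppid()


pid, ppid = who_am_i()
print(f"pid={pid} parent={ppid}")
```

Run from a terminal, the parent reported here is usually the shell that launched the interpreter.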

Lifecycle

There’s a common pattern in Unix on how processes work.

A new child process is created by cloning the existing parent process (fork()). This new child process then calls exec() to replace the copy of the parent’s program running in the child with the program the child wants to run.

Next, the child process calls exit() to terminate itself. It only passes an exit code out. 0 means success, everything else is an error code.

The parent process needs to call the wait() system call to get access to this exit code. This cycle repeats for every process spawned.
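
The fork, exec, exit, wait cycle can be sketched in a few lines of Python, whose os module exposes thin wrappers over these system calls. The helper spawn_and_wait is hypothetical, and it assumes the child exits normally rather than being killed by a signal.

```python
import os
import sys


def spawn_and_wait(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this copy of the parent with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: wait() collects the exit code so the child does not stay a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


# The child here is another Python interpreter that exits with code 3.
print(spawn_and_wait([sys.executable, "-c", "import sys; sys.exit(3)"]))
```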

There are a few things that might go wrong here.

What if the parent doesn’t call wait()? This results in a zombie process, which is a resource leak, since the OS can’t clean up processes before their exit code has been consumed by the parent.

What if the parent dies before the child process? This results in an orphan process (I promise I’m not making this up). An orphan process is adopted by the init process (the special root parent), which then waits on the child process to finish.

In the natural order of the computer world, children die before parents.

How can the parent get access to more information from the child? It can’t via exit codes, since that’s the only thing a process can return to the parent.

Processes aren’t like regular functions where you can just return the response to the calling function. However, there are other ways to do inter-process communication.

We will go into more detail about how things work with an example. Before that, we need a bit more information.

File redirection

Remember how the OS provides three open files to every running process? It’s in our power to redirect these files to whatever we want.

> redirects stdout, 2> redirects stderr, and < redirects stdin.

For example, ./someBinary 2>&1 redirects stderr to stdout.

0, 1, and 2 are the file descriptor numbers for stdin, stdout, and stderr respectively.

Note: ./someBinary 2>1 wouldn’t work like you expect it to, because the syntax is file-descriptor > file. 2>1 means stderr will be redirected to a file called 1. The & tells the shell that what follows is a file descriptor number, not a file name.

The file redirection happens before the command runs. When the shell opens the target file (via >), it deletes everything that’s in that file already.
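
Here is a sketch of what such a redirection could look like at the system-call level. It is not bash’s actual code; the helpers run_with_stdout_to and self_sort_demo are made up for illustration, and the demo assumes a sort program is on the PATH.

```python
import os
import tempfile


def run_with_stdout_to(path, argv):
    """Roughly what a shell does for `argv > path`."""
    pid = os.fork()
    if pid == 0:
        try:
            # O_TRUNC empties the file before the command even starts.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)  # file descriptor 1 (stdout) now refers to the file
            os.close(fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if open or exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def self_sort_demo():
    """Sort a file into itself and return what is left in it."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "res.txt")
        with open(path, "w") as handle:
            handle.write("d\nc\nb\na\n")
        run_with_stdout_to(path, ["sort", path])
        with open(path) as handle:
            return handle.read()


print(repr(self_sort_demo()))
```

Because the file is truncated before sort is exec’d, sort reads an already empty file and has nothing to write back.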

Therefore, sort res.txt > res.txt won’t work.

$ cat res.txt # check contents of res
d
c
b
a
$ sort res.txt # sort res
a
b
c
d
$ sort res.txt > res.txt
$ cat res.txt
# empty

Tip: You can ensure none of your redirects clobber an existing file by setting the noclobber option in the shell.

$ set -o noclobber
$ sort res.txt > res.txt
-bash: res.txt: cannot overwrite existing file

It would, however, work with >>, since in this case, you’re appending to the file.

$ sort res.txt >> res.txt
$ cat res.txt
d
c
b
a
a
b
c
d

Read more about redirection.

Layers in Unix

We can think of Unix like an onion. At the center is the hardware — the motherboards, the CPUs, and lots of transistors I don’t quite understand. One layer out is the kernel.

The kernel

The kernel is the core responsible for interaction with file system and devices. It also handles process scheduling, task execution, memory management, and access control.

The kernel exposes API calls for anything built on top to leverage. The most popular ones are exec(), fork(), and wait().

Unix utilities

Another layer up are the Unix utilities. These are super helpful processes that help us interact with the kernel. They do this via system calls like exec() and fork(), which the kernel provides.

You’ve probably heard of a lot of utilities already. You’ve probably used the most famous one: shell.

Others include: python, gcc, vi, sh, ls, cp, mv, cat, awk.

You can invoke most of them from the shell. bash, zsh, ksh are just different variants of a shell. They do the same thing.

Another utility that people find daunting is the text editor Vim. Vim deserves its own post, and that’s what I’ve created here.

Fun fact: A shell is called a shell because it’s the closest layer outside the kernel. It covers the kernel in a protective … shell.

How the Shell Works

Remember how the shell is a process? This means that when it’s started, the OS provides three files for it to work with: stdin, stdout, and stderr.

When run from the terminal, stdin is connected to the keyboard input. What you write is passed into the terminal. This happens via a file called a teletypewriter, or tty.

stdout and stderr are connected to the tty too, which is why the output and errors of any command you run show up in the terminal.

Every terminal you open gets assigned a new file via tty, so that commands from one terminal don’t clobber another. You can find out the file your terminal is attached to via the tty command.

$ tty
/dev/ttys001 # on linux, this looks like: /dev/pts/0

Now you can do something funky: since shell reads from this file, you can get another shell to write to this file too, or clobber the shells together. Let’s try. (Remember how to redirect files from the process section above?)

Open a second terminal. Type in:

$ echo "Echhi" > /dev/ttys001 # replace /dev/ttys001 with your tty output
                            

Notice what happens in the first terminal.

Try echoing ls, the command to list files this time. Why doesn’t the first terminal run the command?

It doesn’t run the command because the stream writing to the terminal was the stdout of the second terminal, not the stdin stream of the first terminal.

Remember, only input coming in via stdin is passed as input to the shell. Everything else is just displayed on the screen. Even if it happens to be the same file in this case, it’s of no concern to the process.

The natural extension of the above then, is that when you redirect stdin, the commands should run. Sounds reasonable, let’s try it out.

Warning: One way to do this is bash < /dev/ttys001. This doesn’t work too well, because there are now two processes expecting input from this one file.

This is an undefined state, but on my Mac, one character went to one terminal, the other character went to the second, and this continued. Which was funny, because to exit the new shell, I had to type eexxiitt. And then I lost both shells.

$ echo ls > ls.txt # write "ls" to a file
$ cat ls.txt # check what's in file
ls
$ bash < ls.txt
Applications
Music
Documents
Downloads

There’s a nicer way to do this, which we’ll cover in a bit.

There’s something subtle going on here. How did this new bash process (which we’re starting from an existing bash process) know where to output things?

We never specified the output stream, only the input stream. This happens because processes inherit from their parent process.

Every time you write a command on the terminal, the shell creates a duplicate process (via fork()).

From man 2 fork:

The child process has its own copy of the parent’s descriptors. These descriptors reference the same underlying objects, so that, for instance, file pointers in file objects are shared between the child and the parent, so that an lseek(2) on a descriptor in the child process can affect a subsequent read or write by the parent.

This descriptor copying is also used by the shell to establish standard input and output for newly created processes as well as to set up pipes.

Once forked, this new child process inherits the file descriptors from the parent, and then calls exec (execve()) to execute the command. This replaces the process image.

From man 3 exec:

The functions described in this manual page are front ends for the function execve(2).

From man 2 execve[8]:

File descriptors open in the calling process image remain open in the new process image, except for those for which the close-on-exec flag is set

Thus, our file descriptors are the same as the original bash process, unless we change them via redirection.

While this child process is executing, the parent waits for the child to finish. When this happens, control is returned back to the parent process.

Remember, the child process isn’t bash, but the process that replaced bash. With ls, the process returns as soon as it has output the list of files to stdout.

Note: Not all commands on the shell result in a fork and exec. The ones that don’t are called built-in commands. Some are built in out of necessity, since child processes can’t pass information back to parents; others are built in to make things faster.

For example, setting environment variables won’t work in a subshell, because it can’t pass the value back to the parent shell. You can find the list here.

Here’s a demonstration I love.

Have you ever thought how weird it is that while something is running and outputting stuff to the terminal, you can write your next commands and have them work as soon as the existing process finishes?

$ sleep 10;
ls
cat b.txt
brrr
# I stop typing here
$ ls
b c y
$ cat b.txt
defbjehb
$ brrr
-bash: brrr: command not found

It’s only the process that’s blocked; the input stream is still accepting data. Since the file we’re reading from and writing to is the same (tty), we see what we type. When sleep 10; returns, the shell creates another process for ls, waits again, then does the same for cat b.txt, and then again for brrr.

I used sleep 10; to demonstrate because the other commands happen too quickly for me to type anything before control returns to the parent bash process.

Now is a good time to try out the exec built-in command (it replaces the current process so it will kill your shell session).

现在是尝试exec内置命令的好时机(它将替换当前进程,因此它将终止您的shell会话)。

$ exec echo Bye

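echo is a built-in command too, but exec always runs an external program, so here it is the standalone echo program (for example /bin/echo) that replaces the shell.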

echo也是内置命令。
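If you’d rather not lose your session, a subshell can take the hit instead. A minimal sketch, assuming a standalone echo program is installed (it is on practically every Unix system); the variable names are mine:

```shell
# Ask the shell what kind of command echo is.
echo_kind=$(type echo)

# Inside ( ), exec replaces the subshell process, not your own shell.
# The second echo never runs, because the subshell no longer exists by then.
survivor=$( (exec echo Bye; echo "never printed") )

echo "$echo_kind"
echo "$survivor"
```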

If you’d like to implement the shell yourself in C, here’s a resource I recommend.

如果您想用C自己实现shell, 我建议您使用以下资源

管道 (The Pipe)

Armed with the knowledge of how the shell works, we can venture into the world of the pipe: |.

与如何壳工程的知识,我们可以大胆进入管道的世界: |

It bridges two processes together, and the way it works is interesting.

它把两个过程联系在一起,并且它的工作方式很有趣。

Remember the philosophy we began with? Do one thing, and do it well. Now that all our utilities work well, how do we make them work together?

还记得我们开始时的哲学吗? 做一件事,并做好。 现在,我们所有的公用程序都运行良好,如何使它们一起工作?

This is where the pipe, |, pipes in. It represents the pipe() system call, and all it does is connect the stdout of one process to the stdin of the next.

这是管道, | ,通过管道输入。它表示对pipe()的系统调用,它所做的只是重定向进程的stdinstdout

Since things have been designed so well, this otherwise complex function reduces to just this. Whenever you’re working with pipes, or anything on the terminal, just imagine how the input and output files are set up and you’ll never have a problem.[9]

由于事物的设计是如此出色,因此原本复杂的功能可以简化为此。 每当您使用管道或终端上的任何东西时,只要想象一下如何设置输入和输出文件,就永远不会有问题。[9]
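One way to convince yourself that a pipe is only a change of input and output files is to do the same job with a temporary file. A small sketch, assuming sort and mktemp are available; the three-line input is made up:

```shell
# With a pipe: printf writes into the pipe, sort reads from it.
piped=$(printf 'b\na\nc\n' | sort)

# Without a pipe: the same data goes through a temporary file instead.
tmp=$(mktemp)
printf 'b\na\nc\n' > "$tmp"
via_file=$(sort < "$tmp")
rm -f "$tmp"

# sort cannot tell the difference: it just reads its stdin.
echo "$piped"
```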

Let’s start with a nicer way to direct input to bash, instead of using a temp file like we did earlier (ls.txt).

让我们从一种更好的方法将输入直接定向到bash开始,而不是像以前那样使用临时文件( ls.txt )。

$ echo ls | bash
Applications
Music
Documents
Downloads

Image for post

This image is a bit of a simplification to explain the pipe redirection. You know how the shell works now, so you know that the top bash forks another bash connected to tty, which produces the output of ls.

该图像有些简化,无法说明管道重定向。 您知道外壳现在是如何工作的,因此您知道顶部bash分叉了另一个连接到tty bash ,后者生成了ls的输出。

You also know that the top bash was forked from the lower one, which is why it inherited the file descriptors of the lower one. You also know that the lower bash didn’t fork a new process because echo is a built-in command.

您还知道顶部bash是从底部bash派生的,这就是为什么它继承了底部bash的文件描述符的原因。 您还知道较低的bash不会派生新进程,因为echo是内置命令。

Let’s wind up this section with a more complex example:

让我们以一个更复杂的示例结束本节:

$ ls -ail | sort -nr -k 6 | head -n 1 | cut -f 9 -d ' '
2656

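This pipeline figures out the largest file in the current directory and outputs its size. Which field cut picks depends on how ls pads its columns with spaces, so the field number may need adjusting on your system.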

该管道找出当前目录中最大的文件并输出其大小。

There’s probably a more elegant way to do this which is just one Google search away but this works well as an example. Who knew this was built into ls already?

也许有一种更优雅的方法可以做到这一点, 仅需一次Google搜索即可,但这很好地举例说明了。 谁知道这已经内置到ls了?
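For comparison, here is a sketch of two sturdier variants. It swaps in awk, which splits on runs of whitespace, and the -S option of ls, which sorts by size. The temporary directory and file names are made up so the result is predictable.

```shell
# Build a scratch directory with two files of known size.
dir=$(mktemp -d)
printf 'aaaaaaaaaa' > "$dir/big.txt"     # 10 bytes
printf 'aaa' > "$dir/small.txt"          # 3 bytes

# Column 5 of `ls -l` is the size; NR > 1 skips the leading "total" line.
largest_size=$(ls -l "$dir" | awk 'NR > 1 {print $5}' | sort -nr | head -n 1)

# ls can also sort by size on its own, largest first.
largest_name=$(ls -S "$dir" | head -n 1)

rm -rf "$dir"
echo "$largest_name is $largest_size bytes"
```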

Image for post

Notice how stderr is always routed directly to tty? What if you wanted to redirect stderr instead of stdout to the pipe? You can switch streams before the pipe.

注意, stderr是如何始终直接路由到tty ? 如果要重定向stderr而不是stdout到管道怎么办? 您可以在管道之前切换流。

$ error-prone-command 2>&1 >/dev/null | next-command

Source: this beauty.

资料来源: 这位美女
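The order of the two redirections matters, and a small experiment shows why. In this sketch, noisy is a made-up function standing in for an error-prone command, and tr stands in for whatever sits after the pipe.

```shell
# A stand-in command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# 2>&1 first points stderr at the pipe (where stdout currently goes),
# then >/dev/null moves stdout away. Only stderr reaches tr.
only_errors=$(noisy 2>&1 >/dev/null | tr 'a-z' 'A-Z')

# Reversed order: stdout goes to /dev/null first, and stderr then follows it there.
nothing=$(noisy >/dev/null 2>&1 | cat)

echo "pipe received: $only_errors"
```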

关于PATH的一切 (Everything About PATHs)

Local variables are ones you can create in a shell. They’re local to the shell, and thus not passed to children. (Remember, every non-built-in command runs in a new process created by fork and exec, which doesn’t have these local variables.)

局部变量是可以在shell中创建的变量。 它们位于外壳的本地,因此不会传递给子级。 (请记住,每个非内置命令都在没有这些局部变量的新外壳中。)

Environment variables (env vars) are like global variables. They are passed to children. However, changes to the environment variables in a child process can’t be passed to the parent. Remember, there’s no communication between child and parent except the exit code.

环境变量( env vars)类似于全局变量。 他们被传递给孩子。 但是,子进程中对环境变量的更改无法传递给父进程。 请记住,除了退出代码外,子代与父代之间没有任何通信。
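Both rules can be checked in a few lines. A minimal sketch, assuming sh is available as the child process; the variable names demo_local and DEMO_ENV are made up:

```shell
demo_local="stays here"              # local variable: never leaves this shell
DEMO_ENV="travels to children"
export DEMO_ENV                      # environment variable: copied into children

# The child sees the exported variable but not the local one.
seen_local=$(sh -c 'printf "%s" "${demo_local:-unset}"')
seen_env=$(sh -c 'printf "%s" "$DEMO_ENV"')

# The child changes its own copy; the parent's value is untouched.
sh -c 'DEMO_ENV="changed by the child"'
after_child=$DEMO_ENV

echo "local in child: $seen_local"
echo "env in child:   $seen_env"
echo "env in parent:  $after_child"
```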

Try this: Call bash from bash from bash. The first bash is waiting on the second bash to exit, while the second one is waiting for the third one.

试试这个:呼叫bashbashbash 。 第一个bash正在等待第二个bash退出,而第二个bash正在等待第三个bash

When you call exec, the exit happens automatically. If not, you want to type exit yourself to send the exit code to the parent. Exit twice, and you’re back to original.

当您调用exec ,退出将自动发生。 如果不是,则您要自己键入exit以将退出代码发送给父级。 退出两次,您将恢复到原始状态。
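The nesting and the exit code can both be observed without typing into three shells by hand. This sketch assumes bash is installed and relies on SHLVL, the variable bash increments each time it starts (it shows up in the env output further down). The trailing `; true` is only there so that each bash stays around as a parent instead of replacing itself with its last command.

```shell
# How deep a bash thinks it is, one and two levels below this shell.
one_deep=$(bash -c 'echo "$SHLVL"; true')
two_deep=$(bash -c 'bash -c "echo \$SHLVL; true"; true')

# The exit code is the one thing the child hands back to the waiting parent.
child_status=0
bash -c 'exit 7' || child_status=$?

echo "one level down: $one_deep, two levels down: $two_deep"
echo "child exit code: $child_status"
```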

Now, thought experiment: What happens when you type ls into a shell? You know the fork(), exec(), and wait() cycle that occurs, along with tty.

现在,进行思想实验:将ls输入到shell中时会发生什么? 您知道tty会发生fork()exec()wait()循环。

But, even before this happens, ls is just another utility program, right? Which means there’s a program file somewhere, compiled from C code, that makes the system calls and does everything else.

但是,即使在发生这种情况之前, ls只是另一个实用程序功能,对吗? 这意味着某个地方有一个程序文件,其中包含执行fork()和其他所有功能的C代码。

This binary isn’t in your current directory (you can check with ls -a). It would’ve made sense to me that if these files were in my current directory, I could execute them by typing their name in the shell. They’re executable files.

该二进制文件不在当前目录中(可以使用ls -a检查)。 对我来说,如果这些文件在我的当前目录中,我可以通过在shell中键入它们的名称来执行它们。 它们是可执行文件。

Where is the ls program file, exactly?

ls程序文件在哪里?

Remember how the file system tree is hierarchical? There’s an order to the madness.

还记得文件系统树是如何分层的吗? 疯狂有秩序。

All the base-level directories have a specific function. For example, the essential Unix utilities and some extra programs go into the /bin directory. Bin stands for binaries. There are a million tutorials if you want to find out more.

所有基本级别目录都具有特定功能。 例如,所有Unix实用程序和一些其他程序都进入/bin目录。 Bin代表二进制文件。 如果您想了解更多信息,则有一百万本教程。

This is enough knowledge for us. ls lives in /bin. So, you can do this:

这对我们来说已经足够了。 ls住在/bin 。 因此,您可以这样做:

$ /bin/ls
a b c

Which is the same as running ls.

与运行ls相同。

But, how did the shell know to look for ls in /bin?

但是,shell如何知道在bin寻找ls

This is where the magical environment variable, PATH, comes in. Let’s look at it first.

这就是神奇的环境变量PATH出现的地方。让我们首先来看它。

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

And, to see all environment variables, we can do:

并且,要查看所有环境变量,我们可以执行以下操作:

$ env
HOSTNAME=12345XXXX
TERM=xterm
TMPDIR=/tmp
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/test
LANG=en_US.UTF-8
SHLVL=1
HOME=/root
LANGUAGE=en_US:en
LESSOPEN=||/usr/bin/lesspipe.sh %s
container=oci
_=/bin/env

The PATH is a colon-separated list of directories. When the shell sees a command without an absolute path, it looks in this $PATH environment variable, goes to each directory in order, and tries to find the file in there. It executes the first file it finds.

PATH是用冒号分隔的目录列表。 当外壳程序看到没有绝对路径的命令时,它将在此$PATH环境变量中查找,依次进入每个目录,并尝试在其中查找文件。 它执行找到的第一个文件。
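The search itself is simple enough to imitate. The function below is a rough sketch of the lookup, not how bash actually implements it; find_in_path is a made-up name, and the code assumes a POSIX shell and a PATH without wildcard characters in it.

```shell
# Walk the PATH directories in order and print the first executable match.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:                            # split $PATH on colons
    for dir in $PATH; do
        [ -z "$dir" ] && dir=.       # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                         # nothing found: "command not found"
}

found=$(find_in_path ls)
echo "ls resolves to: $found"
```

The shell has this built in: `command -v ls` prints the path it would run.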

Notice how /bin is in the PATH, which is why ls just works.

注意/binPATH ,这就是ls起作用的原因。

What happens if I remove everything from the PATH? Nothing should work without an absolute path.

如果我从PATH删除所有内容,会发生什么? 没有绝对的路径,任何事情都不应该工作。

$ PATH='' ls
-bash: ls: No such file or directory

The above syntax is used to set the environment variables for just one command. The old PATH value still exists in the shell.

上面的语法仅用于设置一个命令的环境变量。 旧的PATH值仍然存在于外壳中。

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
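The same one-command-only behaviour holds for any variable, which makes it easy to test without breaking PATH. A small sketch, with DEMO_GREETING as a made-up variable and sh as the child process:

```shell
DEMO_GREETING="hello"
export DEMO_GREETING

# The assignment in front of the command applies to that one child only.
one_off=$(DEMO_GREETING="bye" sh -c 'printf "%s" "$DEMO_GREETING"')

# Back in the shell, the old value is still there.
afterwards=$DEMO_GREETING

echo "child saw: $one_off, shell still has: $afterwards"
```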

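Note: Doing PATH='' echo $PATH wouldn’t print an empty line, because the shell expands $PATH to its old value before the temporary assignment takes effect (and echo, being a shell built-in, never starts a new process anyway). However, if you started a new shell process with PATH='' and then did echo $PATH, it would work.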

注意: 进行PATH='' echo $PATH无效,因为echo是内置的shell。 但是,如果您使用PATH=''开始一个新的shell进程,然后执行echo ,那么它将起作用。

$ (PATH=''; echo $PATH)

The () is syntax for a new subshell. I know, it’s a lot of information I’m not explaining first, but it’s all syntax level and just one Google search away. On the plus side, it ensures that this blog post doesn’t turn into a book.

()是新子shell的语法。 我知道,我首先不解释很多信息,但这全是语法级别,只有一个Google搜索。 从好的方面来说,它可以确保该博客文章不会变成一本书。
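Here are the two cases from the note side by side, using a made-up variable DEMO_VALUE instead of PATH so nothing else breaks. This is a sketch assuming a POSIX shell.

```shell
DEMO_VALUE="original"
export DEMO_VALUE

# The shell expands "$DEMO_VALUE" first, then applies the one-off assignment,
# so echo is handed the old value.
expanded_early=$(DEMO_VALUE="" echo "[$DEMO_VALUE]")

# Inside ( ), the assignment is a separate command that runs before echo.
in_subshell=$( (DEMO_VALUE=""; echo "[$DEMO_VALUE]") )

# The subshell was a copy; the outer shell keeps its value.
afterwards=$DEMO_VALUE

echo "$expanded_early $in_subshell $afterwards"
```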

Have you heard that ./ is the way to run files you create? Why can’t you just run them like bash? Now you know. When you do ./, that’s an exact path to the file you want to execute. bash works because it’s on the PATH.

您是否听说过./是运行所创建文件的方式? 您为什么不能像bash一样运行它们? 现在你知道了。 当您执行./ ,这就是您要执行的文件的确切路径。 bash之所以起作用是因为它在PATH

So, it makes sense that if the current directory were always on the PATH, your scripts would work by name.

因此,有意义的是,如果当前目录始终位于PATH ,则脚本将按名称运行。

Let’s try this.

让我们尝试一下。

$ vim script.sh
# echo "I can work from everywhere?"
$ chmod a+x script.sh
$ ls
script.sh
$ script.sh
-bash: script.sh: command not found # not on PATH
$ ./script.sh # path to file is defined
I can work from everywhere?

Now, let’s add the current directory to the PATH, and then run just script.sh.

现在,让我们将其添加到当前的PATH 。 然后只运行script.sh

$ PATH=${PATH}:. script.sh # this appends . to PATH, only for this command
I can work from everywhere?
$ export PATH=${PATH}:. # this sets PATH to include . for the rest of this shell session
$ script.sh # calling script.sh without the ./
I can work from everywhere?
$ cd .. # go one directory up
$ script.sh # this shows that PATH directories aren't searched recursively
-bash: script.sh: command not found # so script doesn't run anymore
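The experiment above can also be replayed as a script that cleans up after itself. This is a sketch under a few assumptions: it works in a temporary directory, it adds a #!/bin/sh line to the script (the original has none), and it never exports the changed PATH.

```shell
start_dir=$(pwd)
work_dir=$(mktemp -d)
cd "$work_dir"

# Create the script and make it executable.
printf '#!/bin/sh\necho "I can work from everywhere?"\n' > script.sh
chmod a+x script.sh

by_path=$(./script.sh)                  # explicit path: always works

bare_status=0                           # bare name: expected to fail with 127
script.sh >/dev/null 2>&1 || bare_status=$?   # unless . is already on your PATH

with_dot=$(PATH="$PATH:." script.sh)    # . on PATH for this one command only

cd "$start_dir"
rm -rf "$work_dir"
echo "bare name exit code: $bare_status"
```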

Warning: It’s bad practice to include the current directory in your PATH. There are several problems. You can never be sure that execution of any command acts as intended.

警告:在您的PATH中包含当前目录是一种不好的做法。 有几个问题。 您永远不能确保任何命令的执行都按预期执行。

What if you have a binary called ls which is a virus in your current directory (downloaded from the internet), but you meant to do /bin/ls?

如果您有一个名为ls的二进制文件,该文件在当前目录中是一种病毒(可从Internet下载),但您打算执行/bin/ls怎么办?
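The risk is easy to reproduce safely. In this sketch the "virus" is a harmless made-up script named ls in a temporary directory. Whether it wins depends on where . sits in the PATH, since the shell runs the first match it finds.

```shell
trap_dir=$(mktemp -d)

# A harmless impostor named ls.
printf '#!/bin/sh\necho "not the real ls"\n' > "$trap_dir/ls"
chmod a+x "$trap_dir/ls"

# With . at the front of PATH, the impostor is found first.
shadowed=$(cd "$trap_dir" && PATH=".:$PATH" ls)

# With . at the end, the real ls is found first and simply lists the directory.
real=$(cd "$trap_dir" && PATH="$PATH:." ls)

rm -rf "$trap_dir"
echo "front of PATH: $shadowed"
echo "end of PATH:   $real"
```

Putting . at the end is less dangerous than putting it at the front, but a mistyped command name can still pick up a file from the current directory.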

Read more

阅读更多

Next time you see a “no such file or directory” error, when you know the file exists (maybe you just installed it) — you know what the issue is. The PATH is busted!

下次当您看到“没有这样的文件或目录”错误时,当您知道文件存在(也许您刚刚安装了该文件)时,便知道问题出在哪里。 PATH被破坏!

It’s installed to a location not on your PATH so you can only call it from the place you installed it to. To fix this, you now know that you can either add that directory to the PATH, or call the file via its absolute path.

它安装在PATH的位置,因此只能从安装它的位置进行调用。 要解决此问题,您现在知道可以将该目录添加到PATH ,也可以通过其绝对路径调用该文件。

Fun fact: Python has a similar structure when searching for imports, which uses the PYTHONPATH env variable.

有趣的事实: 搜索导入时,Python具有类似的结构,该结构使用PYTHONPATH env变量。

编写Shell脚本 (Writing Shell Scripts)

This post has already become much longer than I expected. Also, programming with shell has been covered online a lot. But for completeness’ sake, here’s a link to the manual, and a decent tutorial.

这篇文章的长度已经超出了我的预期。 此外,使用Shell进行编程已经在网上广泛讨论了。 但是为了完整起见,这是手册链接不错的教程

包装经理 (Package Managers)

Let’s say you’ve written a new tool. It works really well on your machine and now you want to sell it to other users. Wait, I mean, in the spirit of open source, you want to make it available for others to use.

假设您已经编写了一个新工具。 它在您的计算机上确实运行良好,现在您想将其出售给其他用户。 等等,我的意思是,本着开源的精神,您想将其提供给他人使用。

You also want to save them the PATH headaches. Better yet, you want things to be installed in the right place: the binary goes into /usr/bin/ (it’s already on PATH) and the dependencies go somewhere the main binary can find them.

您还希望节省他们的PATH麻烦。 更好的是,您希望将事情安装在正确的位置:二进制文件进入/usr/bin/ (它已经在PATH ),而依赖项则位于主二进制文件可以找到它的地方。

Package managers solve exactly this problem. Instead of giving you headaches, they just make things work™.

包管理器正好解决了这个问题。 他们没有让您头疼,而是使事情正常进行。

There are three main package managers I know of: dpkg, rpm, and homebrew. Each of them works on a different family of systems: dpkg and rpm on different Linux distributions (if you’re not sure what that means, it’s in the next section), and homebrew on macOS.

我知道三个主要的软件包管理器: dpkgrpmhomebrew 。 它们每个都在不同的Linux发行版上工作(如果您不确定这意味着什么,请参见下一节)。

But, there are hundreds of them in the wild, just like the number of distributions.

但是,就像分发的数量一样,它们有数百种在野外。

dpkg is the Debian package manager, but you’ve probably heard of a very useful tool built on top of it to manage packages: apt.

dpkgDebian软件包管理器,但是您可能已经听说过在它之上构建的用于管理软件包的非常有用的工具: apt

Every time you use apt install to install a new package, you’re leveraging the power of this package manager which ensures things end up where they need to be.

每次使用apt install来安装新软件包时,您都在利用此软件包管理器的功能,以确保一切都在需要的地方进行。

On the development side, this means ensuring that the tool you’re creating is compatible with the package manager. For example, here’s how to do it in C and Python.

在开发方面,这意味着确保要创建的工具与程序包管理器兼容。 例如,这是在CPython中执行此操作的方法。

rpm is the Red Hat package manager, which also has a useful tool built on top: yum, which takes care of dependencies too.

rpmRed Hat软件包管理器,它在顶部还有一个有用的工具: yum ,它也处理依赖项。

homebrew is the package manager on macOS, and you’re using it every time you brew install something.

homebrew是macOS上的软件包管理器,每次您brew install某些东西时都在使用它。
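Since these tools are ordinary programs on the PATH, a script can check which one a machine has. The function below is a hypothetical helper, not part of any of these package managers; it only looks for the three front ends named above and reports the first one it finds.

```shell
# Print the first package manager front end found on the PATH, or "unknown".
detect_package_manager() {
    for candidate in apt yum brew; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf 'unknown\n'
}

package_manager=$(detect_package_manager)
echo "This machine uses: $package_manager"
```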

They make life easy.

它们使生活变得轻松。

They’re so convenient that programming languages have their own package managers too! For example, pip is a popular Python package installer. There’s bundler for Ruby, CocoaPods for Swift/iOS, and several others.

它们是如此方便,以至于编程语言也具有自己的包管理器! 例如, pip是流行的Python工具安装程序。 有用于Rubybundler器,用于Swift / iOS的cocoa和其他几个bundler器。

Unix的简要历史 (Brief History of Unix)

Unix was one of the first operating systems that allowed multiple users to use it at once, and every user could run more than one program at the same time.

Unix是第一个允许多种用户使用它的操作系统,每个用户可以同时运行多个程序。

This might sound trivial now, since almost every operating system has this, but it was revolutionary when it first came out. The days of leasing time on a big mainframe were over. You could let your program run in the background while someone else did their work.

现在这听起来似乎微不足道,因为几乎每个操作系统都具有此功能,但是它首次问世时是革命性的。 在大型主机上租赁时间的日子已经过去。 您可以让程序在后台运行,而其他人来做。

As the definition goes, Unix is a multi-user multi-tasking operating system.

顾名思义,Unix是一个多用户多任务操作系统。

Unix was proprietary, and AT&T was the only company that could sell it (Bell Labs developed it in the 1970s). AT&T opted for a licensing model, and a specification eventually followed, called the Single UNIX Specification.

这是专有的,AT&T是唯一可以出售它的公司。 (贝尔实验室在1970年代开发了它)。 他们选择了许可模型,并很快提出了一个规范,称为Single UNIX Specification

These were a set of guidelines, and any system that followed them could be certified as a Unix system.

这些是一组准则,遵循它们的任何系统都可以被认证为Unix系统。

Later, some people were unhappy with the proprietary nature of Unix, and came up with an open source operating system kernel called Linux.

大约在同一时间,有些人对Unix的专有性不满意,并提出了另一个名为Linux的开源操作系统内核。

Inspired by the Unix philosophy, and to enable portability, these systems largely follow the POSIX standard, which forms the core of the Single UNIX Specification. These systems are thus also called Unix-like.[10]

受Unix理念的启发,并为了实现可移植性,这些系统遵循POSIX标准 ,后者是Unix的子集。 这些系统因此也被称为类Unix。[10]

Things get a bit confusing here. Linux is a family of operating systems based on the Linux kernel. There is no single operating system called Linux.

事情变得有些混乱。 Linux是基于Linux内核的一系列操作系统。 没有称为Linux的单一操作系统。

What we do have instead, are Debian, Ubuntu, Fedora, CentOS, Red Hat, Gentoo, etc. These are distributions (popularly called distros) of the Linux kernel. Full-fledged operating systems.

相反,我们所拥有的是Debian,Ubuntu,Fedora,CentOS,Red Hat,Gentoo等。这些是Linux内核的发行版 (通常称为发行版 )。 完善的操作系统。

What’s the difference? Some are built for a specific purpose (for example: Kali Linux comes built with security testing tools).

有什么不同? 有些是为特定目的而构建的(例如:Kali Linux随附了安全测试工具)。

Most differ in package management, how often packages are updated, and security.

包管理,包更新频率和安全性方面的大多数差异。

If you’d like to know more, visit opensource.com.

如果您想了解更多, 请访问opensource.com

Fun fact: Mac OS X is Unix certified.

有趣的事实: Mac OS X已获得Unix认证。

结论 (Conclusion)

We’ve covered a lot. Let’s take a moment to put it all together.

我们已经介绍了很多。 让我们花一点时间将它们放在一起。

Unix is a full-fledged operating system, and Linux is a kernel — the core of the operating system — inspired by Unix. They focus on doing one thing, and doing it well.

Unix是成熟的操作系统,Linux是受Unix启发的内核(操作系统的核心)。 他们专注于做一件事情,并做好。

Everything is either a process or a file. The kernel is the core, which exposes system calls, which utilities leverage. Processes work with files as input and output. We have control over these files, we can redirect them, and it wouldn’t make a difference to the process.

一切都是过程或文件。 内核是核心,它公开了系统调用,这些实用程序可以利用它们。 进程使用文件作为输入和输出。 我们可以控制这些文件,可以重定向它们,并且对过程没有影响。

The pipe can redirect output from one process into the input for another. Every non-built-in command from the shell first forks, then execs, and returns the exit code to the waiting parent.

管道可以将输出从一个过程重定向到另一个过程的输入中。 Shell中的每个命令首先派生,然后执行,然后将退出代码返回给等待的父对象。

There’s a lot more. If it were possible to do a lossless compression, I would’ve done it in the post.

还有更多。 如果有可能进行无损压缩,那么我会在后期进行。

So, welcome to the wild, you’re good to go.

所以,欢迎来到野外,你很好。

exit

Thanks to Vatika Harlalka, Nishit Asnani, Hemanth K. Veeranki, and Hung Hoang for reading drafts of this.

感谢Vatika Harlalka, Nishit AsnaniHemanth K. VeerankiHung Hoang阅读了此草稿。

脚注 (Footnotes)

  1. Did this just become a poem?

    这只是成为一首诗吗?
  2. The terminal has subprocesses running as well, like the shell. You can look at all running processes via ps -ef.

    终端也像shell一样运行着子进程。 您可以通过ps -ef查看所有正在运行的进程。

  3. A shell is the interface you use to interact with the operating system. It can be both a command line interface (CLI) and graphical user interface (GUI). In this post, we focus just on the CLI. When you open a terminal, the default program that greets you is a shell. Read more

    shell是用于与操作系统交互的接口。 它既可以是命令行界面(CLI),也可以是图形用户界面(GUI)。 在本文中,我们仅关注CLI。 打开终端时,欢迎您的默认程序是shell阅读更多

  4. This isn’t 100% right, there’s a bit more nuance to it, which we’ll get to soon.

    这不是100%正确的,还有更多细微差别,我们将尽快解决。
  5. I’ve been spending too much time with iPhones and iOS. It’s an inode, not an iNode.

    我在iPhone和iOS上花费了太多时间。 它是一个索引节点,而不是一个iNode。
  6. Also, the time when I started writing this guide. About time I finished it. Notice the year.

    另外,这是我开始编写本指南的时间。 大概我完成了。 注意年份。
  7. I got all this information from man 5 passwd.

    我从man 5 passwd获得了所有这些信息。

  8. The manual has eight sections, each for a specific purpose. Read more.

    本手册分为八个部分,每个部分用于特定目的。 阅读更多。

  9. This is a powerful idea, which is true only because Unix, by design, says: “Everything is a file”.

    这是一个强大的想法,只有在Unix设计上说:“一切都是文件”,这才是正确的。
  10. Fun fact — they can’t be called Unix because they haven’t been certified, and Unix is a trademark.

    有趣的事实-由于它们尚未通过认证,因此不能称为Unix,并且Unix是商标。

Translated from: https://medium.com/better-programming/how-unix-works-everything-you-were-too-afraid-to-ask-f8396aeb2763
