How to Use the awk Command on Linux

culintai3473

Original link: https://www.howtogeek.com/562941/how-to-use-the-awk-command-on-linux/

On Linux, awk is a command-line text manipulation dynamo, as well as a powerful scripting language. Here’s an introduction to some of its coolest features.

How awk Got Its Name

The awk command was named using the initials of the three people who wrote the original version in 1977: Alfred Aho, Peter Weinberger, and Brian Kernighan. These three men were from the legendary AT&T Bell Laboratories Unix pantheon. With the contributions of many others since then, awk has continued to evolve.

It’s a full scripting language, as well as a complete text manipulation toolkit for the command line. If this article whets your appetite, you can check out every detail about awk and its functionality.

Rules, Patterns, and Actions

awk works on programs that contain rules comprised of patterns and actions. The action is executed on the text that matches the pattern. Actions are enclosed in curly braces ({}). Together, a pattern and an action form a rule. On the command line, the entire awk program is enclosed in single quotes (') so the shell passes it to awk unchanged.

Let’s take a look at the simplest type of awk program. It has no pattern, so it matches every line of text fed into it. This means the action is executed on every line. We’ll use it on the output from the who command.

Here’s the standard output from who:

who

Perhaps we don’t need all of that information, but, rather, just want to see the names on the accounts. We can pipe the output from who into awk, and then tell awk to print only the first field.

By default, awk considers a field to be a string of characters surrounded by whitespace, the start of a line, or the end of a line. Fields are identified by a dollar sign ($) and a number. So, $1 represents the first field, which we’ll use with the print action to print the first field.

We type the following:

who | awk '{print $1}'

awk prints the first field and discards the rest of the line.

We can print as many fields as we like. If we separate the fields with a comma, awk prints the output field separator (a space, by default) between them.

We type the following to also print the time the person logged in (field four):

who | awk '{print $1,$4}'
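
Because the output of who differs from machine to machine, here is a minimal sketch that runs the same two programs against a fixed line. The sample line is made up for illustration and only roughly follows the shape of who output. It also shows what happens when the comma is left out.

```shell
# Made-up line in the general shape of `who` output (an assumption for illustration).
sample='dave     pts/0        2024-03-01 09:15 (192.168.1.10)'

# Print only the first field.
first=$(printf '%s\n' "$sample" | awk '{print $1}')

# A comma between fields makes awk insert a space (the default output separator).
with_comma=$(printf '%s\n' "$sample" | awk '{print $1,$4}')

# Without the comma, the two fields are joined with nothing between them.
no_comma=$(printf '%s\n' "$sample" | awk '{print $1 $4}')

printf '%s\n' "$first" "$with_comma" "$no_comma"
```

The runs of spaces in the sample line count as a single separator, which is why the login time is still field four.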

Fields are numbered from one, and there are a couple of special field identifiers that represent the entire line of text and the last field in the line of text:

$0: Represents the entire line of text.
$1: Represents the first field.
$2: Represents the second field.
$7: Represents the seventh field.
$45: Represents the 45th field.
$NF: Stands for “number of fields,” and represents the last field.
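
As a small sketch of the difference between these identifiers, the commands below print $0, NF, and $NF for one invented line of six words. The sample text is hypothetical and is not the Dennis Ritchie quote used later.

```shell
# Hypothetical one-line input with six fields.
sample='the quick brown fox jumps high'

whole=$(printf '%s\n' "$sample" | awk '{print $0}')   # the entire line
count=$(printf '%s\n' "$sample" | awk '{print NF}')   # how many fields the line has
last=$(printf '%s\n' "$sample" | awk '{print $NF}')   # the contents of the last field

printf '%s\n' "$whole" "$count" "$last"
```

Note that NF without the dollar sign is a number (the field count), while $NF uses that number to select a field.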

We’ll type the following to bring up a small text file that contains a short quote attributed to Dennis Ritchie:

cat dennis_ritchie.txt

We want awk to print the first, second, and last field of the quote. Note that although it’s wrapped around in the terminal window, it’s just a single line of text.

We type the following command:

awk '{print $1,$2,$NF}' dennis_ritchie.txt

We don’t know that “simplicity.” is the 18th field in the line of text, and we don’t care. What we do know is it’s the last field, and we can use $NF to get its value. The period is just considered another character in the body of the field.

Adding Output Field Separators

You can also tell awk to print a particular character between fields instead of the default space character. The default output from the date command is slightly peculiar because the time is plonked right in the middle of it. However, we can type the following and use awk to extract the fields we want:

date

date | awk '{print $2,$3,$6}'

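We’ll use the OFS (output field separator) variable to put a separator between the month, day, and year. Note that below, the OFS assignment sits outside the curly braces ({}) of the print action, but still inside the single quotes (') that enclose the whole program. awk treats the assignment as the rule’s pattern; because its value is a non-empty string, the pattern is true for every line: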

date | awk 'OFS="/" {print$2,$3,$6}'

date | awk 'OFS="-" {print$2,$3,$6}'
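
A more conventional way to set OFS is in a BEGIN rule or with the -v option, both of which are standard awk. The sketch below uses a made-up line in the shape date prints in the C locale; the actual field positions depend on your locale, so treat the sample as an assumption.

```shell
# Made-up line in the shape of `date` output in the C locale (format varies by locale).
sample='Tue Feb 25 14:30:00 EST 2020'

# Set OFS once, before any input is read.
slashed=$(printf '%s\n' "$sample" | awk 'BEGIN {OFS="/"} {print $2,$3,$6}')

# Or set it from the command line with -v.
dashed=$(printf '%s\n' "$sample" | awk -v OFS="-" '{print $2,$3,$6}')

printf '%s\n' "$slashed" "$dashed"
```

Both forms keep the variable assignment separate from the rule that does the printing, which tends to be easier to read in longer programs.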

The BEGIN and END Rules

A BEGIN rule is executed once before any text processing starts. In fact, it’s executed before awk even reads any text. An END rule is executed after all processing has completed. You can have multiple BEGIN and END rules, and they’ll execute in order.

For our example of a BEGIN rule, we’ll print the entire quote from the dennis_ritchie.txt file we used previously with a title above it.

To do so, we type this command:

awk 'BEGIN {print "Dennis Ritchie"} {print $0}' dennis_ritchie.txt

Note the BEGIN rule has its own set of actions enclosed within its own set of curly braces ({}).

We can use this same technique with the command we used previously to pipe output from who into awk. To do so, we type the following:

who | awk 'BEGIN {print "Active Sessions"} {print $1,$4}'
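
The examples above only use BEGIN. As a sketch of BEGIN and END working together, the program below prints a title, echoes each line, and then reports a line count. The three input lines and the counter name n are invented for illustration.

```shell
# Hypothetical three-line input.
sample='alpha
beta
gamma'

# BEGIN runs once before input, the middle rule runs per line, END runs once after input.
report=$(printf '%s\n' "$sample" | awk 'BEGIN {print "Start"} {n++; print $0} END {print "Lines: " n}')

printf '%s\n' "$report"
```

The counter n needs no declaration; an awk variable that has never been assigned behaves as zero when used as a number.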

Input Field Separators

If you want awk to work with text that doesn’t use whitespace to separate fields, you have to tell it which character the text uses as the field separator. For example, the /etc/passwd file uses a colon (:) to separate fields.

We’ll use that file and the -F (separator string) option to tell awk to use the colon (:) as the separator. We type the following to tell awk to print the name of the user account and the home folder:

awk -F: '{print $1,$6}' /etc/passwd

The output contains the name of the user account (or application or daemon name) and the home folder (or the location of the application).
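
To try this without depending on the contents of your own /etc/passwd, here is a sketch using two made-up entries in the same colon-separated format (they are not real accounts). It also shows that setting the FS variable in a BEGIN rule has the same effect as -F.

```shell
# Made-up entries in /etc/passwd format (not real accounts).
sample='root:x:0:0:root:/root:/bin/bash
dave:x:1000:1000:Dave:/home/dave:/bin/bash'

# Set the input separator with the -F option.
with_flag=$(printf '%s\n' "$sample" | awk -F: '{print $1,$6}')

# Set it with the FS variable instead.
with_fs=$(printf '%s\n' "$sample" | awk 'BEGIN {FS=":"} {print $1,$6}')

printf '%s\n' "$with_flag"
```

Only the input separator changes here, so the two printed fields are still joined by the default output separator, a space.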

Adding Patterns

If all we’re interested in are regular user accounts, we can include a pattern with our print action to filter out all other entries. Because regular user accounts on many Linux distributions are given User ID numbers equal to, or greater than, 1,000, we can base our filter on that information.

We type the following to execute our print action only when the third field ($3) contains a value of 1,000 or greater:

awk -F: '$3 >= 1000 {print $1,$6}' /etc/passwd

The pattern should immediately precede the action with which it’s associated.
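
Here is a sketch of the same filter run against invented passwd-style entries. On many distributions the nobody account has a very high User ID (65534 is common), so it also passes a simple ">= 1000" test; the second command adds an upper bound to exclude it. The sample entries and the 60000 cutoff are assumptions for illustration, not values from any particular system.

```shell
# Made-up entries in /etc/passwd format (not real accounts).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
dave:x:1000:1000:Dave:/home/dave:/bin/bash
mary:x:1001:1001:Mary:/home/mary:/bin/bash
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin'

# The pattern from the article: third field at least 1000.
regular=$(printf '%s\n' "$sample" | awk -F: '$3 >= 1000 {print $1}')

# Two comparisons joined with && (hypothetical upper bound of 60000).
capped=$(printf '%s\n' "$sample" | awk -F: '$3 >= 1000 && $3 < 60000 {print $1}')

printf '%s\n' "$regular" "--" "$capped"
```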

We can use the BEGIN rule to provide a title for our little report. We type the following, using the (\n) notation to insert a newline character into the title string:

awk -F: 'BEGIN {print "User Accounts\n-------------"} $3 >= 1000 {print $1,$6}' /etc/passwd

Patterns can also be full-fledged regular expressions, and they’re one of the glories of awk.

Let’s say we want to see the universally unique identifiers (UUIDs) of the mounted file systems. If we search through the /etc/fstab file for occurrences of the string “UUID,” it ought to return that information for us.

We use the search pattern “/UUID/” in our command:

awk '/UUID/ {print $0}' /etc/fstab

It finds all occurrences of “UUID” and prints those lines. We actually would’ve gotten the same result without the print action because the default action prints the entire line of text. For clarity, though, it’s often useful to be explicit. When you look through a script or your history file, you’ll be glad you left clues for yourself.

The first line found was a comment line, and although the “UUID” string is in the middle of it, awk still found it. We can tweak the regular expression and tell awk to process only lines that start with “UUID.” To do so, we type the following which includes the start of line token (^):

awk '/^UUID/ {print $0}' /etc/fstab

That’s better! Now, we only see genuine mount instructions. To refine the output even further, we type the following and restrict the display to the first field:

awk '/^UUID/ {print $1}' /etc/fstab

If we had multiple file systems mounted on this machine, we’d get a neat table of their UUIDs.
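
The effect of the ^ anchor is easy to reproduce on a small sample. The sketch below uses three invented lines in the style of /etc/fstab (the UUID value is a placeholder, not a real identifier) and compares the unanchored and anchored patterns.

```shell
# Invented fstab-style lines; the UUID value is a placeholder.
sample='# device by UUID= as a more robust way to name devices
UUID=1234-abcd / ext4 errors=remount-ro 0 1
/swapfile none swap sw 0 0'

# Unanchored: matches "UUID" anywhere, including in the comment line.
anywhere=$(printf '%s\n' "$sample" | awk '/UUID/')

# Anchored: only lines that start with "UUID", printing just the first field.
anchored=$(printf '%s\n' "$sample" | awk '/^UUID/ {print $1}')

printf '%s\n' "$anywhere" "--" "$anchored"
```

The first command has no action at all, so it relies on the default action of printing the whole matching line.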

Built-In Functions

awk has many functions you can call and use in your own programs, both from the command line and in scripts. If you do some digging, you’ll find it very fruitful.

To demonstrate the general technique to call a function, we’ll look at some numeric ones. For example, the following prints the square root of 625:

awk 'BEGIN { print sqrt(625)}'

This command calls atan2(0, -1), the arctangent of 0 (zero) divided by -1 with the quadrant taken from the signs of both arguments. The result happens to be the mathematical constant, pi:

awk 'BEGIN {print atan2(0, -1)}'

In the following command, we modify the result of the atan2() function before we print it:

awk 'BEGIN {print atan2(0, -1)*100}'

Functions can accept expressions as parameters. For example, here’s a convoluted way to ask for the square root of 25:

awk 'BEGIN { print sqrt((2+3)*5)}'
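
The same calling technique applies to awk's string functions. As a brief sketch that goes slightly beyond the numeric examples above, the commands below call length(), toupper(), and substr(), all of which are standard awk functions. The strings passed in are arbitrary examples.

```shell
# Number of characters in a string.
len=$(awk 'BEGIN {print length("simplicity")}')

# Upper-case copy of a string.
upper=$(awk 'BEGIN {print toupper("awk")}')

# Six characters starting at position 1 (awk strings are indexed from 1).
part=$(awk 'BEGIN {print substr("dennis_ritchie", 1, 6)}')

# The numeric example from the text, for comparison.
root=$(awk 'BEGIN {print sqrt(625)}')

printf '%s\n' "$len" "$upper" "$part" "$root"
```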

awk Scripts

If your command line gets complicated, or you develop a routine you know you’ll want to use again, you can transfer your awk command into a script.

In our example script, we’re going to do all of the following:

Tell the shell which executable to use to run the script.
Prepare awk to use the FS field separator variable to read input text with fields separated by colons (:).
Use the OFS output field separator to tell awk to use colons (:) to separate fields in the output.
Set a counter to 0 (zero).
Set the second field of each line of text to a blank value (it’s always an “x,” so we don’t need to see it).
Print the line with the modified second field.
Increment the counter.
Print the value of the counter.

Our script is shown below.

The BEGIN rule carries out the preparatory steps, while the END rule displays the counter value. The middle rule (which has neither a name nor a pattern, so it matches every line) modifies the second field, prints the line, and increments the counter.

The first line of the script tells the shell which executable to use (awk, in our example) to run the script. It also passes the -f (file) option to awk, which tells awk to read its program from a file, in this case the script itself. We’ll pass the name of the file to process to the script when we run it.

We’ve included the script below as text so you can copy and paste:

#!/usr/bin/awk -f

BEGIN {
  # set the input and output field separators
  FS=":"
  OFS=":"
  # zero the accounts counter
  accounts=0
}
{
  # set field 2 to nothing
  $2=""
  # print the entire line
  print $0
  # count another account
  accounts++
}
END {
  # print the results
  print accounts " accounts.\n"
}

Save this in a file called omit.awk. To make the script executable, we type the following using chmod:

chmod +x omit.awk

Now, we’ll run it and pass the /etc/passwd file to the script. This is the file awk will process for us, using the rules within the script:

./omit.awk /etc/passwd

The file is processed and each line is displayed, as shown below.

The “x” entries in the second field were removed, but note the field separators are still present. The lines are counted and the total is given at the bottom of the output.
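
To see that behavior on a known input, here is a sketch that writes a condensed version of the same three rules to a temporary file and runs it with awk -f, which is what the first line of the script does for us. The two passwd-style entries are made up, and the temporary file created by mktemp stands in for omit.awk.

```shell
# Write a condensed, hypothetical version of the script to a temporary file.
prog=$(mktemp)
cat > "$prog" <<'EOF'
BEGIN { FS=":"; OFS=":" }
{ $2=""; print $0; accounts++ }
END { print accounts " accounts." }
EOF

# Made-up entries in /etc/passwd format (not real accounts).
sample='root:x:0:0:root:/root:/bin/bash
dave:x:1000:1000:Dave:/home/dave:/bin/bash'

# -f names the file that holds the program; the data arrives on standard input.
result=$(printf '%s\n' "$sample" | awk -f "$prog")
rm -f "$prog"

printf '%s\n' "$result"
```

Assigning to $2 makes awk rebuild the line using OFS. Because the script sets OFS to a colon, the emptied field shows up as two adjacent colons rather than disappearing.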

awk Doesn’t Stand for Awkward

awk doesn’t stand for awkward; it stands for elegance. It’s been described as a processing filter and a report writer. More accurately, it’s both of these, or, rather, a tool you can use for both of these tasks. In just a few lines, awk achieves what requires extensive coding in a traditional language.

That power is harnessed by the simple concept of rules that contain patterns, which select the text to process, and actions, which define the processing.

Translated from: https://www.howtogeek.com/562941/how-to-use-the-awk-command-on-linux/
