The shell cut command

首席撩妹指导官

Category: shell. Tags: linux, server, network

Article link: https://blog.csdn.net/qq_36864672/article/details/128882851

Basic syntax

The syntax is one of:

cut -b list [-n] [file ...]
cut -c list [file ...]
cut -f list [-d delim] [file ...]

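Usage:

The cut command cuts bytes, characters, or fields from each line of a file and writes them to standard output.

If no file argument is given, cut reads from standard input. Exactly one of -b, -c, or -f must be specified.

Main options:
- -b: select by byte
- -c: select by character
- -d: set a custom field delimiter; the default is the tab character
- -f: select the listed fields after splitting on the delimiter; usually used together with -d
- -n: do not split multi-byte characters; only used together with -b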
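
As a minimal sketch of the three modes side by side, the snippet below runs -b, -c, and -f against one made-up line. The variable names and the sample text are illustrative assumptions, not part of the original examples.

```shell
# Hypothetical sample line; printf turns \t into real tab characters.
line=$(printf 'alpha\tbeta\tgamma')

# Bytes 1 through 5.
by_byte=$(printf '%s\n' "$line" | cut -b 1-5)
# Characters 1 through 5 (identical to the byte result for ASCII text).
by_char=$(printf '%s\n' "$line" | cut -c 1-5)
# Field 2; with no -d, the delimiter is the tab character.
by_field=$(printf '%s\n' "$line" | cut -f 2)

printf '%s\n' "$by_byte" "$by_char" "$by_field"
```

Each call reads from standard input because no file argument is given.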

Selection modes

cut accepts three ways of specifying positions:

Bytes, with option -b
Characters, with option -c
Fields, with option -f

Each is shown in turn below.

Selecting by byte

Create a text file named who.txt with the following content:

superStar console  May 27 11:05
superStar ttys000  Jun  9 18:41
superStar ttys012  May 27 11:07

Running cat who.txt prints something like this:

➜  shellLearn  cat who.txt
superStar console  May 27 11:05
superStar ttys000  Jun  9 18:41
superStar ttys012  May 27 11:07

Extract the 3rd byte of each line

We can run cat who.txt | cut -b 3, with this result:

➜  shellLearn  cat who.txt | cut -b 3
p
p
p

If you are not yet familiar with what | (the pipe) does, see the article "神兵利刃：shell 初入门".
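
Since cut reads standard input only when no file is named, the pipe is optional here. The sketch below recreates who.txt in a temporary directory (the use of mktemp and the variable names are assumptions made for this illustration) and compares the piped form with passing the file directly.

```shell
# Recreate the sample file in a throwaway directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/who.txt" <<'EOF'
superStar console  May 27 11:05
superStar ttys000  Jun  9 18:41
superStar ttys012  May 27 11:07
EOF

# Form used in the article: cat feeds cut through a pipe.
from_pipe=$(cat "$tmpdir/who.txt" | cut -b 3)
# Equivalent form: cut opens the file itself.
from_file=$(cut -b 3 "$tmpdir/who.txt")

printf '%s\n' "$from_pipe"
printf '%s\n' "$from_file"

rm -rf "$tmpdir"
```

Both forms should print the same three lines, one p per input line.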

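Extract bytes 1 through 9, plus bytes 11 and 13, of each line

-b supports ranges: bytes 1 through 9 can be written as 1-9.
-b supports multiple positions: separate them with commas.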

➜  shellLearn  cat who.txt | cut -b 1-9,11,13
superStarcn
superStarty
superStarty

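cut always outputs the selected bytes in the order they appear in the input, as if the list had been sorted in ascending order first. Shuffling the list therefore gives the same result as above.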

➜  shellLearn  cat who.txt | cut -b 11,13,1-9
superStarcn
superStarty
superStarty
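
The same behavior can be checked on a simpler string. This is a small sketch with a made-up alphabet sample; the variable names are illustrative only.

```shell
# Hypothetical sample: the first 16 lowercase letters, so byte N is the Nth letter.
sample='abcdefghijklmnop'

# List given in ascending order.
in_order=$(printf '%s\n' "$sample" | cut -b 1-3,5,7)
# Same positions, listed in a different order.
shuffled=$(printf '%s\n' "$sample" | cut -b 7,5,1-3)

printf '%s\n' "$in_order" "$shuffled"
```

Both commands print abceg, so cut cannot be used to reorder the pieces it selects.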

Extract everything before or after byte 9 of each line

Up to and including byte 9: cut -b -9

➜  shellLearn  cat who.txt | cut -b -9
superStar
superStar
superStar

From byte 9 onward (byte 9 included): cut -b 9-

➜  shellLearn  cat who.txt | cut -b 9-
r console  May 27 11:05
r ttys000  Jun  9 18:41
r ttys012  May 27 11:07

Both results above include byte 9. If we run cut -b -9,9-:

➜  shellLearn  cat who.txt | cut -b -9,9-
superStar console  May 27 11:05
superStar ttys000  Jun  9 18:41
superStar ttys012  May 27 11:07

Byte 9 is not duplicated; each line is printed in full.
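
A compact sketch of open-ended and overlapping ranges, again on a made-up alphabet string (the sample and the variable names are assumptions for illustration):

```shell
# Hypothetical sample: byte N is the Nth lowercase letter.
sample='abcdefghijklmnop'

# Everything up to and including byte 3.
head_part=$(printf '%s\n' "$sample" | cut -b -3)
# Everything from byte 14 to the end of the line.
tail_part=$(printf '%s\n' "$sample" | cut -b 14-)
# Two ranges that overlap at byte 9: the shared byte is printed once.
overlap=$(printf '%s\n' "$sample" | cut -b -9,9-)

printf '%s\n' "$head_part" "$tail_part" "$overlap"
```

The third result equals the whole input line, which matches the who.txt output above.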

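Selecting by character

If the text consists only of single-byte characters, selecting by character and selecting by byte give the same result. For example: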

➜  shellLearn  cat who.txt | cut -b 3
p
p
p
➜  shellLearn  cat who.txt | cut -c 3
p
p
p

Create a file week.txt with the following content.

星期一
星期二
星期三
星期四

Cutting the output of cat week.txt shows the difference between the two options: -b prints garbled bytes, while -c prints the expected characters.

➜  shellLearn  cat week.txt | cut -b 3
�
�
�
�
➜  shellLearn  cat week.txt | cut -c 3
一
二
三
四

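This is because -c counts in characters, while -b naively counts in bytes and so cuts a multi-byte character in the middle, which shows up as garbage.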
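
To see why byte 3 alone is garbage, it helps to count the bytes. The sketch below assumes the script is saved as UTF-8, where each of these Chinese characters takes 3 bytes; the variable names are illustrative. It only uses -b, so it does not depend on how a given cut implementation handles -c.

```shell
# Assumption: this file is UTF-8 encoded, so each character below is 3 bytes.
word='星期一'

# Total size in bytes; tr strips the padding some wc implementations add.
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')
# Bytes 1-3 form the complete first character.
first_char=$(printf '%s\n' "$word" | cut -b 1-3)
# Bytes 4-6 form the complete second character.
second_char=$(printf '%s\n' "$word" | cut -b 4-6)

printf '%s\n' "$byte_count" "$first_char" "$second_char"
```

Byte 3 by itself is only the last third of the first character, which is why cut -b 3 prints an invalid sequence.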

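Selecting by byte, combined with `-n`

As shown above, -b prints garbled output when it meets multi-byte characters. It can, however, be combined with the -n option, which tells cut not to split multi-byte characters.

For example: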

➜  shellLearn  cat week.txt | cut -b 1
�
�
�
�
➜  shellLearn  cat week.txt | cut -nb 1



➜  shellLearn  cat week.txt | cut -nb 1-3
星
星
星
星%

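Note: the output above was produced on a Mac. Linux may behave differently: GNU cut has traditionally treated -c the same as -b and -n as a no-op, so multi-byte text may not be handled this way there. The trailing % is most likely the marker zsh prints when output does not end with a newline, rather than part of cut's output.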

Selecting by field (fields)

For the format above, -b and -c cope comfortably. But consider content formatted like /etc/passwd; here are its last 5 lines:

➜  shellLearn  cat /etc/passwd | tail -n5
_fpsd:*:265:265:FPS Daemon:/var/db/fpsd:/usr/bin/false
_timed:*:266:266:Time Sync Daemon:/var/db/timed:/usr/bin/false
_nearbyd:*:268:268:Proximity and Ranging Daemon:/var/db/nearbyd:/usr/bin/false
_reportmemoryexception:*:269:269:ReportMemoryException:/var/db/reportmemoryexception:/usr/bin/false
_driverkit:*:270:270:DriverKit:/var/empty:/usr/bin/false

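It looks irregular at first, but there is a pattern: each line is a string of items joined by colons, with a colon separating one item from the next, while line lengths vary widely.

How would we get the text before the first colon, or the text between the second and third colons?

This is where field selection comes in. Put simply: set the delimiter first, then say which fields to extract.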

➜  shellLearn  cat /etc/passwd | tail -n5 | cut -d : -f 1
_fpsd
_timed
_nearbyd
_reportmemoryexception
_driverkit
➜  shellLearn  cat /etc/passwd | tail -n5 | cut -d : -f 3
265
266
268
269
270

In the commands above, -d sets the delimiter to a colon and -f picks the field to extract. Press Enter and the result appears. Pleasant, isn't it?
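
Because /etc/passwd differs from machine to machine, a reproducible sketch can work on one of the lines shown above stored in a variable. The variable names are assumptions for illustration.

```shell
# One passwd-style line copied from the listing above.
entry='_timed:*:266:266:Time Sync Daemon:/var/db/timed:/usr/bin/false'

# Field 1: text before the first colon.
user=$(printf '%s\n' "$entry" | cut -d : -f 1)
# Field 3: text between the second and third colons.
uid=$(printf '%s\n' "$entry" | cut -d : -f 3)

printf '%s\n' "$user" "$uid"
```

Fields are numbered from 1, so the item between the second and third colons is field 3.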

-f also supports field ranges and multiple fields. For example:

➜  shellLearn  cat /etc/passwd | tail -n5 | cut -d : -f 1,3
_fpsd:265
_timed:266
_nearbyd:268
_reportmemoryexception:269
_driverkit:270
➜  shellLearn  cat /etc/passwd | tail -n5 | cut -d : -f 1,3-5
_fpsd:265:265:FPS Daemon
_timed:266:266:Time Sync Daemon
_nearbyd:268:268:Proximity and Ranging Daemon
_reportmemoryexception:269:269:ReportMemoryException
_driverkit:270:270:DriverKit
➜  shellLearn  cat /etc/passwd | tail -n5 | cut -d : -f -3
_fpsd:*:265
_timed:*:266
_nearbyd:*:268
_reportmemoryexception:*:269
_driverkit:*:270
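
As with -b, field lists accept open-ended ranges, and the order of the list does not change the order of the output. A minimal sketch on one of the lines above (the variable names are illustrative assumptions):

```shell
# One passwd-style line copied from the listing above.
entry='_timed:*:266:266:Time Sync Daemon:/var/db/timed:/usr/bin/false'

# Field 6 through the last field; the delimiter is kept between fields.
home_and_shell=$(printf '%s\n' "$entry" | cut -d : -f 6-)
# Fields listed as 3,1 still come out in input order: field 1, then field 3.
reordered=$(printf '%s\n' "$entry" | cut -d : -f 3,1)
# Everything up to and including field 3.
first_three=$(printf '%s\n' "$entry" | cut -d : -f -3)

printf '%s\n' "$home_and_shell" "$reordered" "$first_three"
```

If the fields need to appear in a different order, another tool such as awk is the usual choice, since cut only selects.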